What is the difference between "SAME" and "VALID" padding in tf.nn.max_pool of TensorFlow?
In my opinion, "VALID" means there will be no zero padding outside the edges when we do max pool.
According to A guide to convolution arithmetic for deep learning, there is no padding in the pooling operator, i.e. just use "VALID" in TensorFlow.
But what is "SAME" padding for max pool in TensorFlow?
If you like ASCII art:

"VALID" = without padding:

    inputs:         1  2  3  4  5  6  7  8  9  10 11 (12 13)
                   |________________|                dropped
                                  |_________________|

"SAME" = with zero padding:

                pad|                                      |pad
    inputs:      0 |1  2  3  4  5  6  7  8  9  10 11 12 13|0  0
                |________________|
                               |_________________|
                                              |________________|
In this example (input width 13, filter width 6, stride 5):
"VALID" only ever drops the right-most columns (or bottom-most rows).
"SAME" tries to pad evenly left and right, but if the number of columns to be added is odd, it adds the extra column to the right, as is the case in this example (the same logic applies vertically: there may be an extra row of zeros at the bottom).
About the name:
With "SAME" padding, if you use a stride of 1, the layer's outputs will have the same spatial dimensions as its inputs.
With "VALID" padding, there are no "made-up" padding inputs. The layer only uses valid input data.
When stride is 1 (more typical with convolution than pooling), we can think of the following distinction:
"SAME": output size is the same as input size. This requires the filter window to slip outside the input map, hence the need to pad.
"VALID": the filter window stays at valid positions inside the input map, so the output size shrinks by filter_size - 1. No padding occurs.
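To make the stride-1 distinction concrete, here is a minimal pure-Python sketch that just enumerates where the window can start along one dimension. The helper name window_starts is made up for this illustration and is not part of TensorFlow; the split of the overhang (extra on the right) is an assumption that follows the "SAME" description earlier in the thread.

```python
def window_starts(n, k, padding):
    """Start indices (in input coordinates) of every stride-1 window.

    n: input size, k: filter size. A negative start, or a start with
    start + k > n, means the window hangs over the edge of the input.
    """
    if padding == "VALID":
        # The window must lie entirely inside [0, n).
        return [i for i in range(n) if i + k <= n]
    if padding == "SAME":
        # One window per input position; the k - 1 cells of overhang are
        # split between both sides, with any extra cell on the right.
        left = (k - 1) // 2
        return [i - left for i in range(n)]
    raise ValueError("padding must be 'SAME' or 'VALID'")


n, k = 7, 3
print(len(window_starts(n, k, "VALID")))  # n - (k - 1) windows
print(len(window_starts(n, k, "SAME")))   # n windows
```

With n = 7 and k = 3, "VALID" gives 5 windows (7 minus filter_size - 1), while "SAME" gives 7, with the first window starting at index -1, i.e. on a padded cell.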
I'll give an example to make it clearer:
x: input image of shape [2, 3], 1 channel
valid_pad: max pool with 2x2 kernel, stride 2 and VALID padding.
same_pad: max pool with 2x2 kernel, stride 2 and SAME padding (this is the classic way to go)
The output shapes are:
valid_pad: here, no padding so the output shape is [1, 1]
same_pad: here, we pad the image to the shape [2, 4] (with -inf, and then apply max pool), so the output shape is [1, 2]
    import tensorflow as tf

    x = tf.constant([[1., 2., 3.],
                     [4., 5., 6.]])
    x = tf.reshape(x, [1, 2, 3, 1])  # give a shape accepted by tf.nn.max_pool

    # 2x2 kernel, stride 2, with each padding mode
    valid_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding="VALID")
    same_pad = tf.nn.max_pool(x, [1, 2, 2, 1], [1, 2, 2, 1], padding="SAME")

    valid_pad.get_shape() == [1, 1, 1, 1]  # valid_pad is [5.]
    same_pad.get_shape() == [1, 1, 2, 1]   # same_pad is [5., 6.]
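If you want to see what happens under the hood without TensorFlow, here is a rough pure-Python sketch of the same 2x2, stride-2 max pool. It is not how tf.nn.max_pool is implemented; max_pool_2d is a hypothetical helper, and the padding rule (pad with -inf, extra row/column at the bottom/right) is an assumption taken from the descriptions in this thread.

```python
import math


def max_pool_2d(x, k, stride, padding):
    """Max pool a 2-D list of numbers with a k x k window (illustrative only)."""
    h, w = len(x), len(x[0])
    if padding == "SAME":
        out_h, out_w = math.ceil(h / stride), math.ceil(w / stride)
        # Total padding needed so that the last window fits.
        pad_h = max((out_h - 1) * stride + k - h, 0)
        pad_w = max((out_w - 1) * stride + k - w, 0)
    elif padding == "VALID":
        out_h, out_w = (h - k) // stride + 1, (w - k) // stride + 1
        pad_h = pad_w = 0
    else:
        raise ValueError("padding must be 'SAME' or 'VALID'")

    # Pad evenly; if the amount is odd, the extra goes to the bottom/right.
    top, left = pad_h // 2, pad_w // 2
    neg_inf = float("-inf")  # never wins a max, so padding cannot leak into the output
    padded = [[neg_inf] * (w + pad_w) for _ in range(h + pad_h)]
    for i in range(h):
        for j in range(w):
            padded[i + top][j + left] = x[i][j]

    return [
        [
            max(padded[r * stride + di][c * stride + dj]
                for di in range(k) for dj in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


x = [[1., 2., 3.],
     [4., 5., 6.]]
print(max_pool_2d(x, 2, 2, "VALID"))  # [[5.0]]
print(max_pool_2d(x, 2, 2, "SAME"))   # [[5.0, 6.0]]
```

The "SAME" case pads one column of -inf on the right (shape [2, 4]), so the second window sees 3 and 6 plus two padded cells and returns 6.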
The TensorFlow Convolution example gives an overview of the difference between SAME and VALID:
For the SAME padding, the output height and width are computed as:

    out_height = ceil(float(in_height) / float(strides))
    out_width = ceil(float(in_width) / float(strides))

For the VALID padding, the output height and width are computed as:

    out_height = ceil(float(in_height - filter_height + 1) / float(strides))
    out_width = ceil(float(in_width - filter_width + 1) / float(strides))
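The two formulas above can be sketched as a small Python function for a single dimension. The name out_size is made up for this example, and the same stride is assumed for every call.

```python
import math


def out_size(in_size, filter_size, stride, padding):
    """Output size along one dimension, following the formulas quoted above."""
    if padding == "SAME":
        return math.ceil(float(in_size) / float(stride))
    if padding == "VALID":
        return math.ceil(float(in_size - filter_size + 1) / float(stride))
    raise ValueError("padding must be 'SAME' or 'VALID'")


# Width of the [2, 3] image pooled with a 2x2 kernel and stride 2:
print(out_size(3, 2, 2, "VALID"))  # 1
print(out_size(3, 2, 2, "SAME"))   # 2

# A 1-D input of width 13 with filter width 6 and stride 5:
print(out_size(13, 6, 5, "VALID"))  # 2
print(out_size(13, 6, 5, "SAME"))   # 3
```

Note that the filter size does not appear in the "SAME" formula at all: the padding is chosen afterwards so that this output size works out.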
Padding is an operation that increases the size of the input data. For 1-dimensional data, you append/prepend a constant to the array; in 2 dimensions, you surround the matrix with these constants; in n dimensions, you surround your n-dimensional hypercube with the constant. In most cases this constant is zero, which is called zero-padding.
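As a quick illustration of that definition, here is a minimal sketch in plain Python for the 1-D and 2-D cases. pad_1d and pad_2d are hypothetical helpers written for this answer, not library functions.

```python
def pad_1d(values, before, after, constant=0):
    """Prepend `before` and append `after` copies of `constant`."""
    return [constant] * before + list(values) + [constant] * after


def pad_2d(matrix, pad, constant=0):
    """Surround a 2-D list with a border of `constant`, `pad` cells thick."""
    width = len(matrix[0]) + 2 * pad
    middle = [pad_1d(row, pad, pad, constant) for row in matrix]
    top = [[constant] * width for _ in range(pad)]
    bottom = [[constant] * width for _ in range(pad)]
    return top + middle + bottom


print(pad_1d([1, 2, 3], 1, 2))
# [0, 1, 2, 3, 0, 0]

print(pad_2d([[1, 2], [3, 4]], 1))
# [[0, 0, 0, 0], [0, 1, 2, 0], [0, 3, 4, 0], [0, 0, 0, 0]]
```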
You can use arbitrary padding for your kernel, but some padding values are used more frequently than others. They are:
k, this padding is equal to k - 1.
To use arbitrary padding in TF, you can use