What is the difference between 1x1 convolutions and convolutions with "SAME" padding?

Question

In general, 1x1 convolutions are used to reduce the dimensionality of the filter space. I referred to this answer.

But we can also reduce the dimensionality of the filter space (the number of filters) with convolutions larger than 1x1 by using "SAME" padding.

So, what is the difference between those two approaches? One difference I could figure out from the above-mentioned question is that applying larger convolutions is computationally expensive, so 1x1 convolutions are used first to reduce the number of filters, and the larger convolutions are applied afterwards. Reference: This answer.

Is there any other significant difference between those two?

'same' padding can be used with any conv filter size.... incl 1x1. — shimao, Oct 02 '19 at 12:55
you don't need padding with 1x1 convolution, since 1x1 convolution doesn't change the shape of the input tensor. padding isn't used to reduce number of filters, it just adds zeros to the border of each patch so that when you apply convolution you will end up with the same height and width (for images). without padding each convolutional operation will make height and width of your image slightly smaller (for example, 3x3 will remove two pixels per each row and column) — itdxer, Oct 02 '19 at 13:24
Yes padding does not change the number of filters but using Same padding and desired number of filters you can get desired dimensions without using 1x1 convolutions. So my question is why 1x1 convolutions? — Kaushal28, Oct 02 '19 at 13:30

score 1 · Answer 1 · answered Oct 02 '19 at 13:29

Yes, you can use filters of any size to reduce the dimensionality.

The main idea behind using a 1x1 convolution is that we do not try to capture spatial information "here", but leave that task to the following layers. Under this idea, a convolution layer with a larger filter size is redundant and wastes parameters.
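To make "wastes parameters" concrete, here is a minimal sketch that counts the learnable parameters of a convolution layer for different filter sizes. The channel counts (256 in, 64 out), the 32x32 input, and the helper names `conv_params` and `out_size` are assumptions chosen for illustration; they do not come from the question.

```python
def conv_params(kernel_size, in_channels, out_channels, bias=True):
    """Learnable parameters in a 2D convolution with a square kernel."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def out_size(in_size, kernel_size, padding):
    """Output height/width for stride 1 with 'same' or 'valid' padding."""
    if padding == "same":
        return in_size                  # zero padding keeps the spatial size
    return in_size - kernel_size + 1    # 'valid': the border is lost


# Hypothetical layer: reduce 256 channels to 64 on a 32x32 feature map.
in_channels, out_channels, size = 256, 64, 32

for k in (1, 3, 5):
    print(
        f"{k}x{k}: params={conv_params(k, in_channels, out_channels)}, "
        f"same={out_size(size, k, 'same')}, valid={out_size(size, k, 'valid')}"
    )
```

All three layers produce the same 64 output channels, and with "same" padding the same spatial size, but the weight count grows with the square of the filter size. The 1x1 layer is also the only one that keeps the spatial size without any padding.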

However, you might wonder what happens if you use convolution layers with a larger filter size to reduce the dimensionality. I would say that in most cases it won't hurt, and it will probably give you some minor benefits (larger receptive fields, more non-linear operations). But keep in mind that there is always a trade-off between accuracy and computation cost, especially when the improvement is limited.
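The computation side of that trade-off can be estimated by counting multiply-accumulate operations (MACs). The sketch below compares a direct 5x5 convolution with the pattern described in the question, where a 1x1 convolution first reduces the channels and the 5x5 convolution is applied afterwards. The feature-map size (28x28), the channel counts, and the helper name `conv_macs` are illustrative assumptions, and the count assumes stride 1 with "same" padding and ignores biases.

```python
def conv_macs(height, width, kernel_size, in_channels, out_channels):
    """Multiply-accumulates for a stride-1, 'same'-padded 2D convolution."""
    per_output_value = kernel_size * kernel_size * in_channels
    return height * width * out_channels * per_output_value


# Hypothetical setting: 28x28 feature map, 256 channels in and out.
h, w, c = 28, 28, 256
reduced = 64  # assumed bottleneck width

# Option A: apply the 5x5 convolution directly on all 256 channels.
direct = conv_macs(h, w, 5, c, c)

# Option B: reduce to 64 channels with 1x1, then apply 5x5 back to 256.
bottleneck = conv_macs(h, w, 1, c, reduced) + conv_macs(h, w, 5, reduced, c)

print(f"direct 5x5:       {direct:,} MACs")
print(f"1x1 then 5x5:     {bottleneck:,} MACs")
print(f"ratio (direct/B): {direct / bottleneck:.2f}")
```

Under these assumed sizes the 1x1 reduction itself is a small fraction of the total cost, which is why it is a cheap way to shrink the input of a more expensive layer. Whether the saving outweighs any loss in accuracy depends on the network and the task.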
