Knowing that an FFT is faster when the number of samples is a power of 2, I have always zero-padded the inputs to Matlab's fft up to the next power of 2. Matlab's function fftfilt, which implements an FIR filter with the overlap-add method, does the same, selecting the FFT size as 2^nextpow2(FIR_coeff + input_signal - 1), where FIR_coeff and input_signal stand for the respective lengths.
However, I realize that if the amount of padding is too large compared to the original size, this may actually be inefficient. For instance, suppose a 130000-sample FFT were to be computed (assuming one huge and impractical single FFT). The next power of 2 after 130000 is 131072, so padding with 1072 zeros may well be more efficient than performing the 130000-sample FFT directly. The worst case is having (power of 2) + 1 samples and padding to the next power of 2, especially if the number of samples is large. In my example, 131073 samples, whose next power of 2 is 262144, would require adding 131071 zeros! Starting at what ratio of padding size to original size is it better not to pad? I'm looking for a practical, approximate rule, something like: do not pad if you are adding more than 5% of the original size in zeros.
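To make the numbers in my two examples concrete, here is a small sketch in Python rather than Matlab. The helper names next_pow2 and padding_overhead are only illustrative, not part of any library. It computes how many zeros get added and what fraction of the original length that is. It does not measure FFT speed, so it says nothing by itself about where the break-even point lies.

```python
def next_pow2(n: int) -> int:
    """Smallest power of 2 that is >= n (n must be a positive integer)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    return 1 << (n - 1).bit_length()


def padding_overhead(n: int):
    """Return (padded length, zeros added, zeros added / original length)."""
    padded = next_pow2(n)
    zeros = padded - n
    return padded, zeros, zeros / n


# The two lengths discussed in the question.
for n in (130000, 131073):
    padded, zeros, ratio = padding_overhead(n)
    print(f"N = {n}: pad to {padded}, adding {zeros} zeros ({ratio:.2%} of N)")
```

For 130000 samples the padding is under 1% of the original length, while for 131073 samples it is almost 100%, which is the case I am worried about.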