In my experience, choosing batch_size = 1 gives the best results, and choosing batch_size equal to the total number of training samples gives the worst. There also seems to be a linear or exponential relationship between these two extremes: choosing a value closer to 1 improves performance, and choosing a value closer to the number of samples degrades it.
So, can we say that batch_size = 1 is the best choice if there are no computational limitations (that is, if you have a powerful computer)?
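
For concreteness, here is a minimal sketch of how such a comparison could be set up. It assumes a toy 1-D linear regression trained with plain mini-batch SGD in pure Python. The helpers `make_data`, `mse`, and `train`, the learning rate, and the epoch count are all hypothetical choices for illustration, not taken from the experiments described above. Note that with a fixed number of epochs, a smaller batch_size also means many more parameter updates, which may account for part of the observed difference.

```python
import random


def make_data(n=200, seed=0):
    """Synthetic data (assumption): y = 3x + 2 plus small Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [3.0 * x + 2.0 + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


def mse(w, b, xs, ys):
    """Mean squared error of the linear model w*x + b on the full dataset."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, batch_size, epochs=5, lr=0.1, seed=0):
    """Mini-batch SGD on a linear model; returns (final MSE, number of updates)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    idx = list(range(len(xs)))
    updates = 0
    for _ in range(epochs):
        rng.shuffle(idx)  # new sample order every epoch
        for start in range(0, len(idx), batch_size):
            batch = idx[start:start + batch_size]
            # Gradient of the squared error, averaged over the batch.
            gw = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            gb = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            w -= lr * gw
            b -= lr * gb
            updates += 1
    return mse(w, b, xs, ys), updates


if __name__ == "__main__":
    xs, ys = make_data()
    # Compare the two extremes and one value in between, same epoch budget.
    for bs in (1, 20, len(xs)):
        loss, updates = train(xs, ys, batch_size=bs)
        print(f"batch_size={bs:3d}  updates={updates:4d}  final MSE={loss:.4f}")
```

A fairer variant of this experiment would hold the number of updates (rather than epochs) constant, or tune the learning rate separately for each batch_size, since both choices can change which setting looks best.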