When we compare a model's accuracy on the training and testing data, a large difference is a good indicator that the model might be overfitting. But how large does this difference have to be? Is there any rule of thumb for it? What size of gap do you consider alarming (and why)?
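For concreteness, here is a minimal sketch of the kind of comparison I mean (scikit-learn with a synthetic dataset, purely as an illustration; the model and data are arbitrary):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic data, purely for illustration
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# A flexible model that can easily memorize the training set
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

train_acc = model.score(X_train, y_train)  # accuracy on the training data
test_acc = model.score(X_test, y_test)     # accuracy on the held-out data
gap = train_acc - test_acc                 # how large must this be to worry?

print(f"train accuracy: {train_acc:.3f}")
print(f"test accuracy:  {test_acc:.3f}")
print(f"gap:            {gap:.3f}")
```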
I read a few discussions on this, such as "Train vs Test Error Gap and its relationship to Overfitting: Reconciling conflicting advice" and "How can training and testing error comparisons be indicative of overfitting?", but none of them gave me the answer.