I've noticed that few sources actually explain why the variation between groups is close to the variation within groups under the null hypothesis. For simple (one-way) ANOVA, the intuitive explanation is:
Variation within is simply an estimate of the population variance, obtained by averaging the variances of the individual groups (weighted by their degrees of freedom when the group sizes differ). Under the assumption that the variances of all groups are equal, this pooled average is our best shot at estimating the population variance, and it stays valid whether or not the group means differ.
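To make the "averaging the group variances" idea concrete, here is a minimal sketch with NumPy. The helper name `mean_square_within` and the three toy groups are made up for illustration. It computes the within mean square the usual way (pooled squared deviations divided by N - k) and compares it with the plain average of the group variances, which should agree when all groups have the same size.

```python
import numpy as np

def mean_square_within(groups):
    """Pooled within-group variance: squared deviations from each group's
    own mean, summed over all groups and divided by N - k."""
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_within = sum(len(g) for g in groups) - len(groups)
    return ss_within / df_within

# Toy data (illustrative only): three groups of equal size.
groups = [
    np.array([1.0, 2.0, 3.0]),
    np.array([2.0, 4.0, 6.0]),
    np.array([5.0, 5.0, 8.0]),
]

msw = mean_square_within(groups)
# With equal group sizes, MSW is just the mean of the sample variances.
avg_var = np.mean([g.var(ddof=1) for g in groups])
print(msw, avg_var)
```

With unequal group sizes the two numbers would differ slightly, because the pooled version gives larger groups more weight.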
As to variation between, it's an interesting one. Under the null hypothesis we assume that all groups come from the same population, so each group is a separate sample with its own mean. The variance of a sample mean equals the population variance divided by the sample size, so the variance of the group means multiplied by the sample size (var(sample_mean) * sample_size) is itself an estimator of the population variance. If all the groups really come from the same population, this estimator will tend to be close to the population variance. Otherwise, the group means are spread out further than sampling noise alone would explain, so it will OVERESTIMATE the population variance, which leads to a higher mean square between and thus to a higher F-statistic.
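A quick simulation can show this behaviour. The sketch below is only an illustration under assumed settings (5 groups of 20 normal observations, standard deviation 2, so the population variance is 4); the function names and all the numbers are my own choices. It averages both mean squares over many simulated datasets, once with identical group means and once with one group shifted.

```python
import numpy as np

rng = np.random.default_rng(0)
k, n, sigma = 5, 20, 2.0  # assumed: 5 groups, 20 observations each, variance 4

def mean_squares(data):
    """data has shape (k, n); returns (MS between, MS within) for equal group sizes."""
    n_per_group = data.shape[1]
    # var(sample_mean) * sample_size
    ms_between = n_per_group * data.mean(axis=1).var(ddof=1)
    # average of the individual group variances
    ms_within = data.var(axis=1, ddof=1).mean()
    return ms_between, ms_within

def simulate(group_means, reps=5000):
    """Average MS between and MS within over many simulated datasets."""
    loc = np.asarray(group_means, dtype=float)[:, None]  # one mean per row
    results = np.array([
        mean_squares(rng.normal(loc=loc, scale=sigma, size=(k, n)))
        for _ in range(reps)
    ])
    return results.mean(axis=0)

# Null hypothesis true: every group has the same mean.
null_msb, null_msw = simulate([0.0] * k)
# Null hypothesis false: the last group is shifted by 2.
alt_msb, alt_msw = simulate([0.0, 0.0, 0.0, 0.0, 2.0])

print("null:", null_msb, null_msw)
print("alt: ", alt_msb, alt_msw)
```

Under the null, both averages should land near the population variance of 4. With the shifted group, the within mean square should stay near 4 while the between mean square comes out much larger (its theoretical expectation with these settings is 20), which is what pushes the F-statistic up.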
More on that here:
https://onlinestatbook.com/2/analysis_of_variance/one-way.html