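My understanding of the Central Limit Theorem is that the distribution of the sample mean tends towards normal as the sample size grows, provided the samples are drawn independently at random (e.g. with replacement) from a population with finite variance.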
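For reference, this is the identity I am leaning on, as far as I understand it: for n independent draws from a single population with finite third moment, the skewness of the sample mean shrinks like 1/√n. The notation (γ₁ for skewness, X̄ₙ for the mean of n draws) is my own.

```latex
\gamma_1\!\left(\bar{X}_n\right) = \frac{\gamma_1(X)}{\sqrt{n}}
```

So with a large n per entity, I would expect the means to look close to symmetric unless the population itself is extremely skewed.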
How reasonable is it to infer that, if I have a set of sample means but don't know whether they all come from the same underlying population, a high skewness in the distribution of those means is (perhaps circumstantial) evidence that the underlying populations are in fact different (or that the sampling wasn't random, etc.)?
In my specific case, I am dealing with a compliance problem where I have data samples from around 1000 different entities. Say it is wastewater samples with various amounts of lead, and there are ~1500 samples per entity. I have plotted the mean lead level for each entity, and the distribution of those means is noticeably right-skewed (R's 'moments' package gives a skewness of 0.36).
The entities will claim that any high sample means are just due to the vagaries of the lead-pollution business. But does the skewness (lack of normality) of the sample means constitute evidence, perhaps weak or circumstantial, that the underlying populations are not identical, and that some locations are systematically (rather than randomly) worse than others? My intuition was that if the entity means were normally distributed, that would fit the claim that some entities are just unlucky (and in practice be an argument against investing extra resources in investigating those entities compared with the others). But I feel like intuition is not a very reliable guide in the jungles of statistical parameter interpretation!
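To make the question concrete, here is a rough sketch of the check I have in mind: simulate the "everyone shares one population" scenario many times and see how often the skewness of the entity means reaches 0.36. Everything about the population is an assumption on my part. I use Exponential(1) (skewness 2) purely as a stand-in, I assume samples within an entity are independent, and `n_reps` and the function names are made up for illustration. Real lead measurements could be far more skewed (e.g. lognormal with a heavy tail), which would push the null skewness up, so this is not a verdict on my actual data.

```python
import math
import random
import statistics


def skewness(values):
    """Moment skewness m3 / m2**1.5 (same convention as R's moments::skewness)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def null_skewness_of_means(n_entities, n_samples, n_reps, rng):
    """Skewness of entity means when every entity shares one population.

    ASSUMPTION: the shared population is Exponential(1) (skewness 2).
    The mean of n such draws is Gamma(shape=n, scale=1) / n, so each
    entity mean is drawn directly instead of simulating n raw samples.
    """
    results = []
    for _ in range(n_reps):
        means = [rng.gammavariate(n_samples, 1.0) / n_samples
                 for _ in range(n_entities)]
        results.append(skewness(means))
    return results


N_ENTITIES = 1000   # from the question
N_SAMPLES = 1500    # from the question
OBSERVED = 0.36     # from the question

rng = random.Random(0)  # fixed seed for repeatability
null = null_skewness_of_means(N_ENTITIES, N_SAMPLES, n_reps=300, rng=rng)

null_mean = statistics.mean(null)
null_sd = statistics.stdev(null)
frac_as_extreme = sum(s >= OBSERVED for s in null) / len(null)
# Population skewness that would by itself give OBSERVED under i.i.d. sampling
required_population_skewness = OBSERVED * math.sqrt(N_SAMPLES)

print(f"null skewness of means: mean={null_mean:.3f}, sd={null_sd:.3f}")
print(f"fraction of null runs with skewness >= {OBSERVED}: {frac_as_extreme:.3f}")
print(f"population skewness needed to explain {OBSERVED}: "
      f"{required_population_skewness:.1f}")
```

The last number is the part I find most useful: it says how skewed a single shared population would have to be for the entities' "just unlucky" story to produce my observed value on its own. Whether that level of skewness is plausible for lead data is exactly what I cannot judge.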

