I have data on every species in a genus and am interested in how two factors relate to each other specifically within that genus. Should I use a statistical test for this, or does that not make sense? I'm thinking it doesn't, because the point of a null-hypothesis significance test (NHST) is to see how confident you can be that the pattern in your sample isn't due to chance, but I have the full population, so I know the true values. Is this correct, or should I be running a test for other reasons?
-
It rather depends on what data you are talking about. Take for example the Brassica genus (cabbages and mustards): one of the species is Brassica oleracea and if you had an example of that species then in many respects it might not be the same as other examples from the same species (cultivars include cabbage, broccoli, cauliflower, kale, Brussels sprouts, collard greens, Savoy cabbage, kohlrabi, and gai lan). – Henry Apr 18 '23 at 15:44
-
You do not have a population, full stop. – whuber Apr 18 '23 at 15:44
-
I think I will ask this again, but in a better way/give more details. – Picapica May 09 '23 at 09:45
1 Answer
The point of statistical inference of any kind, be it hypothesis testing, confidence intervals, or Bayesian methods, is to use the available data (the known) to infer something about the larger population from which the data are drawn (the unknown), and to quantify the uncertainty in that inference, since you are dealing with the unknown.
If you only have the known, you do not have to infer anything about the unknown, and hypothesis testing is superfluous.
That said, it would be unusual to have the entire population under study. Indeed, the full “population” is typically a data-generating process (DGP) that is infinite. It might be that you really do have the entire population you want to study, but even if you do, it is possible that you really want to know about the DGP, and you can use the data to draw inferences about the DGP.
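To make the distinction concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the eight trait values are made up for a hypothetical genus, and the permutation test is just one of several ways to ask about a DGP, not a recommendation for any particular dataset. The correlation `r` describes the species you have and needs no test; the p-value only means something if you treat those species as one realization of an underlying process.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(x, y, n_perm=5000, seed=0):
    """Two-sided permutation p-value for the null of no association.

    This treats the observed pairs as draws from a data-generating
    process, which is an extra modelling assumption.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuffled)
        if abs(pearson_r(x, y_shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: one value per species, for ALL species in the genus.
body_mass = [1.2, 1.9, 2.4, 3.1, 3.3, 4.0, 4.8, 5.5]
clutch_size = [6.0, 5.5, 5.8, 4.9, 4.1, 4.4, 3.2, 3.0]

# Descriptive: a fact about these species, with no sampling uncertainty.
r = pearson_r(body_mass, clutch_size)

# Inferential: only meaningful as a statement about the DGP.
p = permutation_p_value(body_mass, clutch_size)

print(f"correlation within the genus: {r:.3f}")
print(f"permutation p-value (DGP view): {p:.4f}")
```

If the question is only "how do these two factors relate across the species in this genus?", `r` is the complete answer. The p-value becomes relevant only once the question shifts to the process that produced those species.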