I have a binned data set of continuous measurements. The 1000 observations are summarized in only 29 rows and two columns: the range (the starting point of each bin) and the count.
Pearson-Weldon crabs dataset
| range | count |
|---|---|
| 0.5835 | 1 |
| 0.5875 | 3 |
| 0.5915 | 5 |
| 0.5955 | 2 |
| 0.5995 | 7 |
| 0.6035 | 10 |
| 0.6075 | 13 |
| 0.6115 | 19 |
| 0.6155 | 20 |
| 0.6195 | 25 |
| 0.6235 | 40 |
| 0.6275 | 31 |
| 0.6315 | 60 |
| 0.6355 | 62 |
| 0.6395 | 54 |
| 0.6435 | 74 |
| 0.6475 | 84 |
| 0.6515 | 86 |
| 0.6555 | 96 |
| 0.6595 | 85 |
| 0.6635 | 75 |
| 0.6675 | 47 |
| 0.6715 | 43 |
| 0.6755 | 24 |
| 0.6795 | 19 |
| 0.6835 | 9 |
| 0.6875 | 5 |
| 0.6915 | 0 |
| 0.6955 | 1 |
Total count = 1000; rows = 29 (30 including the header)
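
The totals can be checked directly from the table. The snippet below is only a sketch in Python: it assumes the `range` column is the lower edge of evenly spaced bins of width 0.004 (which matches every row above), and it places each observation at its bin midpoint to get rough summary statistics. The names `starts`, `counts` and `mids` are illustrative, not part of the original data set.

```python
import math

# "range" column: lower edge of each bin, evenly spaced by 0.004 in the table.
starts = [round(0.5835 + 0.004 * i, 4) for i in range(29)]
counts = [1, 3, 5, 2, 7, 10, 13, 19, 20, 25, 40, 31, 60, 62, 54,
          74, 84, 86, 96, 85, 75, 47, 43, 24, 19, 9, 5, 0, 1]

n_bins = len(counts)
total = sum(counts)
print(n_bins, total)  # 29 1000

# Assumption: every observation sits at the midpoint of its bin.
mids = [s + 0.002 for s in starts]
mean = sum(n * m for n, m in zip(counts, mids)) / total
sd = math.sqrt(sum(n * (m - mean) ** 2 for n, m in zip(counts, mids)) / (total - 1))
print(round(mean, 4), round(sd, 4))
```

The midpoint mean and standard deviation are only approximations, but they make reasonable starting values for any fit to the binned counts.
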
Which test is most appropriate?
- Pearson Chi-Square
- Shapiro-Wilk
`nlm` instead of `optim` because there are two parameters to estimate. You also need to take care with any low-count bins: they can screw up the chi-squared approximation. Handle this by using `chisq.test` with its `simulate.p.value` option to perform this test once you have obtained the MLE and used that to compute the estimated bin counts. – whuber Jun 23 '22 at 20:57

https://rdrr.io/cran/MixtureInf/man/pearson.html I have now edited the question to include the real data. – frhack Jun 23 '22 at 21:00

https://rpubs.com/frapas/911603 – frhack Jun 28 '22 at 19:31
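
The workflow described in the first comment (fit by maximum likelihood, compute estimated bin counts, then run a chi-squared test with a simulated p-value) can be sketched roughly as follows. This is a hedged illustration, not the commenter's code. It is written in Python with the standard library only, so the R functions are replaced by stand-ins: a crude coordinate search plays the role of `nlm`, and a Monte Carlo loop imitates what `simulate.p.value` does. It also assumes a normal model (two parameters), treats the `range` column as lower bin edges of width 0.004, and lets the two outer bins extend to infinity so the bin probabilities sum to one.

```python
import math
import random

# Lower edge of each bin (the "range" column) and the observed counts.
starts = [round(0.5835 + 0.004 * i, 4) for i in range(29)]
counts = [1, 3, 5, 2, 7, 10, 13, 19, 20, 25, 40, 31, 60, 62, 54,
          74, 84, 86, 96, 85, 75, 47, 43, 24, 19, 9, 5, 0, 1]
total = sum(counts)
cuts = starts[1:]  # interior cut points; the two outer bins are open-ended


def normal_cdf(x, mu, sigma):
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))


def bin_probs(mu, sigma):
    """Probability of each bin under Normal(mu, sigma)."""
    cdf = [0.0] + [normal_cdf(c, mu, sigma) for c in cuts] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


def neg_log_lik(params):
    """Negative multinomial log-likelihood; params = (mu, log sigma)."""
    probs = bin_probs(params[0], math.exp(params[1]))
    return -sum(n * math.log(max(p, 1e-300)) for n, p in zip(counts, probs))


def fit_binned_normal(mu0, sigma0):
    """Crude coordinate search, standing in for a real optimizer."""
    params, steps = [mu0, math.log(sigma0)], [sigma0 / 2, 0.5]
    best = neg_log_lik(params)
    for _ in range(10_000):
        if max(steps) < 1e-9:
            break
        improved = False
        for j in range(2):
            for sign in (1.0, -1.0):
                trial = list(params)
                trial[j] += sign * steps[j]
                value = neg_log_lik(trial)
                if value < best:
                    params, best, improved = trial, value, True
        if not improved:
            steps = [s / 2 for s in steps]
    return params[0], math.exp(params[1])


# Starting values from bin midpoints, then the binned MLE.
mids = [s + 0.002 for s in starts]
mu0 = sum(n * m for n, m in zip(counts, mids)) / total
sigma0 = math.sqrt(sum(n * (m - mu0) ** 2 for n, m in zip(counts, mids)) / total)
mu_hat, sigma_hat = fit_binned_normal(mu0, sigma0)
expected = [total * p for p in bin_probs(mu_hat, sigma_hat)]


def chi_square(observed):
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


stat_obs = chi_square(counts)

# Monte Carlo p-value: draw multinomial samples from the fitted bin probabilities.
rng = random.Random(0)
n_sim = 999
weights = [e / total for e in expected]
exceed = 0
for _ in range(n_sim):
    sim = [0] * len(counts)
    for k in rng.choices(range(len(counts)), weights=weights, k=total):
        sim[k] += 1
    if chi_square(sim) >= stat_obs:
        exceed += 1
p_value = (1 + exceed) / (n_sim + 1)

print(f"mu = {mu_hat:.4f}, sigma = {sigma_hat:.4f}")
print(f"X^2 = {stat_obs:.2f}, simulated p = {p_value:.3f}")
```

One caveat on this sketch: the simulated samples are compared against fixed fitted probabilities, and the parameters are not re-estimated for each draw, so the resulting p-value tends to be somewhat conservative. Refitting inside the loop (a parametric bootstrap) would address that at the cost of more computation.
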