I have data that is $\mathcal{NB}$-distributed (negative binomial). I have only ~100 data points per sample. I tried to fit a negative binomial to the data, but first I decided to run simulations:
library(MASS)
vect <- rnbinom(100, size=x, mu=y)  # x, y: chosen true parameters
fitdistr(vect, densfun="Negative Binomial")
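To make the simulation concrete, here is a minimal sketch that repeats the fit many times, assuming illustrative true parameters `size = 2` and `mu = 10` in place of `x` and `y` (the repeat count and parameter values are my own choices):

```r
library(MASS)

set.seed(42)
true_size <- 2
true_mu   <- 10

# Repeat the fit to see the sampling variability of each estimate
fits <- replicate(200, {
  vect <- rnbinom(100, size = true_size, mu = true_mu)
  suppressWarnings(fitdistr(vect, "Negative Binomial"))$estimate  # c(size, mu)
})

# Spread of the estimates across simulations
apply(fits, 1, sd)
```

Comparing the two standard deviations (relative to the true values) makes the asymmetry visible: the $\mu$ estimate is tight while the $size$ estimate is much noisier.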
The $\mu$ parameter is estimated quite accurately, but the $size$ estimate can be off by as much as $\pm 50$ (I can provide the exact plot).
Which is better: to use the NB with a poorly estimated $size$ (it does not affect the shape of the distribution much), or to transform the data (sqrt/log/Box-Cox) and assume normality? I could also winsorize under the normal assumption.
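For reference, the second route could look like this sketch: a log transform (with a `log1p` offset to handle zeros, which is an assumption on my part) followed by winsorizing at the 5th/95th percentiles (the cutoffs are also illustrative, not given):

```r
set.seed(42)
vect <- rnbinom(100, size = 2, mu = 10)  # illustrative parameters
x <- log1p(vect)                         # log(1 + x) to accommodate zeros

# Winsorize: clamp the tails at the chosen percentiles
q   <- quantile(x, c(0.05, 0.95))
x_w <- pmin(pmax(x, q[1]), q[2])

shapiro.test(x_w)  # rough check of the normality assumption
```

Whether the transformed, winsorized sample is close enough to normal is exactly what the Shapiro-Wilk check (or a Q-Q plot) would have to decide on the real data.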
The goal: assume we have 100 sets, where each element is a vector of $n$ observations and each observation is an RV $\sim$ NB. I would like to train on these 100 sets and then classify new elements as outliers from the population or as normal.
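One possible shape for that classifier (a sketch under stated assumptions, not the only approach): fit a single NB on the pooled training data, score each training vector by its mean log-likelihood, and flag new vectors whose score falls below a low training quantile. The pooling, the scoring rule, and the 1% threshold are all illustrative choices:

```r
library(MASS)
set.seed(1)

n <- 20                                              # observations per vector
train <- replicate(100, rnbinom(n, size = 2, mu = 10), simplify = FALSE)

# Fit one NB to the pooled training observations
fit <- suppressWarnings(fitdistr(unlist(train), "Negative Binomial"))

# Score a vector by its mean log-likelihood under the fitted NB
score <- function(v) mean(dnbinom(v, size = fit$estimate["size"],
                                  mu = fit$estimate["mu"], log = TRUE))

# Flag vectors scoring below the 1% quantile of training scores
thresh <- quantile(sapply(train, score), 0.01)
is_outlier <- function(v) score(v) < thresh

is_outlier(rnbinom(n, size = 2, mu = 10))   # typical vector
is_outlier(rnbinom(n, size = 2, mu = 40))   # shifted mean
```

Note this scheme sidesteps the $size$-estimation problem somewhat, since pooling all $100 \times n$ observations gives a far larger sample than any single vector.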