I ran an experiment in which an object is tracked, and its velocity is recorded over 100 time steps. A model for this object says the velocities should be Rayleigh distributed.

Question 1: For the given data, how can one estimate the parameters of the Rayleigh distribution?

Question 2: After estimating the parameters, how can we calculate the maximum percentage difference between the observed data and the estimated model?

The data:

4.1278
3
4.3681
3.4993
3.5099
3.4569
3.7245
3.1744
2.7477
4.1866
3.1257
4.6310
4.4171
3.8297
3.9558
4.4083
4.2123
4.4593
4.3426
3.7140
3.2105
2.5694
2.0113
1.9838
2.7958
2.4362
2.4558
1.0497
0.5789
0.4487
1.4811
1.5465
1.2266
1.628
1.9165
1.3497
2.4598
1.706
2.3716
2.3808
3.9231
2.2745
1.5667
1.7846
2.5792
1.2635
1.7394
1.9063
2.1777
2.8
1.8492
3.7035
2.7328
3.2436
2.7707
2.4044
2.5395
2.7072
2.0354
3.6457
2.8088
2.5456
2.4235
1.6
2.4023
2.9129
4.6793
3.7958
2.8100
1.9567
1.649
1.3688
1.1426
1.2340
0.37162
0.81565
1.5374
1.1495
1.6362
2.0497
1.6581
0.32883
0.4272
0.89617
1.4784
1.2486
1.783
3.52
4.486
3.1934
3.2815
1.2387
1.5428
1.7167
1.4661
1.6412
3.4185
2.3761
3.2665
3.6833

As a hack, I computed the mean and standard deviation as if the data were normal. Although I know the Rayleigh scale is not the same thing as the squared standard deviation, I used the squared standard deviation as the "scale".

Then I ran a two-sample K-S test on (1) the observed data and (2) the expected values of a Rayleigh distribution with that mean and scale (the scale set, incorrectly, from the standard deviation) to find D-max. However, while D-max is acceptable, the p-value is low. So I hope you can help me find a statistically robust method to estimate the scale.
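For reference, a minimal sketch of two standard estimators of the Rayleigh scale, assuming the usual parameterisation $f(x;\sigma) = \frac{x}{\sigma^2} e^{-x^2/(2\sigma^2)}$ for $x \ge 0$. The maximum-likelihood estimate is $\hat\sigma = \sqrt{\frac{1}{2n}\sum x_i^2}$, and the method-of-moments estimate follows from $E[X] = \sigma\sqrt{\pi/2}$. The function names are illustrative, and `sample` holds only the first ten values listed above, not the full data set.

```python
import math

def rayleigh_scale_mle(data):
    """Maximum-likelihood estimate of sigma: sqrt(sum(x^2) / (2 n))."""
    n = len(data)
    if n == 0:
        raise ValueError("data must not be empty")
    return math.sqrt(sum(x * x for x in data) / (2 * n))

def rayleigh_scale_moments(data):
    """Method-of-moments estimate of sigma: mean(x) * sqrt(2 / pi)."""
    n = len(data)
    if n == 0:
        raise ValueError("data must not be empty")
    return (sum(data) / n) * math.sqrt(2.0 / math.pi)

# First ten observations from the post (illustration only, not the full data).
sample = [4.1278, 3.0, 4.3681, 3.4993, 3.5099,
          3.4569, 3.7245, 3.1744, 2.7477, 4.1866]

sigma_mle = rayleigh_scale_mle(sample)
sigma_mom = rayleigh_scale_moments(sample)
print(f"MLE scale:               {sigma_mle:.4f}")
print(f"Method-of-moments scale: {sigma_mom:.4f}")
```

If the two estimates differ noticeably on the full data, that in itself may hint that the Rayleigh model does not describe the data well.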

Now, it is likely that my data will result in a good fit. However, even if it's just out of academic interest, can we try and solve this problem anyway?

Not critical, but if any of you have an approach to answer the aforementioned questions in Python, that would be of interest to me as well.
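One possible reading of Question 2 is sketched below, under the assumption that "maximum percentage difference" means the largest gap between the empirical CDF and the fitted Rayleigh CDF (the one-sample K-S distance, D-max), expressed in percentage points. Because the scale is estimated from the same data, the usual K-S p-value does not apply, so the sketch calibrates D-max with a parametric bootstrap instead. It uses only the Python standard library; all function names are illustrative, and `sample` is just the first ten values from the post.

```python
import math
import random

def rayleigh_scale_mle(data):
    """Maximum-likelihood estimate of the Rayleigh scale sigma."""
    return math.sqrt(sum(x * x for x in data) / (2 * len(data)))

def rayleigh_cdf(x, sigma):
    """Rayleigh CDF: 1 - exp(-x^2 / (2 sigma^2)) for x > 0, else 0."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-x * x / (2.0 * sigma * sigma))

def ks_distance(data, sigma):
    """Largest absolute gap between the empirical CDF and the fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = rayleigh_cdf(x, sigma)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def bootstrap_p_value(data, n_boot=2000, seed=0):
    """Parametric bootstrap: refit sigma on each simulated sample."""
    rng = random.Random(seed)
    n = len(data)
    sigma_hat = rayleigh_scale_mle(data)
    d_obs = ks_distance(data, sigma_hat)
    exceed = 0
    for _ in range(n_boot):
        # Inverse-CDF sampling: x = sigma * sqrt(-2 ln(1 - u)), u in [0, 1).
        sim = [sigma_hat * math.sqrt(-2.0 * math.log(1.0 - rng.random()))
               for _ in range(n)]
        if ks_distance(sim, rayleigh_scale_mle(sim)) >= d_obs:
            exceed += 1
    return d_obs, (exceed + 1) / (n_boot + 1)

# First ten observations from the post (illustration only, not the full data).
sample = [4.1278, 3.0, 4.3681, 3.4993, 3.5099,
          3.4569, 3.7245, 3.1744, 2.7477, 4.1866]

d_max, p_value = bootstrap_p_value(sample)
print(f"D-max: {100 * d_max:.1f} percentage points, bootstrap p-value: {p_value:.3f}")
```

Refitting the scale inside each bootstrap replicate is the key design choice: it makes the simulated D-max values subject to the same estimation step as the observed one.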

P.S.: Just to be sure I wasn't violating any rule, I went through the overview of the site, and it seems like my question is within the rules of the site. However, if there is something wrong, please assist me by making appropriate suggestions.

  • For Question 2, is the QQ-plot a valid approach? – troisquatre Jun 30 '18 at 10:02
  • 1. The question begins in a rather text-book-exercise-like fashion. Is this for a class? An assignment, perhaps? 2. If you want to add to your question, edit your question, but you should explain both what kind of QQ plot you're referring to and how you plan to use it to answer the question. 3. Note that there are many possible ways to estimate parameters. Is there some kind of estimator you have in mind? 4. The Rayleigh distribution probably won't be a very good fit to these data. – Glen_b Jul 01 '18 at 08:09
  • @Glen_b: Thanks for your reply. A2: For data that's not normally distributed, there aren't different kinds of QQ-plots, right (ref1, ref2)? So, if we want to examine how close our data comes to a Rayleigh distribution, we would simply use the QQ-plots, correct? – troisquatre Jul 01 '18 at 11:03
  • 1. It wasn't clear that you were plotting against Rayleigh scores; for all I know you could have been plotting against normal scores (we've had questions where people have done things like that), or you might have been doing something else, like a QQ plot of a pair of samples; it's best to make sure exactly what you did and how. 2. Note that the tour doesn't cover what's on topic, for example. 3. Note that the KS test isn't suitable if you have estimated parameters. 4. Would you please address the remaining questions above? – Glen_b Jul 01 '18 at 11:40
  • @Glen_b, thank you for your input about the KS test and the site. I should have addressed everything you requested, but if something remains, please let me know. To clarify: the data comes from an experiment I conducted where I track a moving object. – troisquatre Jul 01 '18 at 12:13
  • 1. Thanks; I also asked if you had a particular kind of estimator in mind (there are many reasonable estimators; though I'd probably go with MLE unless there's likely to be some contamination). 2. To further clarify on the QQ plots thing -- your links were mostly to normal QQ plots, but that's not suitable for assessing whether you have Rayleigh data. You'd want a QQ plot specifically designed for data from a Rayleigh distribution. (The answer shows one way to do this, though.) – Glen_b Jul 01 '18 at 13:18
  • @Glen_b, I wasn't able to confidently choose an appropriate estimator for a $\chi_k$ distribution. For instance, why is MLE the best option? Please note, I added a "meta style" comment, feel free to delete it if needed. – troisquatre Jul 01 '18 at 13:36
  • A Rayleigh distribution looks like a poor fit--there are too many small values in it. Perhaps a better model would be that the observations are made with a certain amount of error, whence their distribution would reflect a sum of a Rayleigh variable and the error. – whuber Jul 02 '18 at 14:08
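
The Rayleigh-specific QQ plot discussed in the comments can be sketched as follows. This is a minimal illustration, not the method from any answer: it pairs the sorted observations with unit-scale Rayleigh quantiles $Q(p) = \sqrt{-2\ln(1-p)}$ at plotting positions $p_i = (i - 0.5)/n$ (one common choice among several). Since $\sigma$ is a pure scale parameter, Rayleigh data should fall near a straight line through the origin whose slope is roughly $\sigma$. The function names are illustrative, and `sample` is only the first ten values from the post.

```python
import math

def rayleigh_quantile(p, sigma=1.0):
    """Rayleigh quantile function: sigma * sqrt(-2 ln(1 - p)), 0 <= p < 1."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must lie in [0, 1)")
    return sigma * math.sqrt(-2.0 * math.log(1.0 - p))

def rayleigh_qq_pairs(data):
    """Return (theoretical unit-scale quantile, observed value) pairs."""
    xs = sorted(data)
    n = len(xs)
    # Plotting positions (i - 0.5)/n for i = 1..n, written with 0-based i.
    return [(rayleigh_quantile((i + 0.5) / n), x) for i, x in enumerate(xs)]

def slope_through_origin(pairs):
    """Least-squares slope of observed vs. theoretical, forced through 0."""
    return sum(q * x for q, x in pairs) / sum(q * q for q, _ in pairs)

# First ten observations from the post (illustration only, not the full data).
sample = [4.1278, 3.0, 4.3681, 3.4993, 3.5099,
          3.4569, 3.7245, 3.1744, 2.7477, 4.1866]

pairs = rayleigh_qq_pairs(sample)
for q, x in pairs:
    print(f"theoretical {q:.4f}   observed {x:.4f}")
print(f"slope through origin (rough scale estimate): {slope_through_origin(pairs):.4f}")
```

The pairs can be passed to any plotting library as a scatter plot; systematic curvature away from the fitted line, for example at the small values, would indicate departures from the Rayleigh shape.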