-1

I have 25 integers and I would like to find how many standard deviations each of them is from the mean. Apparently, the normal distribution is not applicable here, so I have to use the t-distribution instead. My problem is that I am not sure how to apply the t-distribution. Let's say that I have already computed t and s according to this link. What is the next step?

EDIT

Here is a sample row:

array([  1.,   0.,   0.,   4.,   0.,   1.,   0.,   0.,   0.,   0.,   3.,
         0.,   2.,   1.,   0.,   0.,   3.,   0.,   3.,   0.,  14.,   0.,
         2.,   0.,   4.])

Normally, values range from 0 to about 20. When I see unusually high numbers, I filter out the whole row. FYI, the following histogram shows what the real distribution looks like:

[Image: histogram of the distribution of values]

user2295350

2 Answers

2

The question is puzzling.

(value $-$ mean) / SD is a descriptive calculation that is possible so long as the SD is positive, which is usually the case (the SD is zero only when all values are identical). It doesn't require that the data follow any specific probability distribution.
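As a minimal sketch of that descriptive calculation, the following applies it to the sample row from the question using only the Python standard library. The function name `z_scores` is hypothetical, and the choice of the sample SD (denominator n - 1) rather than the population SD is an assumption; neither comes from the original post.

```python
import statistics

# Sample row copied from the question.
row = [1, 0, 0, 4, 0, 1, 0, 0, 0, 0, 3,
       0, 2, 1, 0, 0, 3, 0, 3, 0, 14, 0,
       2, 0, 4]

def z_scores(values):
    """Return (value - mean) / SD for each value (hypothetical helper)."""
    mean = statistics.mean(values)
    # Assumption: sample SD (n - 1 denominator).
    # Swap in statistics.pstdev for the n denominator.
    sd = statistics.stdev(values)
    if sd == 0:
        # All values identical: the ratio is undefined.
        raise ValueError("SD is zero; z-scores are undefined")
    return [(x - mean) / sd for x in values]

z = z_scores(row)
for value, score in zip(row, z):
    print(f"{value:>3}  {score:+.2f}")
```

No distributional assumption is used anywhere here. The value 14 simply gets the largest score in the row, which is a description of the data and not yet a probability statement.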

If you want to attach a probability, cumulative or otherwise, to it, then that will require a distribution assumption, but the possibilities aren't limited to normal or $t$ distributions. In fact, as your data are integers (and nothing more said) some quite different distribution may be more appropriate.
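To illustrate what attaching a probability involves, here is a sketch that assumes a Poisson distribution for the counts. That choice is purely an illustrative assumption, not something the answer recommends, and the helper name `poisson_upper_tail` is hypothetical. The sample row in the question has mean 1.52 and sample variance of about 8.7, so a Poisson model (where mean and variance are equal) may well fit poorly.

```python
import math

def poisson_upper_tail(k, mu):
    """Return P(X >= k) for X ~ Poisson(mu), as 1 minus the lower tail."""
    lower = sum(math.exp(-mu) * mu**i / math.factorial(i) for i in range(k))
    return 1.0 - lower

# Assumption: use the mean of the question's sample row as the Poisson rate.
mu_hat = 1.52

# Probability of seeing a count of 14 or more under the assumed model.
p_tail = poisson_upper_tail(14, mu_hat)
print(f"P(X >= 14 | Poisson({mu_hat})) = {p_tail:.3g}")
```

The resulting probability depends entirely on the assumed distribution. A different model for non-negative integers would give a different answer for the same value.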

Nick Cox
  • Hi, thank you very much for your reply. Could you please elaborate a little? What other sorts of distributions do you think I could apply? I have already tried `import scipy.stats as st` followed by `st.mstats.zscore(arr, axis=1)`, but unfortunately the results I am getting are skewed towards the positive side. Any ideas? – user2295350 Feb 09 '15 at 16:19
  • You show us your data and I will try to add more. At present the flavour of "please tell me how to work with data I am not showing you" allows only advice like "choose an appropriate distribution". More generally, constraints on values such as possible minimum and maximum are crucial to advice here. – Nick Cox Feb 09 '15 at 16:23
  • Hi, I have updated my original question. Please have a look :) – user2295350 Feb 09 '15 at 16:44
  • Your question now looks completely different. It seems that you know what the underlying distribution is, or should be (?), and that you are drawing one or more samples of size 25 (?). I don't think I want to add anything, as what you want is becoming increasingly unclear to me. I don't see how the t distribution enters at all. – Nick Cox Feb 09 '15 at 17:00
0

t itself is the number of standard deviations a given value x lies away from the mean. The sign of t tells you in which direction x lies away from the mean.

Also look at its Wikipedia entry.

Ayalew A.