The score difference is the winning score minus the losing score. I'm using data from 1230 NBA games this season.

My thinking is that the winning and losing scores are two independent random variables, since if one team gets 100 points, the other team's chance of getting 100, 101, 102, ... points doesn't really change. Since they are two independent random variables, and the distributions of the winning and losing scores each look fairly normal, I expected the distribution of the score difference to be normal as well. What I got isn't normal, however, and I'm a bit confused as to why.

[Figure: distribution of the winning score]

[Figure: distribution of the losing score]

[Figure: distribution of the score difference]

EDIT: So I thought about it a bit. Is it because the score difference can only be positive, so I'm missing the negative half of its distribution? The last graph does look like the right (positive) half of a normal distribution.

Long Vuong
  • Try subtracting the away team score from the home team score, as you suggested in your edit. – Todd D Apr 15 '18 at 01:35
  • You're not looking at the score of team A minus the score of team B here, but at the higher score (by whichever team) minus the lower score (by whichever team). Rather than the distribution of $Y-X$, you're looking at the distribution of $\max(X,Y) - \min(X,Y)$, or equivalently, $|Y-X|$. The distribution of $Y-X$ might look reasonably close to normal (it obviously isn't, but it can look close) -- but its absolute value typically won't look close to normal; it will be right-skewed. – Glen_b Apr 16 '18 at 07:15
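
The point in the comment above can be checked with a quick simulation. This is only a sketch under stated assumptions: each team's score is drawn as an independent normal with a made-up mean of 105 and standard deviation of 12 (these numbers are illustrative, not estimated from the NBA data), and `skewness` is a small helper written here for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_games = 100_000  # many more than the 1230 real games, to get a smooth picture

# Assumption: each team's score is an independent normal draw.
# Mean 105 and standard deviation 12 are illustrative values only.
x = rng.normal(105, 12, n_games)
y = rng.normal(105, 12, n_games)

signed_diff = y - x                            # team B minus team A
margin = np.maximum(x, y) - np.minimum(x, y)   # winning score minus losing score


def skewness(a):
    """Sample skewness: the mean of the cubed standardized values."""
    z = (a - a.mean()) / a.std()
    return float((z ** 3).mean())


print("max - min equals |y - x|:", np.allclose(margin, np.abs(signed_diff)))
print("skewness of y - x:  ", round(skewness(signed_diff), 3))
print("skewness of |y - x|:", round(skewness(margin), 3))
```

Under these assumptions the signed difference should come out roughly symmetric, while the winner-minus-loser margin is its absolute value and is clearly right-skewed. Plotting a histogram of `signed_diff` next to one of `margin` should show the full bell shape against its folded positive half.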

1 Answer

It is true that the sum (or difference) of two normal variables is itself normal when the two are independent; without independence, that is not guaranteed. Your variables are not independent: if we know that the losing team scored $X$, then we have gained a lot of information about the winning team's score, specifically that it must be more than $X$. Similarly, if we know the winning team's score, we have gained a lot of information about what the losing team's score is not likely to be. Because of this dependence, the resulting distribution is definitely not normal, and it will never be negative.
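
To see the dependence concretely, here is a minimal sketch, assuming the two teams' scores really are independent normal draws (mean 105 and standard deviation 12 are illustrative values, not fitted to the NBA data, and the names `home` and `away` are just labels for this example). Even then, relabelling each game's scores as "winning" and "losing" makes the two relabelled variables correlated.

```python
import numpy as np

rng = np.random.default_rng(1)
n_games = 100_000

# Assumption: the two teams' scores are independent normal draws.
# Mean 105 and standard deviation 12 are illustrative values only.
home = rng.normal(105, 12, n_games)
away = rng.normal(105, 12, n_games)

# Relabel each game by outcome instead of by team.
winning = np.maximum(home, away)
losing = np.minimum(home, away)

corr_teams = np.corrcoef(home, away)[0, 1]
corr_sorted = np.corrcoef(winning, losing)[0, 1]

print("corr(home, away):        ", round(corr_teams, 3))
print("corr(winning, losing):   ", round(corr_sorted, 3))
print("smallest winning - losing:", (winning - losing).min())
```

In this sketch the team-labelled scores should be close to uncorrelated, while the winning and losing scores show a clearly positive correlation, and their difference never drops below zero. Real scores may be dependent for other reasons as well, so this only illustrates the effect of sorting by outcome.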

  • Ooooooh ok. Looking like half the normal distribution is just a coincidence? – Long Vuong Apr 15 '18 at 01:57
  • Most likely - it's not clear what distribution your variable has, but it's probably not a truncated normal distribution. – Louis Cialdella Apr 15 '18 at 01:58
  • Both winning and losing scores will be correlated. This is because the pace of the game determines the total number of points scored. – zbicyclist Apr 15 '18 at 15:52
  • An extension: see https://stats.stackexchange.com/questions/30159/is-it-possible-to-have-a-pair-of-gaussian-random-variables-for-which-the-joint-d for examples showing that the sum of normal variables need not itself be normal. – Sextus Empiricus Apr 16 '18 at 09:56