I am running a fixed effects regression on a panel dataset of roughly 7M rows. The objective is to measure the effect of some event on the outcome variable, Y, which represents the amount spent on a good, say alcohol, in a given week. I am controlling for Person and Week fixed effects and clustering standard errors by Person and Week as well. The dataset is at the person-week level, and I am regressing Y on a set of 8 dummies, each equal to 1 if a given week is N weeks pre- or post-event.
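In equation form, the specification is roughly the following. The symbols are shorthand introduced here for illustration: i indexes people, t indexes weeks, and K stands for the set of 8 event-time offsets, which I have not listed.

```latex
Y_{it} = \alpha_i + \gamma_t + \sum_{k \in K} \beta_k D^{k}_{it} + \varepsilon_{it}
```

Here \alpha_i and \gamma_t are the Person and Week fixed effects, and D^{k}_{it} equals 1 if week t is k weeks from the event for person i and 0 otherwise.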
The sample contains people who purchased the good once or more, as well as people who never purchased it. The vast majority of the people in the sample, roughly 600K, never purchased the good, while only about 1% of the people in the sample, roughly 6K, purchased it at least once over the period of interest.
Now, to make the two groups slightly more balanced, I decided to drop 200K randomly selected people from among those who never purchased the good. This way, the share of people who purchased the good at least once in the period rose to about 1.5%.
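To make the subsampling step concrete, here is a minimal sketch of what it amounts to. The tuple layout, the function names and the toy panel (scaled down 1000x from the counts above) are assumptions made for illustration, not my actual code or data.

```python
import random


def drop_random_never_purchasers(rows, n_drop, seed=0):
    """Drop every row of `n_drop` randomly chosen people who never purchased.

    `rows` is a list of (person_id, week, y) tuples. A person counts as a
    purchaser if y > 0 in at least one week.
    """
    people = {person for person, _week, _y in rows}
    purchasers = {person for person, _week, y in rows if y > 0}
    never = sorted(people - purchasers)  # sorted so the draw is reproducible
    dropped = set(random.Random(seed).sample(never, n_drop))
    kept = [row for row in rows if row[0] not in dropped]
    return kept, purchasers


def purchaser_share(rows, purchasers):
    """Share of the people present in `rows` who purchased at least once."""
    people = {person for person, _week, _y in rows}
    return len(purchasers & people) / len(people)


# Toy panel: 6 purchasers and 600 never-purchasers, 3 weeks each.
rows = [(p, w, 10.0 if (p < 6 and w == 1) else 0.0)
        for p in range(606) for w in range(3)]

kept, purchasers = drop_random_never_purchasers(rows, n_drop=200)
share_before = purchaser_share(rows, purchasers)
share_after = purchaser_share(kept, purchasers)
print(f"{share_before:.3%} -> {share_after:.3%}")
```

With these toy counts the share moves from 6/606 to 6/406, which matches the move from about 1% to about 1.5% described above. Only never-purchasers are removed, so every person-week row for the purchasers is unchanged.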
However, when I re-ran the regression, the standard errors were much, much larger than the ones in the original model (with the full sample). I am a bit puzzled because the sample is still pretty large: it only went from about 7M rows to about 5M. Other than the slightly higher ratio of people who purchased the good to those who never did, the structure of the data was not altered.
Can anyone explain what is going on?
Any help would be much appreciated. Thanks!