
I have a dataset with 983 rows and two columns: Rent and Postcode.

The mean Rent for the entire sample (983 rows) is 817.49. The mean Rent for the N1 postcode (23 rows) is 887.02.

The N1 data is a subset of my entire sample; is it still possible to compare the means? How would I test whether Rent in N1 is significantly higher than average?

The statistical tests I have come across so far either rely on independent samples (I assume these are not independent as one is a subset of another), or dependent samples measured across time (which these are not).

  • You can compare them if you separate the sets beforehand: N1 becomes one group, and all the remaining rows become the group you compare it against. – user2974951 Oct 17 '18 at 10:47

1 Answer


You need the average separately for the two groups (N1 and Not N1). From the information you posted, the mean for Not N1 can be calculated as $$ \frac{817.49 \cdot 983 - 887.02 \cdot 23}{983-23}. $$ The two groups no longer overlap, so you can then use the independent samples t-test (or some other test for independent samples) to compare N1 with Not N1.
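As a rough sketch of that calculation in Python: the function names `mean_of_rest` and `welch_t` are made up for illustration, and the two standard deviations are placeholder values, since the question does not report any. Welch's unequal-variance form of the t-test is used here as one reasonable choice for groups of such different sizes, not as the only option.

```python
import math

def mean_of_rest(total_mean, total_n, sub_mean, sub_n):
    """Mean of the rows NOT in the subgroup, recovered from the overall totals."""
    return (total_mean * total_n - sub_mean * sub_n) / (total_n - sub_n)

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1's mean
    v2 = sd2 ** 2 / n2  # squared standard error of group 2's mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Figures taken from the question
rest_mean = mean_of_rest(817.49, 983, 887.02, 23)
print(round(rest_mean, 2))  # 815.82

# ASSUMPTION: placeholder standard deviations, NOT from the question.
# Replace them with the values computed from the actual Rent column.
sd_n1, sd_rest = 150.0, 200.0
t, df = welch_t(887.02, sd_n1, 23, rest_mean, sd_rest, 960)
print(t, df)
```

With the raw rows available, it is simpler to split the Rent column by Postcode and pass the two groups to a library t-test directly. Because the question asks whether N1 is higher, a one-sided alternative is the natural choice.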