
Suppose I have a DataFrame with the median home price and the number of homes sold in each market. I'd like to calculate a confidence metric around the median home price based on the number of home sales (transactions).

import pandas as pd

# Create a list of data
data = [
    ["Dallas", 1000, 250000],
    ["Austin", 2000, 300000],
    ["Texas", 3000, 350000],
]

# Create a pandas DataFrame
df = pd.DataFrame(data, columns=["market", "number_of_homes_sold", "median_home_price"])

# Print the DataFrame
print(df)

  • What are some appropriate confidence scores or metrics to calculate based on n (the sample size)?
  • Would I calculate the confidence interval? If so, how?

I have the raw transaction-level data.

kll
  • Do you need a separate confidence interval for each state? – Ute Aug 21 '23 at 21:08
  • Several methods to find confidence intervals for the median from the raw data are discussed for example here: https://stats.stackexchange.com/q/21103/237561 – Ute Aug 21 '23 at 21:18
  • Or are you rather trying to establish a relation between number of houses sold and median price? – Ute Aug 21 '23 at 21:22
  • "# of home sales / transaction" is a confusing metric. Isn't there exactly one home sale per transaction? Or are you studying bulk transactions in some kind of wholesale market? For ways to find confidence intervals for medians, please see Confidence Intervals for Median. – whuber Aug 21 '23 at 21:50
  • @Ute Yes, need confidence interval for each median home price for each market. – kll Aug 21 '23 at 22:32
  • @whuber Correct, there is exactly one home sale per transaction. I am calculating the median home price from the transactions in the market. – kll Aug 21 '23 at 22:34
  • In that case, it looks like the thread I referenced will fully answer your question: apply any of its solutions separately to each market, depending on how you choose to model this and how much data you have. – whuber Aug 21 '23 at 23:02
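
For illustration, here is a minimal sketch of one of the approaches discussed in the linked thread, applied separately to each market: a distribution-free confidence interval for the median built from order statistics and the Binomial(n, 1/2) distribution. The transaction table, the column name `sale_price`, the prices themselves, and the helper `median_ci` are all made up for this example; the real transaction-level data would replace them. Other methods from that thread (the bootstrap, for instance) may suit a particular dataset better.

```python
import math
import pandas as pd


def median_ci(values, confidence=0.95):
    """Distribution-free CI for the median, based on order statistics.

    Returns (x_(l), x_(n-l+1)), where l is the largest 1-based rank with
    P(Binomial(n, 1/2) <= l - 1) <= alpha / 2. The actual coverage is
    therefore at least `confidence`.
    """
    x = sorted(values)
    n = len(x)
    alpha = 1.0 - confidence
    total = 2 ** n  # number of equally likely outcomes of n fair coin flips
    cum = 0         # exact integer count of outcomes with at most k successes
    l = 0
    for k in range(n + 1):
        cum += math.comb(n, k)
        if cum / total > alpha / 2:
            break
        l = k + 1
    if l == 0:
        # Sample too small to reach the requested confidence level
        return (float("nan"), float("nan"))
    return (x[l - 1], x[n - l])


# Hypothetical transaction-level data: one row per home sale
transactions = pd.DataFrame({
    "market": ["Dallas"] * 10 + ["Austin"] * 10,
    "sale_price": list(range(210_000, 310_000, 10_000))
                  + list(range(260_000, 360_000, 10_000)),
})

# Apply the interval separately to each market
rows = []
for market, prices in transactions.groupby("market")["sale_price"]:
    lower, upper = median_ci(prices)
    rows.append({
        "market": market,
        "n": len(prices),
        "median": prices.median(),
        "ci_lower": lower,
        "ci_upper": upper,
    })
summary = pd.DataFrame(rows)
print(summary)
```

The interval narrows as n grows, so its width (or its width relative to the median) could serve as the sample-size-dependent confidence metric asked about. Because the bounds are always observed sale prices, the achieved coverage is usually somewhat above the nominal level.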

0 Answers