I'm looking at whether high school start times influence test scores. I have each school's start time and average test score for about 40 high schools, for every academic year from 2012 to 2019.
This is for a school project; I don't have much experience with statistics and am relying on Excel.
Is there a way to see whether changes in school start times over a period of time correlate with changes in test scores over that same period?
I know that if I were just pairing each start time with its corresponding test score, I could put the data in two columns (one for start times, one for test scores) and run a regression between the two columns, I think. But that leaves out time. Is there a way to include time in the analysis as well?
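For illustration, one common way to bring time in is to treat the year as a second predictor next to start time. The following is only a sketch in Python with NumPy, not Excel, and every number in it is made up so that the fit comes out exact; the variable names are hypothetical.

```python
import numpy as np

# Made-up illustrative data (NOT from the spreadsheet): one row per school-year.
start_time = np.array([0.50, 0.50, 0.75, 1.00, 1.00, 1.50])  # hours past 7:00 AM
year = np.array([2, 3, 4, 5, 6, 7])                          # years after 2010
# Scores are constructed from a known rule so the regression recovers it exactly.
score = 500 + 20 * start_time + 3 * year

# Design matrix: intercept column, start time, year.
X = np.column_stack([np.ones_like(start_time), start_time, year])

# Ordinary least squares fit of score on start time and year.
coef, *_ = np.linalg.lstsq(X, score, rcond=None)
intercept, b_start, b_year = coef

print(f"intercept={intercept:.2f}, start_time={b_start:.2f}, year={b_year:.2f}")
```

The coefficient on `start_time` is then read as the change in score per extra hour of start time, holding the year fixed. The same setup in Excel would be a regression with two x-columns.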
Updated w/ Link to Spreadsheet: https://epscloud-my.sharepoint.com/:x:/g/personal/198890_apps_everettsd_org/EUocdmeDXclOt9mkXw-DcdwB3FbIU9t4WyLKNH5Jk9MAyw?e=K3rxuU
Apologies for the messy spreadsheet. I have two tables, one for SAT scores and one for school start times, each sorted by high school. For instance, Ballard in 2012-2013 has an SAT score of 590 (cell B3) and a start time of 0.83 hours past 7:00 AM (cell B16). I don't know if I can sort it out per high school, so I just paired each SAT score with its corresponding start time in the tall table to the right and ran a regression with SAT scores as the y-input and both school start times and # years after 2010 as the x-inputs.
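The pairing described above (turning two wide tables into one tall table) can be sketched as follows. This is plain Python rather than Excel. Only the Ballard 2012-2013 values (590 and 0.83) come from the post; "School B" and all other numbers are invented, and taking "# years after 2010" from the first year of the label is an assumption.

```python
# Wide tables: one entry per school, one value per school year.
# Only Ballard 2012-2013 (590, 0.83) is from the post; the rest is made up.
sat_wide = {
    "Ballard": {"2012-2013": 590, "2013-2014": 585},
    "School B": {"2012-2013": 540, "2013-2014": 552},
}
start_wide = {
    "Ballard": {"2012-2013": 0.83, "2013-2014": 0.83},
    "School B": {"2012-2013": 0.50, "2013-2014": 1.00},
}


def wide_to_long(sat, start):
    """Pair each SAT score with the start time for the same school and year."""
    rows = []
    for school, scores in sat.items():
        for year_label, score in scores.items():
            rows.append({
                "school": school,
                # Assumption: count years from the first year in the label.
                "years_after_2010": int(year_label[:4]) - 2010,
                "start_time": start[school][year_label],
                "sat": score,
            })
    return rows


long_rows = wide_to_long(sat_wide, start_wide)
for row in long_rows:
    print(row)
```

Keeping the school name in every row of the tall table matters, because it is what later allows the analysis to distinguish one school from another.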
The spreadsheet above, and this question on Cross Validated, cover only part of a larger data set; I left the rest out to make the question easier to ask. However, here is the link to the larger data set: https://epscloud-my.sharepoint.com/:x:/g/personal/198890_apps_everettsd_org/Efuy5OT13rJLnFUZpPPQ5ucBn4CzjmKsxmnGUh1Fii-FWw?e=UMeGEL
The table beginning in A1 holds SAT ERW scores, the table beginning in A41 holds SAT Math scores, and the three tables running horizontally from A81 across to S81 convert the start times to # of hours past 7:00 AM.
Further to the right, [Table 1] is a correlation between school start times (SSTs) and ERW scores. [Table 2] is the same with SSTs and Math scores. [Table 3] was an attempt with # years after 2010, SSTs, and ERW scores. [Table 4] is the same with # years after 2010, SSTs, and Math scores. [Table 5], under Table 1, I believe uses the Table 1 data but runs a regression instead of a correlation.
In reality, there are a lot of other assumptions to check. For example, a 0.05 significance cutoff is not always optimal. This is a good read on some techniques to learn: https://stats.stackexchange.com/questions/3200/is-adjusting-p-values-in-a-multiple-regression-for-multiple-comparisons-a-good-i
But be sure to add an effect for each school in your model, at the very least.
– Estimate the estimators Apr 10 '22 at 21:33
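To illustrate what "an effect for each school" can look like, here is a minimal sketch using one 0/1 indicator column per school, which is one common way to do it and not necessarily the exact model the comment had in mind. It uses Python with NumPy, and the three schools and all numbers are made up: each school has its own baseline score, and within every school the score rises by 10 points per extra hour of start time.

```python
import numpy as np

# Made-up illustrative data (NOT from the spreadsheet): three schools, two years each.
school = np.array(["A", "A", "B", "B", "C", "C"])
start_time = np.array([0.5, 1.0, 1.0, 1.5, 0.0, 0.5])   # hours past 7:00 AM
score = np.array([600.0, 605.0, 510.0, 515.0, 550.0, 555.0])

# Pooled regression: a single intercept shared by every school.
X_pooled = np.column_stack([np.ones(len(score)), start_time])
pooled_coef, *_ = np.linalg.lstsq(X_pooled, score, rcond=None)
pooled_slope = pooled_coef[1]

# School effects: one 0/1 indicator column per school replaces the shared intercept.
schools = np.unique(school)
dummies = (school[:, None] == schools[None, :]).astype(float)
X_fe = np.column_stack([dummies, start_time])
fe_coef, *_ = np.linalg.lstsq(X_fe, score, rcond=None)
fe_slope = fe_coef[-1]  # last column is start_time

print(f"pooled slope: {pooled_slope:.2f}")
print(f"slope with school effects: {fe_slope:.2f}")
```

In this toy data the pooled slope comes out negative, because the schools with later start times happen to have lower baseline scores, while the slope with school effects recovers the within-school change of 10. In Excel the equivalent would be adding a 0/1 column for each school to the x-inputs, leaving one school out if the regression keeps its intercept.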