The CA Secretary of State has historical records on ballot measures starting in 1912, and we're attempting to calculate probabilities for the 2024 ballot measures based on the historical average. Over more than 100 years, ballot measures that qualified and were put on the ballot have succeeded 35% of the time. Since a ballot measure either succeeds or fails on a majority vote, a binomial distribution seems appropriate, though we are not certain that a variation wouldn't be a better fit, especially since the measures on a given ballot are likely not fully independent events. From other posts, we see that the quasi-binomial, negative binomial, Poisson, and beta-binomial distributions are other possibilities. How can we choose the best distribution for this situation? Do we need to fit several models with different parameters? If so, how would we compare them?
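To make the comparison question concrete, here is a rough sketch of what we imagine it could look like: fit each candidate to per-election counts by maximum likelihood and compare AIC. It covers only the binomial and beta-binomial, and it runs on simulated counts, not the Secretary of State records. The simulation settings, the crude grid search, and all function and variable names are our own assumptions for illustration.

```python
import math
import random

def log_beta(x, y):
    """Log of the Beta function, via log-gamma."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def log_binom_pmf(k, n, p):
    """Log-probability of k successes in n trials under Binomial(n, p)."""
    return math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log(1 - p)

def log_betabinom_pmf(k, n, a, b):
    """Log-probability of k successes in n trials under BetaBinomial(n, a, b)."""
    return math.log(math.comb(n, k)) + log_beta(k + a, n - k + b) - log_beta(a, b)

# SIMULATED data (an assumption, not real records): 50 elections, each with
# its own success rate drawn from a Beta distribution with mean 0.35.
rng = random.Random(0)
elections = []  # list of (measures passed, measures on the ballot)
for _ in range(50):
    n = rng.randint(5, 20)
    p_i = rng.betavariate(3.5, 6.5)
    k = sum(rng.random() < p_i for _ in range(n))
    elections.append((k, n))

# Binomial: one parameter, and the MLE is the pooled success rate.
p_hat = sum(k for k, _ in elections) / sum(n for _, n in elections)
ll_binom = sum(log_binom_pmf(k, n, p_hat) for k, n in elections)
aic_binom = 2 * 1 - 2 * ll_binom

# Beta-binomial: two parameters, mean mu and within-election correlation rho,
# with a + b = (1 - rho) / rho. A coarse grid search stands in for an optimizer.
best = None
for i in range(1, 100):
    mu = i / 100
    for j in range(1, 51):
        rho = j / 100
        a = mu * (1 - rho) / rho
        b = (1 - mu) * (1 - rho) / rho
        ll = sum(log_betabinom_pmf(k, n, a, b) for k, n in elections)
        if best is None or ll > best[0]:
            best = (ll, mu, rho)
ll_bb, mu_hat, rho_hat = best
aic_bb = 2 * 2 - 2 * ll_bb

print(f"binomial:      p_hat={p_hat:.3f}  AIC={aic_binom:.1f}")
print(f"beta-binomial: mu={mu_hat:.2f} rho={rho_hat:.2f}  AIC={aic_bb:.1f}")
```

If this is a sensible approach, the model with the lower AIC would be preferred, and a fitted rho well above zero would suggest that measures on the same ballot are not independent.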
In addition, each election is distinct, so there could be lurking variables that would be better handled by some kind of linear model: for example, the amount of money raised, whether it is a midterm election, and the party affiliation of those backing the ballot measure. Rather than controlling for too many of these, which are difficult to collect, perhaps it would be enough to use an average probability of success from a subgroup of more recent, similar elections? For example, we could use only the last 30-40 years and separate presidential elections from midterm elections.
If we set these concerns aside and apply the 35% probability of success to the 7 ballot initiatives currently set to appear on the 2024 ballot, the binomial probability that exactly 2 succeed is about 30%, and the cumulative probability that 2 or fewer succeed is about 53%.
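For reference, the two binomial figures above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; it assumes independent measures with a common p = 0.35 and n = 7, and the function name `binom_pmf` is our own.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 7, 0.35  # 7 measures on the ballot, historical success rate of 35%

p_exactly_2 = binom_pmf(2, n, p)
p_at_most_2 = sum(binom_pmf(k, n, p) for k in range(3))  # k = 0, 1, 2

print(f"P(X = 2)  = {p_exactly_2:.4f}")  # 0.2985
print(f"P(X <= 2) = {p_at_most_2:.4f}")  # 0.5323
```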
Any thoughts and guidance would be helpful and appreciated!