I'm currently working on an event study to examine abnormal returns.
In the first step, I calculated abnormal returns around a certain type of company event; the sample consists of roughly 13,000 events across more than 4,000 firms.
In the second step, I intend to run a regression analysis with several (control?) variables to see whether some of the effect stems from certain aspects of the event.
So far so good. My problem is that I want to control for 5-6 factors such as market capitalization and total enterprise value, but I don't have every data point for each of the 13,000 events. For example, Event 1 is missing market capitalization, Event 2 the total enterprise value, Event 3 the M/B ratio, and so on.
Question: Can I still run a meaningful regression even though I have a significant number of NAs, or am I required to delete every event with incomplete data? Given the poor data availability for some variables (which I would still like to include), deleting them would remove a very large number of events (>7,000).
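To make the problem concrete, here is a minimal sketch of how I am counting the loss from deleting incomplete events. It runs on synthetic data, not my actual sample: the column names, the 15% missing rate per variable, and the assumption that values are missing independently across variables are all made up for illustration. The point is only that moderate missingness in each of several controls compounds into a large share of dropped rows.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_events = 13_000

# Hypothetical control variables; names are illustrative, not my real columns.
controls = ["market_cap", "enterprise_value", "mb_ratio", "leverage", "roa"]
df = pd.DataFrame(rng.normal(size=(n_events, len(controls))), columns=controls)
df["abnormal_return"] = rng.normal(scale=0.02, size=n_events)

# Assumption: each control is missing independently with probability 0.15.
for col in controls:
    mask = rng.random(n_events) < 0.15
    df.loc[mask, col] = np.nan

# Share of missing values per control variable.
missing_share = df[controls].isna().mean()

# Listwise deletion: keep only events with every control observed.
complete_cases = df.dropna(subset=controls)
n_complete = len(complete_cases)
n_dropped = n_events - n_complete

print(missing_share.round(3))
print(f"complete cases: {n_complete}, dropped: {n_dropped}")
```

Running the same two steps (`isna().mean()` per column, then `dropna` on the control columns) on the real data is how I arrive at the number of events that would be lost.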