I have a problem with merging three different datasets and then running regressions on them. I am using CRSP stock data. Two of the datasets have equal length, with 129 monthly entries each. The first problem is that my third dataset does not have the same length, because it is organized by individual stock ids ("PERMNO") rather than by "DATE" like the other two. The PERMNOs also cover different date ranges: for example, the first PERMNO runs from 20110131 to somewhere around 2018, while others might cover 2013-2016 or 2017-2021. There are around 6300 different PERMNOs, with dates ranging from 2011 to 2021.

My end goal is to run a separate regression for each PERMNO, using that PERMNO's monthly "RET" as the dependent variable and the monthly values of Mkt.RF, SMB, HML, RMW, CMA and MOM as the independent explanatory variables. So I would have to match each PERMNO's dates with the corresponding dates of the independent variables. A further difficulty is that the DATE variable in the PERMNO dataset is defined in more detail than in the other two files, so I'm not sure whether they can be matched directly. I believe I also need some kind of loop to run the 6300 different regressions.
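To make the goal concrete, here is a minimal sketch in base R of the workflow I have in mind, run on simulated toy data rather than my real files. The data frame names (`stocks`, `ff5`, `mom`) are hypothetical, and I am assuming that the PERMNO file stores DATE as YYYYMMDD while the two factor files store it as YYYYMM; the real layout may differ.

```r
# Toy stand-ins for the real files; every value here is simulated.
set.seed(1)
months <- 201101:201112                      # assumed factor-file format: YYYYMM
days   <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
month_end <- months * 100 + days             # assumed PERMNO-file format: YYYYMMDD
n <- length(months)

# Two factor files, one row per month (hypothetical names ff5 and mom)
ff5 <- data.frame(DATE = months,
                  Mkt.RF = rnorm(n), SMB = rnorm(n), HML = rnorm(n),
                  RMW = rnorm(n), CMA = rnorm(n))
mom <- data.frame(DATE = months, MOM = rnorm(n))

# Stock file, one row per PERMNO and month; PERMNOs cover different date ranges
stocks <- rbind(
  data.frame(PERMNO = 10001, DATE = month_end,       RET = rnorm(12)),
  data.frame(PERMNO = 10002, DATE = month_end[3:12], RET = rnorm(10))
)

# Reduce the detailed date to a year-month key so it matches the factor files
stocks$YM <- stocks$DATE %/% 100

# Combine the two factor files, then attach the factors to every stock-month
factors <- merge(ff5, mom, by = "DATE")
panel   <- merge(stocks, factors, by.x = "YM", by.y = "DATE")

# One regression per PERMNO: split the panel by id and fit lm() on each piece
fits <- lapply(split(panel, panel$PERMNO), function(d) {
  lm(RET ~ Mkt.RF + SMB + HML + RMW + CMA + MOM, data = d)
})

# Collect the coefficients into a matrix with one row per PERMNO
coefs <- t(sapply(fits, coef))
print(coefs)
```

With the real data, PERMNOs with fewer observations than coefficients would presumably need to be filtered out first, since `lm` cannot estimate all seven coefficients for them.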

I'm new to coding in R.

[Three screenshots of the datasets were attached here.]

ouflak
hmmp
    Welcome to Stack Overflow. Please don’t use images of data as they cannot be used without a lot of unnecessary effort. [For multiple reasons](//meta.stackoverflow.com/q/285551). You’re more likely to get a positive response if your question is reproducible. [See Stack Overflow question guidance](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – Peter Jan 20 '22 at 15:53

0 Answers