
My dependent variable is the share of a firm's investments in a given year that went to a certain industry. For example, if a firm made 4 investments and 2 of them were in the technology sector, the ratio is 0.5. The more investments a firm makes, the more finely grained the ratio can be.

Since the variable can take the values 0 and 1 as well as any value in between, I am not sure which model would be best for estimating the effect of my independent variable on the share of investments in a certain industry.

Standard logistic regression models the probability of a binary outcome (coded 0 or 1). Applying it directly to proportions, which also take values strictly between 0 and 1, might not be ideal, since logistic regression assumes a Bernoulli distribution for the response variable.

I am considering two options: (1) log-transforming the dependent variable and using linear regression, which would make interpretation harder, or (2) using a fractional logistic regression, which can handle a dependent variable that is a fraction or proportion rather than a strictly binary outcome.
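
For reference, a fractional logistic regression is usually written as a model for the conditional mean of the proportion, estimated by maximizing a Bernoulli quasi-log-likelihood. A sketch of that formulation, where $y_i$ is the observed share, $x_i$ the independent variable, and $\beta_0, \beta_1$ the coefficients (these symbols are introduced here for illustration only):

```latex
\begin{aligned}
E[y_i \mid x_i] &= \mu_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_i)}}, \\
\ell(\beta_0, \beta_1) &= \sum_i \left[ y_i \log \mu_i + (1 - y_i) \log(1 - \mu_i) \right].
\end{aligned}
```

Since $\mu_i$ lies strictly between 0 and 1, the objective stays finite when $y_i$ equals 0 or 1, so the boundary values need no transformation.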

Any suggestions would be highly appreciated. I am not very well versed in statistical analysis, so high-level answers would be preferred.

lazer
  • Why would a logistic regression, with the outcome then being "probability of investment being in the technology sector", not be appropriate? – PBulls Mar 29 '24 at 16:40
  • Please read the question and answers on the page linked as a duplicate of this one. In R you can perform logistic regression on a two-column matrix of counts of "successes" (investments in a specified industry) and "failures" (investments in all other industries), if there is just one industry in which you are interested. If that doesn't answer your question, please post a new question specific to the issues that remain unresolved. – EdM Mar 29 '24 at 18:23
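
The comment above refers to R; as a rough Python analogue, the grouped-count logistic regression it describes can be sketched with the standard library alone. This is a minimal Newton-Raphson fit for one predictor, not a reproduction of R's `glm`. The function name `fit_binomial_logit` and the data (`x`, `tech`, `other`) are hypothetical and chosen purely for illustration.

```python
import math

def fit_binomial_logit(x, successes, failures, iterations=25):
    """Fit logit(p) = b0 + b1 * x by Newton-Raphson on grouped counts."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, k, f in zip(x, successes, failures):
            n = k + f                      # total investments in this firm-year
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            r = k - n * p                  # observed minus expected successes
            w = n * p * (1.0 - p)          # binomial variance weight
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        # Solve the 2x2 Newton system using the explicit matrix inverse.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: one row per firm-year.
x = [0.0, 1.0, 2.0, 3.0]      # independent variable (made up)
tech = [1, 2, 3, 4]           # investments in the technology sector
other = [4, 3, 2, 1]          # investments in all other sectors

b0, b1 = fit_binomial_logit(x, tech, other)
print(f"intercept={b0:.3f}, slope={b1:.3f}")
```

Each firm-year contributes in proportion to its number of investments, so a ratio of 0.5 from 2 of 4 investments carries less weight than 0.5 from 20 of 40. Shares of exactly 0 or 1 are handled without any transformation.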
