3

Why do people use log(x+1) or log(y+1) as their independent/dependent variable? In these cases, is it that different from using just log(x) or log(y)? What is the benefit of using log(x+1) or log(y+1)? Sometimes when I read papers, the authors specify that a variable in their model is the log of that variable plus 1. I have been reading through posts on why this is done, but I am still having a hard time figuring it out. Can someone please explain?

  • 2
    In this comment I would like to formulate Dave's answer a bit more exaggerated: 'people use log(1+x) when they actually would like to use log(x) and they do not know how to properly deal with negative values of x' – Sextus Empiricus Oct 31 '22 at 08:27

1 Answer

3

The answer is that people do it when they have zero values in their data. When that happens, you cannot log your data, since $\log(0)$ is undefined. However, if you first add some small value, such as $1$, then every shifted value is positive and you can take its logarithm; a zero, for instance, becomes $\log(0+1)=0$.
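As a minimal sketch of the point about zeros, the snippet below compares the plain logarithm with the shifted one. The `counts` list and the `safe_log` helper are made up for illustration; only `math.log` and `math.log1p` come from the Python standard library.

```python
import math

# Hypothetical data containing a zero (illustration only).
counts = [0, 1, 9, 99]

def safe_log(x):
    """Return log(x), or None where the logarithm is undefined."""
    try:
        return math.log(x)
    except ValueError:  # math.log(0) raises "math domain error"
        return None

# Plain log fails on the zero entry.
plain = [safe_log(c) for c in counts]

# log1p(x) computes log(1 + x), which is defined at x = 0.
shifted = [math.log1p(c) for c in counts]

print(plain)
print(shifted)
```

The zero entry has no plain logarithm, while `math.log1p(0)` is exactly `0.0`. Note that the shift also changes every other value: `log1p(9)` equals `log(10)`, not `log(9)`.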

This practice, however, is of questionable statistical validity.

Dave
  • 62,186
  • $\log(x+1)$ maps positive values to positive values. This is desirable in certain domains – gota Oct 31 '22 at 11:27