I'm having trouble pinning down the exact formula for cross-entropy loss. Depending on the source, I see it written in different ways.

  • Is the log() function $\log_2()$?
  • Is the argument in the log: $q$ or $1/q$?

I am fairly certain it is $\log_2$ and $1/q$, but the variants worry me.

Why am I seeing it in different ways? Could someone not only confirm my understanding but also explain why there are so many variants out there?

  • The base of the log is not important in many applications. For example, in model comparison, changing the base of the log will not change the ordering of the cross-entropies or the K-L divergences. $\log_2$ is used where expressing the information in bits is convenient, while $\ln$ is often used because it is easier to deal with mathematically.

    With regard to the second question, $\log(p) = -\log(1/p)$. So, I would check for a minus sign in the variants.

    – baruuum Dec 07 '18 at 05:23
  • Perfect. Feel free to make this an answer and I’ll accept it. – Robert Lugg Dec 07 '18 at 06:38

1 Answer


The base of the log is not important in many applications. For example, in model comparison, changing the base of the log will not change the ordering of the cross-entropies or the K-L divergences. $\log_2$ is used where expressing the information in bits is convenient, while $\ln$ is often used because it is easier to deal with mathematically.

With regard to the second question, $\log(p) = -\log(1/p)$. So, I would check for a minus sign in the variants.
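Concretely, the usual definition is $H(p, q) = -\sum_x p(x)\log q(x) = \sum_x p(x)\log\frac{1}{q(x)}$, in whichever base is convenient. Here is a minimal Python sketch (the toy distributions and function names are my own, not from the post) showing that the $q$ and $1/q$ forms give the same number, and that switching between $\ln$ and $\log_2$ only rescales the result by a constant, so it cannot change which of two models has the lower cross-entropy:

```python
import math

def cross_entropy(p, q, base=math.e):
    """Cross-entropy H(p, q) = -sum_i p_i * log(q_i), in the given log base."""
    return -sum(pi * math.log(qi, base) for pi, qi in zip(p, q) if pi > 0)

def cross_entropy_reciprocal(p, q, base=math.e):
    """Equivalent form H(p, q) = sum_i p_i * log(1 / q_i), with no leading minus sign."""
    return sum(pi * math.log(1.0 / qi, base) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.25, 0.25]   # "true" distribution (illustrative values)
q = [0.4, 0.4, 0.2]     # model distribution (illustrative values)

# The two forms agree (up to floating-point error) in any base.
print(cross_entropy(p, q), cross_entropy_reciprocal(p, q))          # nats
print(cross_entropy(p, q, 2), cross_entropy_reciprocal(p, q, 2))    # bits

# Changing the base only rescales by a constant factor (here ln 2),
# so it cannot reorder two models by cross-entropy.
print(cross_entropy(p, q) / math.log(2), cross_entropy(p, q, 2))
```

So a variant written with $1/q$ and no minus sign, or with $q$ and a leading minus sign, or in base 2 versus base $e$, is the same quantity up to a constant factor.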

baruuum