When reading papers on machine learning, I have noticed that authors often reference the "Shannon entropy". Curiously, the equation given is frequently:
$$H(p) = -\sum\limits_{i = 1}^n p_i \ln(p_i)$$
For instance, see:
https://arxiv.org/pdf/1502.00326.pdf
https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2014-121.pdf
There are many more examples.
The problem is that, for anyone who has taken a course on information theory, the logarithm in the definition of Shannon entropy is base $2$, not base $e$. So these papers are really using something more like the Gibbs entropy rather than the Shannon entropy.
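To make the difference concrete (this is just a quick NumPy sketch of my own, not code from either paper): since $\log_2 x = \ln x / \ln 2$, the two definitions differ only by the constant factor $\ln 2$, i.e. they measure the same quantity in nats versus bits.

```python
import numpy as np

def entropy(p, base=np.e):
    """Entropy of a discrete distribution p, using the given logarithm base."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]  # terms with p_i = 0 contribute nothing, by convention
    return -np.sum(p * np.log(p)) / np.log(base)

p = [0.5, 0.25, 0.25]
h_nats = entropy(p)          # natural log, measured in nats
h_bits = entropy(p, base=2)  # base 2, textbook Shannon entropy, in bits

print(h_nats, h_bits, h_nats / h_bits)  # ratio is always ln(2) ≈ 0.693
```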
The definition in this paper, by contrast, looks correct to me: http://www.fizyka.umk.pl/publications/kmk/08-Entropie.pdf
Has anyone else noticed this phenomenon? Would there be a problem if one used Gibbs entropy in place of Shannon's entropy?