There is a difference between entropy and the number of microstates when the outcomes of a random process are not equally probable. For a single coin flip there are only two microstates regardless of the coin's bias: the coin can come up heads or tails. The entropy, however, differs between a fair coin and a biased coin. For a fair coin the entropy can be calculated the usual way,
$$H(X) = -\sum_{i \in \{h,t\}} p_i \log_2 p_i = -(0.5 \log_2 0.5 + 0.5 \log_2 0.5) = 1\;\mathrm{bit}$$
or, because each microstate is equally probable, $H(X) = \log_2 2 = 1\;\mathrm{bit}$.
For a biased coin where heads has probability $p_h = 0.3$, the entropy is
$$H(X) = - ( 0.3 \log_2 0.3 + 0.7 \log_2 0.7) = 0.88\;\mathrm{bits}$$
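To sanity-check these numbers, here is a minimal sketch in Python (the `entropy` helper is just an illustrative name, not from any particular library):

```python
import math

def entropy(probs):
    """Shannon entropy in bits: H(X) = -sum_i p_i * log2(p_i)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.5, 0.5]))  # fair coin   -> 1.0 bit
print(entropy([0.3, 0.7]))  # biased coin -> ~0.881 bits
```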
The entropy of the biased coin is lower because we are less uncertain about the outcome of a flip (our intuition tells us that tails is more likely to occur). Another simple example: flip two coins. There are four possible microstates, $X = \{hh, ht, th, tt\}$.
For fair coins, where each microstate is equally probable, the entropy is $H(X) = \log_2 4 = 2\;\mathrm{bits}$,
and for two biased coins with $p_h = 0.3$ each, the entropy is $H(X) = 2 \times 0.88 = 1.76\;\mathrm{bits}$ (the flips are independent, so their entropies add).
Again the entropy of the biased coins is less than in the equiprobable case because we know the coins are weighted towards tails.
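The two-coin numbers can be checked the same way. Since the flips are independent, the probability of each microstate is the product of the per-coin probabilities (a sketch, again with illustrative names):

```python
import math
from itertools import product

def entropy(probs):
    """Shannon entropy in bits: H(X) = -sum_i p_i * log2(p_i)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def two_coin_entropy(p_heads):
    # Joint probability of each microstate (hh, ht, th, tt) for two independent flips.
    joint = [p1 * p2 for p1, p2 in product([p_heads, 1 - p_heads], repeat=2)]
    return entropy(joint)

print(two_coin_entropy(0.5))  # fair coins   -> 2.0 bits
print(two_coin_entropy(0.3))  # biased coins -> ~1.76 bits
```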
Entropy can be tricky to understand because it is used in chemistry, statistical mechanics, and information theory. In my opinion the clearest explanation is "Where Do We Stand on Maximum Entropy?" [pp. 12-27] by E. T. Jaynes.