I am reading the 1992 paper of Bikhchandani, Hirshleifer, and Welch on information cascades. They claim and prove that, in an environment of sequential decision making, an information cascade eventually forms (the probability that no cascade has started goes to zero as the number of individuals grows). The proof is in the appendix of their paper and involves some statistical arguments that are not clear to me. A simpler argument in the body of the paper considers the individuals in consecutive pairs: for $n$ even,
$$P(\text{no cascade after } n \text{ individuals}) = P(\text{no cascade after the first pair}) \cdot P(\text{no cascade after the next pair}) \cdots P(\text{no cascade after the last pair}) = (p-p^2)^{n/2}.$$
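For what it is worth, here is a quick Monte Carlo sketch I wrote to check the closed form numerically. It encodes my own reading of the model (equal priors on the two states, signal accuracy $p$, everyone observes all previous actions, and BHW's fair-coin tie-break when someone is indifferent), so the bookkeeping with a running adopt-minus-reject count is mine, not the paper's construction:

```python
import random

def no_cascade_after(n, p, rng):
    """One simulated run of (my reading of) the BHW sequential model.

    Assumptions baked in here, not taken from the paper's appendix:
    equal prior on the two states, signal accuracy p > 1/2, each person
    sees all previous actions, and an indifferent person flips a fair
    coin (BHW's tie-breaking rule). Returns True if no cascade has
    started after the first n individuals.
    """
    state_high = rng.random() < 0.5
    diff = 0  # number of 'adopt' actions minus 'reject' actions so far
    for _ in range(n):
        if abs(diff) >= 2:
            return False  # a cascade already started earlier in this run
        # private signal: H with prob p in the High state, 1-p in the Low state
        signal = 1 if rng.random() < (p if state_high else 1 - p) else -1
        if diff == 0 or signal == (1 if diff > 0 else -1):
            action = signal  # history is neutral or agrees with the signal
        else:
            action = 1 if rng.random() < 0.5 else -1  # indifferent: coin flip
        diff += action
    return abs(diff) < 2  # still no cascade after n people


rng = random.Random(0)
p, n, trials = 0.75, 6, 200_000
freq = sum(no_cascade_after(n, p, rng) for _ in range(trials)) / trials
print(f"simulated: {freq:.4f}   formula (p - p^2)^(n/2): {(p - p*p) ** (n // 2):.4f}")
```

With these assumptions the simulated frequency comes out very close to $(p-p^2)^{n/2}$, which at least reassures me that I am reading the formula itself correctly.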
This pairing argument is clearer to me, but it seems to rest on the prior observation that two identical announcements in a row are what start a cascade. Also, the probability seems to be calculated from a perspective outside the experiment, e.g. that of a moderator who knows the true state and therefore knows that a high signal H occurs with probability $p$ and a low signal L with probability $1-p$, rather than from the perspective of the participants, who have to take both possible states of the world into account. That, at least, is how I read the term above.
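To make that concrete, this is the computation I have in mind for the first pair, written from the moderator's viewpoint and conditioning on the state being High (so a signal is H with probability $p$), with BHW's fair-coin tie-break for the second person:
$$
P(\text{no cascade after the first pair} \mid \text{High})
= \underbrace{p\,(1-p)\,\tfrac{1}{2}}_{\text{adopt, then reject on a coin flip}}
+ \underbrace{(1-p)\,p\,\tfrac{1}{2}}_{\text{reject, then adopt on a coin flip}}
= p - p^2 .
$$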
Why is there no loss of generality in arriving at $(p-p^2)^{n/2}$ without considering both states of the world?
I have also been reading Banerjee's 1992 paper on herding. I would be very grateful for any sources that explain these papers and any notes on information cascades.
Update: I figured out how to interpret the formula above by reading Zhukov's slides and the tree diagram he reproduces from Hirshleifer: http://leonidzhukov.net/hse/2014/socialnetworks/lectures2/lecture6.pdf. My difficulty was with reading that tree diagram, and it became clear once I went through the slides alongside what Bikhchandani et al. wrote. Thanks also to the commenter below for the answer.