I have trouble deriving the EM algorithm for the probabilistic latent semantic analysis (pLSA) model via Lagrange multipliers.
I model the missing data as indicators $Q_{zij} \in \{0,1\}$ for word $w_j$ in document $d_i$, which gives rise to a variational distribution over $z$: $q_{zij} = P(Q_{zij} = 1)$, with $\sum_z q_{zij} = 1$ and $q_{zij} \geq 0$. I then derive a lower bound via Jensen's inequality and arrive at the optimisation of this bound over $q$ for fixed $u_{zi}, v_{zj}$ via a Lagrange multiplier:
$\cal{L}(q, \lambda) = \sum_{z=1}^K q_{zij}[\log u_{zi} + \log v_{zj} - \log q_{zij}] + \lambda(\sum_{z=1}^K q_{zij} - 1)$
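To spell out where the first term of this Lagrangian comes from (assuming $u_{zi}$ and $v_{zj}$ denote the two pLSA factors for topic $z$, so that $\sum_z u_{zi} v_{zj}$ is the model probability of the pair $(i,j)$ up to terms not involving $q$), the Jensen step per word-document pair is

$\log \sum_{z=1}^K u_{zi} v_{zj} = \log \sum_{z=1}^K q_{zij} \frac{u_{zi} v_{zj}}{q_{zij}} \geq \sum_{z=1}^K q_{zij}\left[\log u_{zi} + \log v_{zj} - \log q_{zij}\right]$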
Applying the first-order optimality condition, i.e. taking the partial derivative with respect to $q_{zij}$ and setting it to zero, I get:
$\lambda + (\log u_{zi} + \log v_{zj} - \log q_{zij} -1) = 0$
This now leaves me with $K + 1$ equations for $K+1$ unknowns, which are $\lambda$ and the $K$ $q_{zij}$ values. However, I don't know how to actually solve this. I know that the solution should be
$q_{zij} = \frac{u_{zi}v_{zj}}{\sum_{p=1}^K u_{pi}v_{pj}}$, which is just the posterior of $Q_{zij}$ if I expand $u$ and $v$ into their respective distributions.
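A quick numerical sanity check (a sketch with made-up values; `u` and `v` below are just random positive numbers standing in for $u_{zi}$ and $v_{zj}$, not estimates from any real corpus) confirms that this formula is at least consistent with the stationarity condition above, in the sense that $\log u_{zi} + \log v_{zj} - \log q_{zij}$ is the same for every $z$, so a single common $\lambda$ exists:

```python
import numpy as np

# Hypothetical values for one (document i, word j) pair, topics z = 1..K.
rng = np.random.default_rng(0)
K = 5
u = rng.random(K)   # stands in for u_{zi}
v = rng.random(K)   # stands in for v_{zj}

# Claimed E-step solution: responsibilities normalised over z.
q = u * v / np.sum(u * v)

# Stationarity condition: lambda + log u_z + log v_z - log q_z - 1 = 0 for all z,
# with one common lambda. Equivalently, log u_z + log v_z - log q_z must be
# constant in z.
residual = np.log(u) + np.log(v) - np.log(q)
print(residual)                              # all entries equal
print(np.allclose(residual, residual[0]))    # True
print("implied lambda:", 1.0 - residual[0])  # lambda = 1 - log(sum_p u_p v_p)
```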
How do I solve this to properly derive the E step?