I am reading the paper "Convergence of a stochastic approximation version of the EM algorithm" in order to implement the algorithm for a probability model I already have. On p. 3, the paper summarises the algorithm as follows.

I am stuck at the E (or S) step of this algorithm. In a typical EM setup, one maximizes over $\theta$ the integral $$Q(\theta \mid \theta') = \int \log p(x, y \mid \theta)\, p(x \mid y, \theta')\, dx,$$ where $\theta'$ is the current parameter estimate. Here, this integral is estimated from samples simulated from the posterior. I understand this. What I do not understand is whether the authors propose to 'update' the cost function itself. If so, how can we maximize the new $Q$? Maybe I am missing something very obvious, but I cannot see how to implement this algorithm, because I do not understand what updating the 'cost' means in practice.
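To make the question concrete, here is a minimal sketch of what 'updating $Q$' could look like in practice. It assumes a toy Gaussian model that is not from the paper (latent $x_i \sim N(\theta, 1)$, observed $y_i \mid x_i \sim N(x_i, 1)$); the function name `saem_toy` and the step size $\gamma_k = k^{-0.6}$ are illustrative choices of mine. In this toy model the complete-data log-likelihood depends on the simulated samples only through their mean, so keeping a running average of $Q$ reduces to keeping a running average of that single statistic, and the maximization has a closed form.

```python
import numpy as np

def saem_toy(y, n_iter=500, seed=0):
    """SAEM-style sketch for a toy model (an assumption, not from the paper):
    latent x_i ~ N(theta, 1), observed y_i | x_i ~ N(x_i, 1)."""
    rng = np.random.default_rng(seed)
    theta = 0.0  # initial parameter guess
    s = 0.0      # running estimate of the sufficient statistic mean(x)
    for k in range(1, n_iter + 1):
        # S-step: sample x from the exact posterior
        # p(x_i | y_i, theta) = N((y_i + theta) / 2, 1/2)
        x = rng.normal((y + theta) / 2.0, np.sqrt(0.5))
        # Stochastic approximation of the statistic:
        # s_k = s_{k-1} + gamma_k * (S(x) - s_{k-1})
        gamma = k ** -0.6  # illustrative decreasing step size
        s += gamma * (x.mean() - s)
        # M-step: up to constants, Q_k(theta) = n * (theta * s_k - theta^2 / 2),
        # which is maximised at theta = s_k
        theta = s
    return theta

if __name__ == "__main__":
    y = np.random.default_rng(1).normal(3.0, np.sqrt(2.0), size=200)
    print(saem_toy(y), y.mean())  # the two values should be close
```

For this toy model the maximum-likelihood estimate is the sample mean of $y$, so the output can be checked directly. For a model outside the exponential family, the same running-average recursion would presumably have to be applied to $Q$ as a function of $\theta$ rather than to a finite set of statistics.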
Thanks in advance!