Write $f_\theta(x)=\log p_\text{model}(x;\theta)$ to keep the formulas short; the argument below holds for any function $f_\theta$.
When you divide the expression $\displaystyle\sum_{i=1}^m f_\theta(x^{(i)})$ by $m$, you get the empirical mean of $f_\theta(x)$ over the dataset: $\frac{1}{m}\displaystyle\sum_{i=1}^m f_\theta(x^{(i)})$.
By definition the empirical distribution is $\hat p(x)=\frac{\#x}{m}$, where $\#x$ is the number of times $x$ appears in the dataset. For example, if the dataset is $\{2,3,2,5,3,2\}$ with $m=6$, then $\hat p(2)=3/6$, $\hat p(3)=2/6$ and $\hat p(5)=1/6$. Now the only thing left to understand is that the empirical mean of a function is the same as its expectation under the empirical distribution. To see this, just write:
$$\frac{1}{m}\sum_{i=1}^m f_\theta(x^{(i)})=\frac{1}{m}\sum_x\;\sum_{i:\,x^{(i)}=x}f_\theta(x)=\frac{1}{m}\sum_x \#x\, f_\theta(x)=\sum_x \hat p(x)\,f_\theta(x)$$
Note: the double summation simply groups the terms by the value of $x$. For intuition, imagine you have to sum many terms, each equal to 2, 3 or 5. You can first add up all the 2s, then all the 3s, then all the 5s, and finally add the three subtotals. That is exactly what the formula does.
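To make the grouping concrete, here is a minimal numerical sketch in NumPy (the dataset and the choice of $f$ are arbitrary, picked only for illustration): it computes the empirical mean once term by term, and once by grouping the distinct values weighted by $\hat p(x)$, then checks that the two agree.

```python
import numpy as np

# Dataset and function are arbitrary choices for illustration only;
# f stands in for f_theta = log p_model(x; theta) at some fixed theta.
data = np.array([2, 3, 2, 5, 3, 2])   # x^(1), ..., x^(m)
m = len(data)
f = np.log                            # any function of x would do

# Left-hand side: empirical mean, summing one term per sample.
lhs = f(data).sum() / m

# Right-hand side: group distinct values, weight by p_hat(x) = #x / m.
values, counts = np.unique(data, return_counts=True)
p_hat = counts / m
rhs = (p_hat * f(values)).sum()

assert np.isclose(lhs, rhs)
print(lhs, rhs)  # the same number, computed two ways
```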
The expression on the right is, by definition, the expected value of $f_\theta(x)$ under the empirical distribution: $E_{x\sim\hat p}\,f_\theta(x)$. So finally, the objective in the second $\arg\max$ is just the objective in the first $\arg\max$ divided by $m$, and dividing by the positive constant $m$ (which does not depend on $\theta$) does not change where the maximum is attained.
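Putting the two observations together in a single chain of equalities:

$$\arg\max_\theta\,\sum_{i=1}^m f_\theta(x^{(i)})=\arg\max_\theta\,\frac{1}{m}\sum_{i=1}^m f_\theta(x^{(i)})=\arg\max_\theta\, E_{x\sim\hat p}\,f_\theta(x)$$

where the first equality holds because scaling the objective by $1/m>0$ leaves the maximizer unchanged, and the second is the grouping identity above.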