If you look at a zero-mean MA($q$) process:
$X_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \cdots + \theta_q \varepsilon_{t-q}$
then you could regard the right-hand side as akin to a weighted moving average of the $\varepsilon$ terms, but one where the weights don't sum to 1.
For example, Hyndman and Athanasopoulos (2013) [1] say:
"Notice that each value of $y_t$ can be thought of as a weighted moving average of the past few forecast errors."
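To make the "weighted moving average of the errors" reading concrete, here is a minimal sketch in Python with NumPy. The function name `simulate_ma`, the coefficients `[0.6, 0.3]`, and the choice of standard normal errors are assumptions made for illustration, not anything taken from the sources quoted here. It builds an MA($q$) series by sliding the weight vector $(1, \theta_1, \ldots, \theta_q)$ along a white-noise sequence.

```python
import numpy as np

def simulate_ma(theta, n, seed=0):
    """Simulate n values of a zero-mean MA(q) process (illustrative helper)."""
    rng = np.random.default_rng(seed)
    q = len(theta)
    # White-noise errors; q extra values so every X_t has a full window of lags.
    eps = rng.standard_normal(n + q)
    # Weight 1 on eps_t and theta_j on eps_{t-j}.
    weights = np.concatenate(([1.0], np.asarray(theta, dtype=float)))
    # "valid" convolution: x[k] = eps[k+q] + theta_1*eps[k+q-1] + ... + theta_q*eps[k]
    x = np.convolve(eps, weights, mode="valid")
    return x, eps, weights

theta = [0.6, 0.3]  # assumed MA(2) coefficients, chosen arbitrarily
x, eps, weights = simulate_ma(theta, n=500)

# Each X_t is a sliding weighted sum of the errors, but the weights
# add up to 1 + 0.6 + 0.3 rather than to 1, so it is not an average proper.
print(len(x), weights.sum())
```

A smoothing moving average would divide by the sum of the weights; the MA model does not, which is why "weighted moving sum" might be the more literal description.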
Similar explanations of the term may be found in numerous other places. (Despite the popularity of this explanation, I don't know for certain that it is the origin of the term; for example, there may originally have been some connection between the model and moving-average smoothing.)
Note that Graeme Walsh points out in comments above that the term may have originated with Slutsky (1927), "The Summation of Random Causes as a Source of Cyclical Processes".
[1] Hyndman, R.J. and Athanasopoulos, G. (2013) Forecasting: principles and practice. Section 8/4. http://otexts.com/fpp/8/4. Accessed on 22 Sept 2013.
I have to say, I have had moments of confusion when switching from reading "moving average" in the time-series analysis literature to "moving average" in the technical analysis literature!
It'd be nice to know who made the first reference to the term. Track that information down and you might get the "why" answer that you're looking for.
– Graeme Walsh May 06 '13 at 06:46