Further research has thrown up this webpage at purdue.edu which links to source code for various variants of PLS. On the latter page, the PLS1 method appears to be very similar to the algorithm shown on the PLS regression Wikipedia page.
The purdue.edu implementation cites "Overview and Recent Advances in Partial Least Squares" by Roman Rosipal and Nicole Krämer, LNCS, 2006 (PDF here or here) as a source.
Comparing the purdue.edu PLS1 function with the Wikipedia algorithm, the latter appears to be an optimised version of the former, and it probably cannot be optimised or simplified any further.
My original question arose because the Wikipedia algorithm has no 'inner loop'. I now understand that this is because it is an algorithm specifically for a single response variable, which allows significant further simplification.
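To make the "no inner loop" point concrete, here is a minimal NumPy sketch of PLS1 for a single response. It is my own illustration, not the purdue.edu code and not a line-by-line copy of the Wikipedia pseudocode; the function name `pls1`, the mean-centring step and the `1e-12` stopping tolerance are all assumptions I have made for the example.

```python
import numpy as np

def pls1(X, y, n_components):
    """Sketch of PLS1 (single response): one pass per component, no inner loop."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk = X - x_mean            # centred predictors, deflated as we go (assumption: centring)
    yc = y - y_mean            # centred response
    n_features = X.shape[1]
    W = np.zeros((n_features, n_components))   # weight vectors
    P = np.zeros((n_features, n_components))   # X loadings
    q = np.zeros(n_components)                 # y loadings (scalars)
    for k in range(n_components):
        w = Xk.T @ yc                          # weight is available in closed form
        norm = np.linalg.norm(w)
        if norm < 1e-12:                       # nothing left to explain: stop early
            W, P, q = W[:, :k], P[:, :k], q[:k]
            break
        w /= norm
        t = Xk @ w                             # score vector
        tt = t @ t
        p = Xk.T @ t / tt                      # X loading
        q[k] = yc @ t / tt                     # regression of y on the score
        Xk = Xk - np.outer(t, p)               # deflate X
        W[:, k], P[:, k] = w, p
    coef = W @ np.linalg.solve(P.T @ W, q)     # coefficients in the original X space
    intercept = y_mean - x_mean @ coef
    return coef, intercept
```

Each component is computed in a single pass because the weight vector is just the normalised `Xk.T @ yc`; there is nothing to iterate on. As a sanity check, with noise-free data and as many components as (linearly independent) columns of `X`, the sketch should recover the ordinary least squares coefficients.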
Rosipal and Krämer repeatedly cite "PLS Regression Methods" by A. Höskuldsson, Journal of Chemometrics, vol. 2, pp. 211-228 (1988), which is another excellent reference. It states that "if ... Y is a one-dimensional vector ... the NIPALS procedure converges in a single iteration".
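The single-iteration claim is easy to check numerically. The sketch below runs the inner loop of a generic multi-response NIPALS step a few times and records the weight vector after each pass; the function name `nipals_weight_history` and the particular update order are my own assumptions (variants of NIPALS differ in how they normalise the vectors). For a one-column Y, the weight vector after the first pass should already equal the one after the second.

```python
import numpy as np

def nipals_weight_history(X, Y, n_passes=3):
    """Run the NIPALS inner loop n_passes times and return w after each pass."""
    u = Y[:, 0].copy()                 # start from the first response column
    history = []
    for _ in range(n_passes):
        w = X.T @ u
        w /= np.linalg.norm(w)         # X weights, unit length
        t = X @ w                      # X scores
        c = Y.T @ t / (t @ t)          # Y weights
        u = Y @ c / (c @ c)            # Y scores, fed back into the next pass
        history.append(w)
    return history

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 4))
y = rng.normal(size=(20, 1))           # a single response column

ws = nipals_weight_history(X, y)
print(np.allclose(ws[0], ws[1]))       # w does not change after the first pass
```

The reason is that with one response column `c` is a positive scalar, so the updated `u` is just a rescaled copy of `y` and the normalised `X.T @ u` is unchanged.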
I also found this paper very useful.