As promised before in a comment, I made a corpus analysis of whether word length, in terms of number of syllables, is correlated with position in the verse.
Spoiler alert: it isn't. brianpck has already done a basic corpus analysis with essentially the right conclusion, but I wanted to do this as thoroughly as possible, both to hone my primitive skills as a statistician (meaning: please point out any errors I may have made) and to make good on the aforementioned promise.
Input corpus
I used a hexameter corpus consisting of:
- Vergilius, Bucolica
- Vergilius, Aeneis
- Ovidius, Metamorphoses
Caveats
Only verses that are scanned correctly by my Latin verse scanning tool are taken into account. This is 94.7% of the existing corpus (21525 out of 22723 verses) as of the time of this writing. There's a whole host of reasons why a verse wouldn't be scanned correctly, e.g.:
- hiatus in the verse
- hypermetricality (elision over verse boundaries)
- unfinished verses
- multiple possible scans
- non-intuitive scansions of foreign words
- other flukes
There is a small number of words whose syllable count just cannot be determined reliably, e.g. the word VOLVI can be either bisyllabic volvi (< volvere) or trisyllabic volui (< velle). There is no way to distinguish between these forms if the vol- part is in the second half of a foot: vo̱l|vi or vo̯lu̯|i, except by leveraging dictionary or even semantic knowledge. But this kind of problem should be rare enough not to influence the data significantly.
I didn't check for elided syllables, so e.g. multum ille et in Verg. Aen. I, 3 is counted as a sequence of words of 2, 2, and 1 syllables.
Data analysis
I compared the actual with the expected length of the words in each verse. This essentially normalizes the data so that every word has an expected value of 1, which makes it possible to use the data from the 4-word verses together with that from the 11-word ones.
For example, any word in a verse of 5 words and 15 syllables in total has an expected syllable count of 3. In a verse of 10 words and 14 syllables in total, the expected syllable count of a word is 1.4.
By doing the calculation (actual * number of words / number of syllables), any word in any verse has an expected normalized weight of 1, no matter how many words and how many total syllables the verse contains.
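The normalization can be sketched in Python as follows. This is only a minimal illustration, not the script I used for the analysis: the function name `normalized_weights` and the representation of a verse as a list of per-word syllable counts are assumptions made for the example, and the sample verse is made up.

```python
def normalized_weights(syllable_counts):
    """Return actual * number_of_words / number_of_syllables for each word.

    `syllable_counts` is a hypothetical representation of one verse:
    a list holding the syllable count of each word, in verse order.
    """
    n_words = len(syllable_counts)
    n_syllables = sum(syllable_counts)
    return [count * n_words / n_syllables for count in syllable_counts]


# A made-up verse of 5 words and 15 syllables: expected length is 3 per word.
weights = normalized_weights([2, 3, 4, 3, 3])

# The weights of a single verse always average to 1 by construction.
mean_weight = sum(weights) / len(weights)
```

A 3-syllable word in this verse gets weight 1, shorter words get less and longer words get more, regardless of how many words or syllables the verse has.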
My results are:
+---------------+-------+-------+-------+-------+-------+-------+-------+------+------+------+------+
| word position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
+---------------+-------+-------+-------+-------+-------+-------+-------+------+------+------+------+
| count | 21525 | 21525 | 21525 | 21525 | 21456 | 18340 | 10133 | 3308 | 636 | 76 | 7 |
| average | 0.95 | 0.92 | 0.97 | 0.98 | 1.03 | 1.07 | 1.12 | 1.17 | 1.20 | 1.17 | 1.43 |
| st dev | 0.40 | 0.33 | 0.35 | 0.35 | 0.39 | 0.37 | 0.34 | 0.33 | 0.38 | 0.44 | 0.43 |
| variance | 0.16 | 0.11 | 0.12 | 0.12 | 0.15 | 0.13 | 0.11 | 0.11 | 0.15 | 0.19 | 0.19 |
| # of st devs | 0.12 | 0.25 | 0.07 | 0.07 | 0.08 | 0.19 | 0.35 | 0.52 | 0.53 | 0.38 | 0.98 |
+---------------+-------+-------+-------+-------+-------+-------+-------+------+------+------+------+
+---------------+-------+-------+-------+-------+-------+-------+-------+------+------+------+-------+
| word position | -1 | -2 | -3 | -4 | -5 | -6 | -7 | -8 | -9 | -10 | -11 |
+---------------+-------+-------+-------+-------+-------+-------+-------+------+------+------+-------+
| count | 21525 | 21525 | 21525 | 21525 | 21456 | 18340 | 10133 | 3308 | 636 | 76 | 7 |
| average | 1.01 | 1.24 | 0.94 | 0.94 | 0.96 | 0.96 | 0.91 | 0.87 | 0.82 | 0.83 | 0.66 |
| st dev | 0.25 | 0.37 | 0.37 | 0.33 | 0.35 | 0.39 | 0.40 | 0.39 | 0.36 | 0.37 | 0.02 |
| variance | 0.06 | 0.14 | 0.14 | 0.11 | 0.13 | 0.15 | 0.16 | 0.15 | 0.13 | 0.14 | 0.00 |
| # of st devs | 0.06 | 0.64 | 0.17 | 0.18 | 0.10 | 0.09 | 0.21 | 0.34 | 0.49 | 0.45 | 16.77 |
+---------------+-------+-------+-------+-------+-------+-------+-------+------+------+------+-------+
Word position -1 means the last word in a verse, -2 the second to last, etc.
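The last row of each table is the most important one, because it indicates how significant the difference between actual and expected data is: it gives the absolute difference between the average and the expected value of 1, divided by the standard deviation. For a difference to be statistically significant, we want the value in this row to be at least 2.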
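For anyone who wants to reproduce the rows of these tables, the aggregation could look roughly like the sketch below. It is a hedged reconstruction, not my actual script: the name `position_stats`, the input format (a list of verses, each a list of per-word syllable counts) and the choice of the population standard deviation are all assumptions.

```python
from collections import defaultdict
from statistics import mean, pstdev


def position_stats(verses):
    """Aggregate normalized word weights by position in the verse.

    `verses` is a hypothetical input: a list of verses, each a list of
    per-word syllable counts. Positive keys count from the start
    (1 = first word), negative keys from the end (-1 = last word).
    Returns {position: (count, average, st dev, number of st devs)}.
    """
    by_position = defaultdict(list)
    for counts in verses:
        n_words, n_syllables = len(counts), sum(counts)
        for i, count in enumerate(counts):
            weight = count * n_words / n_syllables
            by_position[i + 1].append(weight)        # counted from the start
            by_position[i - n_words].append(weight)  # counted from the end

    stats = {}
    for position, weights in by_position.items():
        avg = mean(weights)
        sd = pstdev(weights)  # population st dev: an assumption
        deviation = abs(avg - 1)
        if sd > 0:
            n_sd = deviation / sd
        else:
            # No spread at all: only a zero deviation is unremarkable.
            n_sd = 0.0 if deviation == 0 else float("inf")
        stats[position] = (len(weights), avg, sd, n_sd)
    return stats
```

Calling `position_stats` on the list of scanned verses and reading off the keys 1 to 11 and -1 to -11 would give the count, average, st dev and # of st devs rows; the variance is simply the square of the st dev.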
Conclusion
So the only place where the actual word length deviates significantly from the expected value is the eleventh-to-last word in verses of 11 words, i.e. the first word. All 11-word verses start with a monosyllable, but given that there are only 7 such verses in the entire corpus, this is negligible. If any one of these 7 verses had not started with a monosyllable, the sigma value would have been 1.30 instead of 16.77.
All other values are not statistically significant, so no, there is no correlation between word position and word length expressed in syllables.
FYI: I can provide the raw data on request.