So, I used Pearson's R in one of my scholarly research papers, but I'm not sure which of the many historical papers I should cite. What kind of source does the community typically cite? On Wikipedia, for example, 6 papers are cited in the introduction.
- Why cite any at all? I am all for historical scholarship, but I've not usually cited an authority for correlation in any paper that used it. (The one exception I can recall did have a strong historical flavour: http://www.stata-journal.com/sjpdf.html?articlenum=pr0041.) Would you (want to) cite an authority for the mean or median or the summation sign? Stuff that is in any introductory text hardly needs an authority. – Nick Cox Dec 16 '16 at 09:07
- This was named after Karl Pearson but may have been used by others before him, notably Sir Francis Galton. If you are going to cite just one, which one should we take? – Michael R. Chernick Dec 16 '16 at 19:13
- @NickCox Good for you for citing Karl Pearson's 1896 paper. Galton was publishing on regression (a term we think he coined) around the same time, and he knew how the slope in simple linear regression relates to the Pearson correlation. – Michael R. Chernick Dec 16 '16 at 19:20
- The history is complicated but Pearson getting the credit is, for once, not a travesty of history. He built on Galton's intuitions and explorations and Yule in turn had (by modern standards) a better idea of what was most important about correlation. – Nick Cox Dec 16 '16 at 20:27
- @NickCox This is not a bad answer; perhaps you can post it as a formal submission. – Mikhail Dec 17 '16 at 01:33
- @NickCox I think different people have different opinions and I don't think it is difficult to know how to attribute credit in citations when we weren't around at the time. – Michael R. Chernick Dec 17 '16 at 13:58
- I always cite the ancient Babylonians whenever I add or multiply numbers in a paper. :-) – whuber Jun 03 '19 at 13:51
3 Answers
For general knowledge like this, it is often better to cite a suitable reference (a book or paper that explains the term or method in modern language) rather than chase the perfect original prehistoric reference. You just want to make sure that the reader knows what you mean by Pearson's R.
For this one, I would follow: http://citebay.com/how-to-cite/pearson-correlation/
- Much better to cite a standard textbook. Freedman, Pisani, and Purves's Statistics is much better than that arbitrary reference. "Prehistoric" is the wrong word for late 19th century work: many of the key journals are widely accessible online. – Nick Cox Jun 03 '19 at 12:28
- The reference is wrong anyway: the material is not on pages 1 to 4 of the book cited! – Nick Cox Jun 03 '19 at 12:33
- I am not sure whether I am pleased that this site is willing to change or worried that it will do so just on someone's say-so. Naturally I believe my own judgment on this is good, but why would anyone else change their minds without knowing me? – Nick Cox Jun 03 '19 at 15:17
Pearson's sample correlation statistic is sufficiently well-known in statistics that it would be acceptable to use it without citation. If you would like to include a citation, a good one is Rodgers and Nicewander (1988), which gives historical information on this statistic, and thirteen different interpretations of the statistic. This is a nice reference for most readers, since it gives them a range of possible interpretations.
- There was a sequel by others: Rovine, M. J., and A. Von Eye. 1997. A 14th way to look at the correlation coefficient: Correlation as the proportion of matches. American Statistician 51: 42–46. FWIW, I don't especially agree that either is a good stand-alone reference: those who need or want a reference are more likely to appreciate something basic, but authors have the responsibility to think about their intended readership. We can't do that for them. – Nick Cox Jul 23 '20 at 10:05
If there is a need to explain why the Pearson correlation coefficient is used instead of other indices, then a publication that reports a comparison study is desirable. For example, I might cite the passage below for that purpose.
"Correlations between variables can be measured with the use of different indices (coefficients). The three most popular are: Pearson’s coefficient, Spearman’s rho coefficient, and Kendall’s tau coefficient" (Hauke, J., & Kossowski, T., 2011).
- I like your first sentence as a key idea, but describing what is popular doesn't explain why something is a good idea. See any day's news for counter-examples. – Nick Cox Jul 23 '20 at 10:08