15

This is a more mathematical question. Suppose I have a large representative set of Western music in C major. If I count the occurrences of the individual notes, what is the distribution of the notes?

I would assume that, since the music is in C major, notes like C, E, G, F, A would have a much higher percentage than notes like C#, F#, or A#.

What is the percentage of individual notes in C major music?

(I am aware that this is a complex question. Taking duration into consideration would be even more interesting. But I would be happy with any numbers/papers/sources that give an approximation.)

hr0m
  • 271
  • 1
  • 6
  • Yes, I am aware that I could generalize the question by asking about the distribution of intervals given a key. Such an answer would be great as well. I just thought that C major is a more direct formulation. – hr0m Sep 12 '21 at 22:49
  • 5
    This question is fundamentally unanswerable without setting limits on all the different variables that can affect the count. As one example: If a piece is in C major, but modulates at various points, do the notes in the modulatory sections get counted? What about a piece in some other key that modulates to C major? Do we count pieces written before 12-TET was invented? – Aaron Sep 12 '21 at 22:54
  • 1
    If it helps, I would limit it to 12-TET music of pieces in C major without modulation. You know, the more boring stuff. As I said, pretty much any analysis would already help me. – hr0m Sep 12 '21 at 22:56
  • 4
    You might find this article interesting: it asks your question in reverse. Pitch-Class Distribution and the Identification of Key – Aaron Sep 12 '21 at 22:59
  • 3
    There also might be some interest in Normality Test for Distributions of the Music Metrics. (Disclaimer: I found this via Google search. I can't vouch for its veracity.) – Aaron Sep 12 '21 at 23:02
  • Have you tried? I imagine you're asking for more than personal curiosity. It would be interesting to design a program to do the analysis. OCR, maybe, or MIDI? By the way, narrowing the scope to "pieces that don't modulate" significantly narrows it to a subset of Western canon. Pretty much knocks Wagnerian modal mixture right out of the picture. – Andy Bonner Sep 13 '21 at 00:46
  • Actually, it is mostly personal curiosity. I want to use machine learning on a dataset of MIDI files. For that, I want the songs to be in the same key (CMaj or Amin). This simplifies the learning process. I am using music21 to do the transformation ( https://web.mit.edu/music21/doc/moduleReference/moduleStreamBase.html#music21.stream.base.Stream.analyze ). However, music21 is slow. So I tried to implement a simple stupid transformation myself. I simply take the piece, transpose it to all 12 possibilities and pick the one with the fewest sharp notes, since I am in CMaj or Amin. It actually works. – hr0m Sep 13 '21 at 06:55
  • There are, however, some differences in the transformed dataset, so I wanted to check whether it is "good enough". I had the idea to look at the distribution of notes. Now that I know the distribution, I can compare it to my two different solutions (music21's analyze, least-sharp-notes-transpose).

    It is actually doable with music21, so my "optimization" would not be necessary, yet it halves the execution time... But I got curious :)

    – hr0m Sep 13 '21 at 06:57
  • Since older western art music (aka classical) is in the public domain, it would be a large but very do-able project to analyze it all yourself. If you're not a programmer then you might try Stack Overflow for that. – Bennyboy1973 Sep 13 '21 at 05:36
  • I know one piece for which all 12 tones are exactly evenly utilized: https://www.youtube.com/watch?v=JTEFKFiXSx4 – Carl Witthoft Sep 13 '21 at 16:19
  • Luckily, I filter out empty or near-empty pieces in my dataset first :) – hr0m Sep 13 '21 at 20:49
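The least-sharp-notes heuristic described in the comments above could be sketched roughly as follows. This is only an illustration, not hr0m's actual code: it assumes a piece is already available as a plain list of MIDI note numbers, and the function names are made up for the example.

```python
# Pitch classes of C major / A natural minor (the "white keys").
WHITE_KEYS = {0, 2, 4, 5, 7, 9, 11}

def count_black_keys(midi_notes, shift=0):
    """Count notes that fall outside C major after shifting by `shift` semitones."""
    return sum(1 for n in midi_notes if (n + shift) % 12 not in WHITE_KEYS)

def best_shift(midi_notes):
    """Try all 12 transpositions and return the one with the fewest black-key notes.

    Ties are broken in favour of the smallest shift (an arbitrary choice).
    """
    return min(range(12), key=lambda s: count_black_keys(midi_notes, s))

def transpose_to_white_keys(midi_notes):
    """Transpose the piece by the best shift, moving down instead of up past a tritone."""
    shift = best_shift(midi_notes)
    if shift > 6:
        shift -= 12
    return [n + shift for n in midi_notes]

# A D major scale should land on the C major scale.
d_major = [62, 64, 66, 67, 69, 71, 73]
print(transpose_to_white_keys(d_major))
```

Note that this only finds the best-fitting set of seven pitch classes. It cannot tell C major from A minor (or from any mode on the white keys), and very short or highly chromatic pieces may produce ties.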

1 Answer

26

There's an article, "Pitch-Class Distribution and the Identification of Key" by David Temperley and Elizabeth West Marvin, that gives some information along these lines. I got it on JSTOR, but it was published in Music Perception, a journal you might have access to.

[Image: pitch-class distribution graph from the article]

The distribution varies depending on the overall style (Baroque, Classical, Romantic, popular, jazz, etc.). A lot of work has been done on the subject, but much of it is only available from university libraries or behind paywalls.

There are some ambiguities in the original question. One can ask, "What is the distribution of tones by number of occurrences?" or "What is the total duration of each note?" These are not identical questions. A figure consisting of a chain of off-beat quarter notes (or half notes, as in fourth-species counterpoint) may be split into pairs of eighths or quarters, respectively. Both versions have the same harmonic and melodic significance, but the two counting methods may not agree on them.
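To make the difference between the two counting methods concrete, here is a small sketch. It assumes notes are given as hypothetical `(midi_pitch, duration_in_beats)` pairs; the function name and the data are invented for illustration and are not taken from the article.

```python
from collections import Counter

NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class_distribution(notes, weight_by_duration=False):
    """Return the share of each pitch class, by note count or by total duration."""
    if not notes:
        return {}
    totals = Counter()
    for pitch, duration in notes:
        totals[NAMES[pitch % 12]] += duration if weight_by_duration else 1
    grand_total = sum(totals.values())
    return {name: totals[name] / grand_total for name in NAMES if totals[name]}

# The same music written two ways: one half-note C, or that C split into two quarters.
held = [(60, 2.0), (64, 1.0)]
split = [(60, 1.0), (60, 1.0), (64, 1.0)]

print(pitch_class_distribution(held), pitch_class_distribution(split))
print(pitch_class_distribution(held, True), pitch_class_distribution(split, True))
```

Counting occurrences gives different shares for the two notations, while weighting by duration gives the same result for both.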

Likewise, the existence of enharmonics matters from the point of view of musical structure. An Ab7 chord is Ab-C-Eb-Gb (and normally resolves to a Db chord of some type), whereas the German sixth Ab-C-Eb-F# usually resolves to a C 6/4 chord. It is written with an F# to indicate that the next note is G. It's a complication that you might wish to look into.
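As a rough illustration of why this matters for a count based on twelve pitch classes (as with MIDI data), the sketch below folds spelled note names into pitch-class numbers. The helper is hypothetical and only handles simple names with `#` and `b` accidentals.

```python
LETTER_TO_PC = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def pitch_class(name):
    """Convert a spelled note such as 'F#' or 'Gb' to a pitch class 0..11."""
    pc = LETTER_TO_PC[name[0]]
    for accidental in name[1:]:
        if accidental == "#":
            pc += 1
        elif accidental == "b":
            pc -= 1
        else:
            raise ValueError(f"unsupported accidental in {name!r}")
    return pc % 12

ab7 = ["Ab", "C", "Eb", "Gb"]
german_sixth = ["Ab", "C", "Eb", "F#"]

# The spellings differ, but the pitch classes are identical.
print([pitch_class(n) for n in ab7])
print([pitch_class(n) for n in german_sixth])
```

Once the spelling is folded away, the two chords become indistinguishable, so a pitch-class histogram cannot reflect the different harmonic roles of Gb and F#.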

hr0m
  • 271
  • 1
  • 6
ttw
  • 25,431
  • 1
  • 34
  • 79
  • 1
    A perfect answer. Thank you for your contribution. Yes, you are right, my original question is a little naive, but for my purposes, this is more than enough. – hr0m Sep 13 '21 at 06:49
  • 1
    If you go to the Hooktheory website, you can find stats for pop songs, and the author is very accessible; he gave me a dump of his database one time when I wanted to do some analysis. – Thomas Sep 13 '21 at 12:41
  • 7
    Very interesting that the fifth is more common than the third in major keys, but their prominence is reversed in minor keys. – Michael Seifert Sep 13 '21 at 13:10
  • 1
    @MichaelSeifert: I don't know how to classify them, but I think there are multiple flavors of minor. Some minor-key pieces would make musical sense if the key signature were changed to the parallel major, but some wouldn't. – supercat Sep 13 '21 at 17:03
  • Although the graph for the major scale shows the expected pattern of diatonic notes occurring more often than notes which are foreign to the scale, I'm scratching my head about what the numbers on the y-axis are supposed to mean. In a naive interpretation where 0 means "never occurs", non-diatonic notes seem to be massively overrepresented. The major seventh, for example, seems to be only a bit more common than really strange notes (like b6 or b2) for which I can't name a single example piece of music off the top of my head. – Marc Sep 14 '21 at 12:27
  • @Marc It says in the text that "Figure 3" is taken from openings of string quartets from Mozart and Haydn. Figure 2 is the picture above - but if the source material is similar, I'd guess the non-diatonic notes are coming from secondary dominants. – Daniel Sigurdsson Sep 14 '21 at 15:15
  • @DanielSigurdsson: Thanks, that's probably at least part of the explanation. The numbers still feel a bit inflated but I'm not very familiar with this kind of music and don't have a good feeling for the prevalence of secondary dominants there. – Marc Sep 14 '21 at 16:02