I'm trying to tag taboo words in a corpus. I wonder if there are any objective and universal criteria to identify them.
I know taboo words are culturally determined, but I want to be as objective as possible.
This would be a property of individual people, not of words, so your only hope is a pairing of words with subjects' judgments. Even then, you have to devise a definition of "taboo" that is applicable to words. "Taboo" is a relatively strong negative concept, stronger than "rude" or "coarse", so you presumably don't want to conflate all "not nice" words. These negative evaluations are also highly contextual: there are a number of words that I can easily use in a casual conversation with camping buddies that I would not utter in a formal public lecture. So to ask a person whether they consider the B word to be "taboo", you would also have to say what the context is (and whether it is a noun or a verb, and to whom or what it is attributed).
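As a rough illustration of what such a pairing could look like in practice, here is a minimal Python sketch. It assumes you have collected yes/no judgments from raters; the `Judgment` record, its field names, the `taboo_rates` helper, and the sample data are all hypothetical and invented for this example. Each judgment is tied to a word, a part of speech, and a context, and the result is a rate of agreement rather than a single "taboo" verdict.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Judgment:
    """One rater's answer for one word in one stated context (hypothetical schema)."""
    rater: str
    word: str
    pos: str      # part of speech, e.g. "noun" or "verb"
    context: str  # e.g. "casual" or "formal lecture"
    taboo: bool   # did this rater call the word taboo here?


def taboo_rates(judgments):
    """Return the share of raters who judged each (word, pos, context) taboo."""
    counts = defaultdict(lambda: [0, 0])  # key -> [taboo votes, total votes]
    for j in judgments:
        key = (j.word, j.pos, j.context)
        counts[key][0] += int(j.taboo)
        counts[key][1] += 1
    return {key: yes / total for key, (yes, total) in counts.items()}


# Invented sample data: the same word gets different answers in different contexts.
sample = [
    Judgment("r1", "b_word", "noun", "casual", False),
    Judgment("r2", "b_word", "noun", "casual", False),
    Judgment("r1", "b_word", "noun", "formal lecture", True),
    Judgment("r2", "b_word", "noun", "formal lecture", False),
]
rates = taboo_rates(sample)
```

Where to draw the line on such a rate is still a decision you have to make yourself; the sketch only makes the dependence on rater and context explicit.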
An alternative, one that is mechanically doable but unsatisfactory, is to check an authoritative dictionary of English to see whether a word is marked as "vulgar", "rude" or something like that. This would only work for English, and it could never handle a case where an individual cannot say the word for "milk" in Kerewe because the word too closely resembles his mother-in-law's name, and it is taboo to speak your mother-in-law's name in that culture.
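The dictionary approach could be sketched roughly as follows in Python. This assumes you have already exported usage labels from some dictionary into a plain mapping. The `USAGE_LABELS` entries are made-up placeholder words, not taken from any real dictionary, and the set of labels treated as flags (`FLAGGED_LABELS`) as well as the `tag_tokens` helper are assumptions for illustration.

```python
import re

# Hypothetical excerpt: word -> usage labels exported from a dictionary.
# The entries are invented placeholders, not real dictionary data.
USAGE_LABELS = {
    "blarg": {"vulgar"},
    "snerk": {"informal"},
}

# Assumed set of labels that should count as a flag; adjust to your dictionary.
FLAGGED_LABELS = {"vulgar", "rude", "offensive"}


def tag_tokens(text, labels=USAGE_LABELS, flagged=FLAGGED_LABELS):
    """Return (token, matching flagged labels) for each word-like token in text."""
    tagged = []
    # Letters only, with an optional internal apostrophe (e.g. "don't").
    for token in re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text):
        hits = labels.get(token.lower(), set()) & flagged
        tagged.append((token, sorted(hits)))
    return tagged


tagged = tag_tokens("Well, blarg happens; snerk too.")
```

Note that a lookup like this is blind to context and to part of speech, which is exactly why it is unsatisfactory: it tags the word form, not the use.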