6

Where can I find an Italian phonetic transcription dictionary like CMUDict is for US English?

CMUSphinx says that they have an Italian pronunciation dictionary, but I cannot find it for download anywhere.

If there is no such thing, from which dictionary can I easily extract Italian phonetics?

Charo
user3956
  • Welcome to Italian.SE! At the following link you can find our list of resources about Italian. – Charo Nov 01 '17 at 22:43
  • What are CMUDict and CMUSphinx? – DaG Nov 01 '17 at 23:06
  • @DaG: CMUSphinx is a voice recognition engine developed at Carnegie Mellon ... University. It uses dictionaries that map words to their pronunciations. CMUDict example: HELLO HH EH L OW. Having such a file is of great help in TTS or VR development. – Dalen Nov 02 '17 at 00:27
  • Thanks, @Dalen. This kind of phonetic coding with EH and OW is not used in Italy, so the OP will probably have to settle for something using IPA or other codings. – DaG Nov 02 '17 at 09:01
  • No, they use their own notation, which is sometimes more human-friendly. But it is easily convertible to IPA, SAMPA or whatever you need. Their notation sometimes has drawbacks as well. For instance, in CMUDict, at least in the version I currently have, the "A" in additional and in about, and the "u" in button, are all defined as "AH", which is absolutely unacceptable. – Dalen Nov 02 '17 at 09:14
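
To illustrate the conversion mentioned in the comments, here is a minimal sketch that turns a CMUDict-style phone sequence into IPA. The table `ARPABET_TO_IPA` and the function `arpabet_to_ipa` are hypothetical names made up for this example, and the table covers only a handful of symbols. It assumes a CMUDict release that carries stress digits on vowels, where an unstressed `AH0` is conventionally read as a schwa.

```python
import re

# Partial ARPAbet-to-IPA table (illustrative only, far from complete).
ARPABET_TO_IPA = {
    "HH": "h", "EH": "ɛ", "L": "l", "OW": "oʊ",
    "AH": "ʌ", "B": "b", "T": "t", "N": "n", "AW": "aʊ",
}


def arpabet_to_ipa(phones):
    """Convert a list of ARPAbet phones (optionally with stress digits) to an IPA string."""
    out = []
    for phone in phones:
        match = re.fullmatch(r"([A-Z]+)([0-2]?)", phone)
        if match is None:
            raise ValueError(f"not an ARPAbet symbol: {phone!r}")
        base, stress = match.groups()
        if base == "AH" and stress == "0":
            # Assumption: unstressed AH is rendered as schwa.
            out.append("ə")
        else:
            # Raises KeyError for symbols missing from the partial table.
            out.append(ARPABET_TO_IPA[base])
    return "".join(out)


print(arpabet_to_ipa("HH EH L OW".split()))
print(arpabet_to_ipa(["AH0", "B", "AW1", "T"]))
```

A full converter would need the complete symbol inventory and a decision on how to mark stress in the IPA output; this sketch simply drops it.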

3 Answers

7

A good Italian pronunciation dictionary is DOP, Dizionario d’ortografia e di pronunzia della RAI. In this dictionary, you can see phonetic transcriptions and listen to the pronunciation of words.

Charo
3

I found the CMU Sphinx's whole acoustic model for Italian. The *.tar.gz contains the pronunciation dictionary. And it is, unfortunately, terrible.

For example, according to it, "zucchero" should be pronounced as if it were written "zucero"; digraphs like "ll" aren't treated as a single phoneme; some accents are missing; and so on.

You can download the model at: https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/Italian/

You will then find the *.dic file inside the archive.
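
As a rough sketch of that step, the following reads every `*.dic` member straight out of a downloaded `.tar.gz` and builds a word-to-pronunciations mapping. The function names `parse_dic_lines` and `load_dic_from_archive` are hypothetical, and the code assumes the usual Sphinx dictionary layout (one `WORD PH1 PH2 ...` entry per line, alternates written as `WORD(2)`, comment lines starting with `;;;`) and UTF-8 text; check both against the actual file.

```python
import io
import tarfile
from collections import defaultdict


def parse_dic_lines(lines):
    """Parse Sphinx-style dictionary lines into {word: [phone list, ...]}."""
    entries = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and comments
        word, *phones = line.split()
        if not phones:
            continue  # a word with no phones carries no pronunciation
        # Fold alternate entries such as "WORD(2)" into the base word.
        if word.endswith(")") and "(" in word:
            word = word[: word.rindex("(")]
        entries[word].append(phones)
    return dict(entries)


def load_dic_from_archive(archive_path):
    """Collect entries from every *.dic file inside a .tar.gz archive."""
    entries = {}
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile() and member.name.endswith(".dic"):
                with tar.extractfile(member) as raw:
                    # Assumption: the dictionary is UTF-8 encoded.
                    text = io.TextIOWrapper(raw, encoding="utf-8", errors="replace")
                    for word, prons in parse_dic_lines(text).items():
                        entries.setdefault(word, []).extend(prons)
    return entries
```

Calling `load_dic_from_archive` with the path of the downloaded archive and looking up a word such as `"zucchero"` is a quick way to inspect the entries in question.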

But there must be some dictionary that can be turned into a similarly mapped file. Does anybody know of one? A PDF or something!

Dalen
  • I don't think “ll” is a digraph in Italian: it denotes gemination of the “l”, and essentially all consonants can be geminated (some are “automatically” geminated, like intervocalic “z” in “azione”). – egreg Nov 02 '17 at 11:35
  • Well, yes, you're right. But phonetically speaking, "l" and "ll" result in two different phonemes, so I didn't know what to call it. A proper digraph would be "gn". – Dalen Nov 02 '17 at 13:32
  • Certainly so. In Italian, gemination is phonemically distinctive: “pala” and “palla” are different. – egreg Nov 02 '17 at 13:35
0

I recently released WikiPronunciationDict, a multilingual pronunciation dictionary based on data from Wiktionary. It currently contains pronunciations for about 90,000 Italian words.

Here's a direct link to the Italian data file.

Daniel Wolf