4

I have a WAV file that contains a subject's speech. The subject speaks one sentence at a time, followed by a short period of silence. I'd like to analyze the phonemes in that speech and the time at which each phoneme occurs. For instance, I am looking for something like this:

6.5-6.8 'AE'
6.8-7.0 'NG'

Is there any software that supports this?

Sir Cornflakes
cyberic

2 Answers

7

Broadly, you're describing the entire (not-fully-solved) problem of automatic speech recognition/automatic transcription.

However, if you have the text of the sentences (e.g., if the recordings are scripted, or if you've manually transcribed their speech), then the problem is more tractable: you want 'forced alignment'.

A popular software option for that is the Penn Phonetics Lab Forced Aligner (available at http://web.sas.upenn.edu/phonetics-lab/facilities/). There is documentation, but you might also do a web search for tutorials and guides.
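Once an aligner has produced its output, getting from there to the "start-end 'PHONE'" listing in the question is a small parsing job. Below is a minimal sketch, assuming the alignment was saved as a Praat TextGrid in the long text format and that the phoneme tier is named "phone"; the tier name, the helper names `read_tier` and `format_intervals`, and the sample data are illustrative assumptions, not taken from the aligner's documentation.

```python
import re

# One interval entry in Praat's long-format TextGrid: xmin, xmax, then text.
INTERVAL_RE = re.compile(
    r'xmin\s*=\s*([\d.]+)\s*'
    r'xmax\s*=\s*([\d.]+)\s*'
    r'text\s*=\s*"([^"]*)"'
)

def read_tier(textgrid_text, tier_name):
    """Return (start, end, label) tuples for the non-empty intervals of one tier."""
    # Each tier begins with a line such as "item [1]:"; split the file there.
    for chunk in re.split(r'item\s*\[\d+\]:', textgrid_text):
        if re.search(r'name\s*=\s*"%s"' % re.escape(tier_name), chunk):
            return [
                (float(start), float(end), label)
                for start, end, label in INTERVAL_RE.findall(chunk)
                if label.strip()  # skip unlabeled (silent) intervals
            ]
    return []

def format_intervals(intervals):
    """Render intervals as lines like: 6.50-6.80 'AE'."""
    return ["%.2f-%.2f '%s'" % (start, end, label) for start, end, label in intervals]

# Made-up sample for illustration only; the tier name "phone" is an assumption.
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 7.0
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "phone"
        xmin = 0
        xmax = 7.0
        intervals: size = 3
        intervals [1]:
            xmin = 0
            xmax = 6.5
            text = ""
        intervals [2]:
            xmin = 6.5
            xmax = 6.8
            text = "AE"
        intervals [3]:
            xmin = 6.8
            xmax = 7.0
            text = "NG"
'''

if __name__ == "__main__":
    for line in format_intervals(read_tier(SAMPLE, "phone")):
        print(line)
```

For a real file, read its contents with `open(path, encoding=...)` and pass the text to `read_tier`; Praat can save TextGrids in several encodings and in a short format as well, which this sketch does not handle.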

Jeremy Needle
  • The software you mentioned is perfect if you have the text. I validated its output using Praat software (aurally and visually) and it was quite accurate. – cyberic Dec 19 '18 at 22:50
4

Praat is the main program used to analyze sound data for phonetics research. It's available for free at the link. You can use the program to add markers and replay snippets, as well as analyze formants.