Improve performance of speech recognition

Question

Could anyone recommend good speech pre-processing or filtering methods that would improve the performance of my speech recognition? I have primarily been doing image processing work, so I am not very familiar with speech pre-processing methods.

I am currently using Google's Web Speech API for speech to text recognition. This is the current flow of things.

1. Take input from the microphone.
2. Segment the audio using Voice Activity Detection. This is done because Google's API has a limit on the length of the segment it can process.
3. Send the segment to Google's Web Speech API (on a thread).
4. Print the response of the API.
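For readers unfamiliar with the segmentation step, here is a minimal sketch of one possible approach: an energy-threshold Voice Activity Detector that also caps segment length. It is not the detector used in the pipeline above. The function names, the 160-sample frame size, the 0.01 threshold, and the 500-frame cap are all illustrative assumptions, and samples are assumed to be floats in [-1, 1].

```python
def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def segment_speech(samples, frame_len=160, threshold=0.01, max_frames=500):
    """Return (start, end) sample indices of segments judged to be speech.

    A frame counts as speech when its energy exceeds `threshold`
    (an assumed, untuned value). A segment is closed at the first quiet
    frame, or once it reaches `max_frames` frames, so that no segment
    exceeds the length limit of the recognizer.
    """
    segments = []
    seg_start = None  # index of the first frame of the open segment
    for i, energy in enumerate(frame_energies(samples, frame_len)):
        voiced = energy > threshold
        if voiced and seg_start is None:
            seg_start = i
        if seg_start is not None:
            too_long = i - seg_start + 1 >= max_frames
            if not voiced or too_long:
                # A quiet frame is excluded; a voiced frame at the cap is included.
                end = i + 1 if voiced else i
                segments.append((seg_start * frame_len, end * frame_len))
                seg_start = None
    if seg_start is not None:
        # Audio ended while still inside a segment: close at the last full frame.
        segments.append((seg_start * frame_len,
                         (len(samples) // frame_len) * frame_len))
    return segments
```

A fixed threshold like this is fragile in noisy rooms. Estimating the noise floor from the first few frames and setting the threshold relative to it is a common refinement.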

With this pipeline I could only get up to 60-65% speech-to-text accuracy. Is there any way to improve this? Would pre-processing the audio (noise removal, filtering, etc.) help?
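To make "pre-processing" concrete, here is a minimal sketch of three common front-end steps: DC offset removal, a first-order pre-emphasis filter, and peak normalization. The function names, the 0.97 coefficient, and the 0.9 target peak are illustrative assumptions. Whether any of this improves results with a hosted recognizer is not established here and would have to be measured on the actual recordings.

```python
def remove_dc(samples):
    """Subtract the mean so the waveform is centered on zero."""
    mean = sum(samples) / len(samples)
    return [x - mean for x in samples]


def pre_emphasis(samples, coeff=0.97):
    """First-order high-pass filter: y[n] = x[n] - coeff * x[n-1].

    Boosts high frequencies relative to low ones; 0.97 is a
    conventional but assumed value.
    """
    out = [samples[0]]
    for prev, cur in zip(samples, samples[1:]):
        out.append(cur - coeff * prev)
    return out


def peak_normalize(samples, target_peak=0.9):
    """Scale so the largest absolute sample equals `target_peak`."""
    peak = max(abs(x) for x in samples)
    if peak == 0:
        return list(samples)  # pure silence: nothing to scale
    scale = target_peak / peak
    return [x * scale for x in samples]


def preprocess(samples):
    """Apply the three steps in order; samples are floats in [-1, 1]."""
    if not samples:
        return []
    return peak_normalize(pre_emphasis(remove_dc(samples)))
```

A sensible way to evaluate this is to transcribe the same recordings with and without `preprocess` and compare the accuracy, since a step that helps one recognizer can hurt another.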

Please be clear about your question. DSP.SE is not a forum but a formal Q&A site, and it welcomes any question where you apply signal processing algorithms. The simplest answer to the question above: now that you know the Google API gives about 60% accuracy, try a better algorithm! Please ask a specific question about a specific algorithm. - Dipan Mehta, Jun 15 '15 at 14:46
I am not aware of the 'regular' speech pre-processing methods (if any are used). Also, if someone has already used this API with more success, I would like to know what steps they took to make it work. - patel deven, Jun 19 '15 at 09:57
