I have been working on a machine learning project for speaker recognition. I need more audio files from different speakers to improve the accuracy of my algorithm*.

I'm using the files to recognize who is speaking. It's not a speech-to-text algorithm.

I'm using just digits (initially only zero) to simplify my task, because it's still a proof of concept.

*I tested my model with just 10 speakers and it reached 55% accuracy. I want to add more samples to get a higher percentage.

1 Answer

There are multiple datasets of spoken digits.

AudioMNIST

30,000 audio samples of spoken digits (0-9) from 60 different speakers.

Free Spoken Digit Dataset (FSDD)

5 speakers, 2,500 recordings (50 of each digit per speaker).
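Since the question is about speaker recognition on a single digit, here is a minimal sketch of how the recordings could be grouped by speaker once downloaded. It assumes the files are named `digit_speaker_index.wav`; check that against the dataset's README before relying on it. The function names and the example file names (`alice`, `bob`) are hypothetical and only for illustration.

```python
from collections import defaultdict
from pathlib import Path


def parse_name(filename):
    """Split an assumed 'digit_speaker_index.wav' name into its three parts."""
    stem = Path(filename).stem
    # Split off the digit from the left and the index from the right,
    # so a speaker name containing underscores still parses.
    digit, rest = stem.split("_", 1)
    speaker, index = rest.rsplit("_", 1)
    return int(digit), speaker, int(index)


def files_by_speaker(filenames, digit=None):
    """Group file names by speaker, optionally keeping only one digit."""
    groups = defaultdict(list)
    for name in filenames:
        d, speaker, _ = parse_name(name)
        if digit is None or d == digit:
            groups[speaker].append(name)
    return dict(groups)


# Hypothetical file names, used only to show the grouping.
example = ["0_alice_0.wav", "0_bob_3.wav", "7_alice_12.wav"]
zeros = files_by_speaker(example, digit=0)
```

With `digit=0` this keeps only the recordings of "zero", which matches the proof-of-concept setup in the question; the speaker name then serves as the class label.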

Jon Nordby