Question 1

What is the Common Voice dataset?

Accepted Answer

The Common Voice dataset is a collection of speech recordings with matching transcripts. Each entry consists of a unique MP3 and a corresponding text file. Many of the 1,368 recorded hours also include demographic metadata such as age, sex, and accent, which can help improve the accuracy of speech recognition engines. The dataset currently consists of 1,087 validated hours in 18 languages, and more voices and languages are continually being added.
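To make the entry structure concrete, here is a minimal Python sketch of how one might model an entry (an MP3 path, its transcript, and optional demographic metadata) and pick out the entries that carry full metadata. The tab-separated layout, the column names (`path`, `sentence`, `age`, `sex`, `accent`), and the sample rows are assumptions made for illustration, not the dataset's documented format.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Clip:
    """One dataset entry: an MP3 file, its transcript, and optional metadata."""
    path: str
    sentence: str
    age: Optional[str]
    sex: Optional[str]
    accent: Optional[str]


def load_clips(tsv_text: str) -> List[Clip]:
    """Parse tab-separated rows into Clip objects (column names are hypothetical)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        Clip(
            path=row["path"],
            sentence=row["sentence"],
            # Empty metadata cells become None so they are easy to filter on.
            age=row.get("age") or None,
            sex=row.get("sex") or None,
            accent=row.get("accent") or None,
        )
        for row in reader
    ]


# Made-up sample rows; the second entry has no demographic metadata.
SAMPLE_TSV = (
    "path\tsentence\tage\tsex\taccent\n"
    "clip_0001.mp3\tThe quick brown fox.\ttwenties\tfemale\tus\n"
    "clip_0002.mp3\tHello world.\t\t\t\n"
)

clips = load_clips(SAMPLE_TSV)

# Keep only the entries that include all three metadata fields.
with_metadata = [c for c in clips if c.age and c.sex and c.accent]
```

Treating missing metadata as `None` reflects the point above that only some of the recorded hours include demographic fields, so any training pipeline has to handle entries without them.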

Question 2

What type of tool is the Common Voice dataset?

Accepted Answer

The Common Voice dataset is an AI resource focused on audio data, speech recognition, and training data.

Question 3

Who makes the Common Voice dataset?

Accepted Answer

The Common Voice dataset is made by Mozilla (https://voice.mozilla.org/en/datasets).
