Proceedings of The 12th Language Resources and Evaluation Conference (LREC 2020).
Embedding Projections
Visualizing the MFCC vectors of speech samples from the dataset gives some insight into how the samples in AccentDB are distributed. We share multiple embedding visualizations of our dataset, one with 4 accents and one with 9 accents. We include 150 files from each class in the TensorFlow Embedding Projector tool below (please choose Color by -> Label in the left menu to differentiate between the classes). You can also perform PCA, UMAP and t-SNE decomposition. Read more in section 2.6 or download processed vectors and metadata
Explore Embeddings
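To build a similar view from your own feature vectors, the sketch below picks up to 150 vectors per class and writes them in the tab-separated layout that the Embedding Projector can load: one vector per line in one file, one label per line in a second file. It is a minimal illustration and not the authors' pipeline. The labels `accent_a` and `accent_b`, the 13-dimensional random vectors, and the function names are placeholders invented for this example; real MFCC vectors would come from your own feature extraction step.

```python
import csv
import io
import random
from collections import defaultdict


def sample_per_class(items, per_class=150, seed=0):
    """Return up to `per_class` (vector, label) pairs for every label."""
    by_label = defaultdict(list)
    for vector, label in items:
        by_label[label].append(vector)
    rng = random.Random(seed)
    sampled = []
    for label in sorted(by_label):
        vectors = by_label[label]
        chosen = rng.sample(vectors, min(per_class, len(vectors)))
        sampled.extend((vector, label) for vector in chosen)
    return sampled


def write_projector_tsv(items, vectors_out, metadata_out):
    """Write one tab-separated vector per line and one label per line."""
    vec_writer = csv.writer(vectors_out, delimiter="\t", lineterminator="\n")
    meta_writer = csv.writer(metadata_out, delimiter="\t", lineterminator="\n")
    for vector, label in items:
        vec_writer.writerow([f"{value:.6f}" for value in vector])
        # Single-column metadata is written without a header row.
        meta_writer.writerow([label])


# Placeholder data: random 13-dimensional vectors standing in for MFCC features.
toy_rng = random.Random(42)
toy_items = [
    ([toy_rng.gauss(0.0, 1.0) for _ in range(13)], label)
    for label in ("accent_a", "accent_b")  # hypothetical class names
    for _ in range(200)
]

subset = sample_per_class(toy_items, per_class=150)

# In-memory buffers keep the example self-contained; swap in
# open("vectors.tsv", "w", newline="") to write real files.
vectors_buf, metadata_buf = io.StringIO(), io.StringIO()
write_projector_tsv(subset, vectors_buf, metadata_buf)

print(len(subset))  # 300: 150 vectors for each of the two toy classes
```

Fixing the random seed keeps the sampled subset reproducible, which helps when comparing PCA, UMAP and t-SNE views of the same points.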
Dataset
The current release v1.0 of AccentDB has three datasets licensed under a CC BY-NC 4.0 License.
release v1.0
Size | Title | Description | Notes |
---|---|---|---|
2.8 GB | accentdb_core | 4 non-native Indian English accents collected by the authors. | 6,587 files |
3.9 GB | accentdb_extended | Samples for 5 English accents + 4 accents from accentdb_core. | 19,111 files |
1.3 GB | accentdb_raw | Raw and unprocessed recordings for the core dataset. | 11 files |
To let you experiment with a smaller AccentDB dataset, we share the classification model described in section 3.1.2. You can try out the model and the dataset in a Colab notebook in your browser.
Open in Google Colab
Citation
If you find our dataset or models useful, please cite us as below.
@InProceedings{ahamad-anand-bhargava:2020:LREC,
  author    = {Ahamad, Afroz and Anand, Ankit and Bhargava, Pranesh},
  title     = {AccentDB: A Database of Non-Native English Accents to Assist Neural Speech Recognition},
  booktitle = {Proceedings of The 12th Language Resources and Evaluation Conference},
  month     = {May},
  year      = {2020},
  address   = {Marseille, France},
  publisher = {European Language Resources Association},
  pages     = {5353--5360},
  url       = {https://www.aclweb.org/anthology/2020.lrec-1.659}
}
People
- Afroz Ahamad
- Ankit Anand
- Dr. Pranesh Bhargava, Principal Investigator