Repo for the implementation of a keyword-based ASR system
- create a virtual environment and install the requirements from `requirements.txt`
  NOTE: NeMo toolkit is not supported on Windows, so WSL or a UNIX-based OS is required, see the docs or github
- the minimal, necessary data is already in the repo, but to reproduce the training process and/or test other keywords you need to download the full datasets from this link (if the link doesn't work, please contact me via email: [email protected]) and put them in the `Data` directory (check the Data README for the structure)
- modify the config file to match your setup (all paths with the suffix `DATA_DIR` should be changed to match your setup)
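
After editing the config, a quick check that every `DATA_DIR` path actually exists can save a failed notebook run. The sketch below is only an illustration: it assumes the config values can be loaded into a plain Python dict, and the function name `missing_data_dirs` and the example keys are hypothetical, not taken from this repo.

```python
from pathlib import Path


def missing_data_dirs(config: dict) -> list:
    """Return names of entries ending in DATA_DIR whose paths are not existing directories."""
    missing = []
    for name, value in config.items():
        # Only entries whose name ends with DATA_DIR are treated as data paths.
        if name.endswith("DATA_DIR") and not Path(str(value)).expanduser().is_dir():
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Hypothetical values for illustration; use the ones from your own config file.
    example_config = {
        "KEYWORDS_DATA_DIR": "Data/keywords",
        "SAMPLE_RATE": 16000,
    }
    for name in missing_data_dirs(example_config):
        print(f"{name} does not point to an existing directory")
```

Entries that do not end in `DATA_DIR` are ignored, so non-path settings can stay in the same dict.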
- Data - directory with data
- Utils - directory containing utility scripts and config files
- Models - directory containing trained models' weights
- notebooks in root directory
  - main
    - `demo.ipynb` - notebook demonstrating the dual-model keyword-based speaker recognition system
    - `speaker-recognition.ipynb` - notebook with speaker recognition model training and evaluation
    - `keyword-recognition.ipynb` - notebook with keyword spotting model demonstration and evaluation
  - supplementary
    - `get-data.ipynb` - check audio files metadata
    - `visualize-spectrograms.ipynb` - visualize spectrograms of audio files
    - `play-sound.ipynb` - sanity-check audio files