Aditya-Singhvi/wav442letter

wav442letter

Fully convolutional speech-to-text model based on Facebook's Wav2Letter. Developed alongside Andrew Schallwig and Matt Palazzolo for EECS 442 at the University of Michigan.

The original paper can be found here.

Our results are summarized below, with Facebook's original results on the left and ours on the right. Our goal was to replicate Facebook's results with far fewer computational resources. We did not match them, but we achieved a reasonable approximation given that we used 0.3% of the training data and 30% of the trainable parameters of the original model.

[Figure: results comparison, with Facebook's original results on the left and ours on the right]

The model was built in PyTorch and trained on the dev-clean subset of the LibriSpeech ASR corpus, available here.
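To illustrate what "fully convolutional" means here, the following is a minimal PyTorch sketch, not the architecture in this repository. The class name `TinyConvASR`, the layer sizes, the 13-dimensional input features, and the 29-symbol output alphabet are all assumptions chosen for illustration. The original Wav2Letter paper trains with its own ASG criterion; this sketch swaps in PyTorch's built-in CTC loss, and which loss this project actually used is not stated above.

```python
import torch
import torch.nn as nn


class TinyConvASR(nn.Module):
    """Hypothetical, scaled-down fully convolutional acoustic model."""

    def __init__(self, n_features=13, n_classes=29, hidden=64):
        super().__init__()
        self.layers = nn.Sequential(
            # Strided first layer halves the time resolution.
            nn.Conv1d(n_features, hidden, kernel_size=11, stride=2, padding=5),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=7, padding=3),
            nn.ReLU(),
            # 1x1 convolution maps every frame to per-character scores.
            nn.Conv1d(hidden, n_classes, kernel_size=1),
        )

    def forward(self, x):
        # x: (batch, n_features, time) -> (batch, n_classes, reduced time)
        return self.layers(x)


model = TinyConvASR()
features = torch.randn(2, 13, 100)  # two fake utterances, 100 frames each
logits = model(features)            # (2, 29, 50) after the stride-2 layer

# nn.CTCLoss expects log-probabilities shaped (time, batch, classes).
log_probs = logits.permute(2, 0, 1).log_softmax(dim=2)
targets = torch.randint(1, 29, (2, 10))  # fake transcripts; index 0 is the blank
input_lengths = torch.full((2,), log_probs.size(0), dtype=torch.long)
target_lengths = torch.full((2,), 10, dtype=torch.long)

loss = nn.CTCLoss(blank=0)(log_probs, targets, input_lengths, target_lengths)
```

Because the network contains only convolutions, it accepts utterances of any length and emits one score vector per (downsampled) frame, which is why a sequence-level loss such as CTC or ASG is needed to align frames with characters.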
