Implementation of Wave-U-Net for Indian Carnatic music (Saraga dataset). The Saraga dataset is evaluated and studied with Wave-U-Net from several aspects.

its-rajesh/Wave-U-Net

Wave-U-Net Implementation for the Saraga Dataset (Indian Carnatic Music)

Original network: https://github.com/f90/Wave-U-Net-Pytorch. We used it as a baseline and restructured it to load the Saraga dataset.

Saraga Carnatic Dataset:

It has five stems: mixed, vocal, violin, mridangam right, and mridangam left. The mridangam left and right stems are combined into a single audio file (mridangam). The expected output is four stems: vocal, violin, mridangam, and others.

Version 1: 4 Sources (Leakage)

The model is trained to extract four stems: mridangam left, mridangam right, vocal, and violin.

Results:

Metric      With Bleeding Effects
SDR (dB)    -0.19096690404060424
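
For readers unfamiliar with the metric, the sketch below shows the basic energy-ratio definition of SDR (signal-to-distortion ratio) in dB. It is an illustration only: the function name `sdr_db` and the `eps` guard are assumptions, and the numbers reported in this README may have been computed with a different tool or a different SDR variant.

```python
import numpy as np

def sdr_db(reference, estimate, eps=1e-10):
    """Basic SDR in dB: reference energy over error energy.

    Illustrative definition only; not necessarily the evaluation
    code used to produce the results in this repository.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    signal_power = np.sum(reference ** 2)
    error_power = np.sum((reference - estimate) ** 2)
    # eps avoids division by zero for silent or perfectly matched signals
    return 10.0 * np.log10((signal_power + eps) / (error_power + eps))

# Toy usage: a sine wave and an estimate at half the amplitude
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
reference = np.sin(2 * np.pi * 5 * t)
print(sdr_db(reference, 0.5 * reference))
```

Under this definition, a negative value means the error energy exceeds the energy of the reference stem.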

Version 2: 3 Sources (Leakage)

The model is trained to extract three stems: mridangam, vocal, and violin, with minor changes to the data-loading code. The mridangam left and right stems are summed. Wherever a secondary vocal exists, it is added to the primary vocal. Ghatam files are removed.
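
The stem merging described above can be sketched as a sample-wise sum. This is a minimal illustration, not the repository's data-loading code: the helper name `merge_stems` is hypothetical, and it assumes mono stems at the same sample rate that are already loaded as 1-D arrays.

```python
import numpy as np

def merge_stems(*stems):
    """Sum mono stems sample by sample, zero-padding shorter ones.

    Hypothetical helper; assumes all stems share one sample rate.
    """
    length = max(len(s) for s in stems)
    merged = np.zeros(length, dtype=np.float64)
    for s in stems:
        s = np.asarray(s, dtype=np.float64)
        merged[:len(s)] += s
    return merged

# Toy usage with short placeholder arrays instead of real audio
mridangam_left = np.array([0.1, 0.2, 0.3])
mridangam_right = np.array([0.1, 0.1])
mridangam = merge_stems(mridangam_left, mridangam_right)
```

The same helper would apply to adding a secondary vocal to the primary vocal. Summing can push samples outside the [-1, 1] range, so a real pipeline may need to rescale before writing audio files.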

Results:

Metric      With Bleeding Effects
SDR (dB)    1.166956417870889

Version 3: 3 Sources (Leakage Removed)

The model is trained to extract three stems: mridangam (left + right), vocal(s), and violin. Bleeding between the sources is reduced considerably to achieve higher performance.

Results:

Metric      With Bleed           Without Bleed
SDR (dB)    1.166956417870889

Standard MUSDB18HQ

MUSDB18HQ is artificially bled to evaluate the effect of leakage on performance. The target source is kept dominant, and the other sources are bled into it at a level reduced by 10 dB. Trained with the PyTorch Wave-U-Net network.
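
One way to picture the artificial bleeding is shown below. This is a sketch under stated assumptions, not the code used for these experiments: the function name `add_bleed` is hypothetical, the 10 dB reduction is interpreted as an amplitude gain of 10^(-10/20), and all stems are assumed to be equal-length arrays.

```python
import numpy as np

def add_bleed(stems, attenuation_db=10.0):
    """Return stems where each target also contains the other sources,
    attenuated by `attenuation_db`.

    Assumption: the dB reduction applies to amplitude, so the gain
    is 10 ** (-attenuation_db / 20).
    """
    gain = 10.0 ** (-attenuation_db / 20.0)
    arrays = {name: np.asarray(x, dtype=np.float64) for name, x in stems.items()}
    total = sum(arrays.values())
    bled = {}
    for name, target in arrays.items():
        others = total - target  # sum of every source except the target
        bled[name] = target + gain * others
    return bled

# Toy usage with two short placeholder stems
clean = {"vocals": [1.0, 0.0], "drums": [0.0, 1.0]}
bled = add_bleed(clean)
```

With a 10 dB reduction the interfering sources are scaled to roughly 0.316 of their original amplitude, so the target stays dominant in each bled stem.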

Results:

Metric      Actual               With Bleed           After Bleed Removal
SDR (dB)    2.309013108265547    0.9656615837211856   1.729928257040701
