How to train with large dataset #196

Open
Bach1502 opened this issue Oct 2, 2021 · 5 comments

Comments


Bach1502 commented Oct 2, 2021

Hello,
I believe this is a fairly simple question, but since I'm very new to ML in general, it still baffles me. I followed the training instructions and successfully trained a model on one pair of files (a clean speech.wav and a noise.wav). How can I repeat this process for a larger dataset? I currently have about 300 files in each of these two categories, and I don't think running the process 300 times is the way to go.

Thanks.


Zadagu commented Oct 12, 2021

Just concatenate the audio files.
But be aware that the input format is not .wav; it's plain PCM without any header.
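For illustration, a minimal sketch of stripping the header from a single .wav using the soundfile package (the file names are placeholders, and the file is assumed to already be mono at the sample rate the training tool expects, commonly 48 kHz):

```python
import numpy as np
import soundfile as sf

# Read the .wav as 16-bit samples. sf.read returns a (frames,) array for
# mono input or (frames, channels) for multichannel input.
data, rate = sf.read("clean_speech.wav", dtype="int16")
if data.ndim > 1:
    data = data.mean(axis=1).astype(np.int16)  # downmix to mono

# Write the raw samples with no header (native-endian int16,
# i.e. little-endian on typical x86/ARM machines).
data.tofile("clean_speech.raw")
```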

Bach1502 (Author) commented

Thank you, I will try it and see if it works.


ZihCode commented Aug 4, 2022

I want to know how to concatenate the audio files. Did you use any particular tools, or did you just copy the RAW files and paste them into one file? How can I produce one long RAW file? I would be very grateful if you could help me.


Zadagu commented Aug 4, 2022

I wrote a Python script to concatenate the files. For reading the audio files I used the soundfile package, and I resampled where needed using scipy.
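The script itself isn't posted in the thread, but a rough sketch along those lines might look like the following (the directory layout, output file name, and 48 kHz mono target are assumptions for illustration, not something stated above):

```python
import glob
from math import gcd

import numpy as np
import soundfile as sf
from scipy.signal import resample_poly

TARGET_RATE = 48000  # assumed target sample rate for the training data

with open("clean_speech_all.raw", "wb") as out:
    for path in sorted(glob.glob("clean/*.wav")):
        # Read as float32 in [-1, 1] so resampling doesn't clip or overflow.
        data, rate = sf.read(path, dtype="float32")
        if data.ndim > 1:
            data = data.mean(axis=1)  # downmix to mono
        if rate != TARGET_RATE:
            g = gcd(TARGET_RATE, rate)
            data = resample_poly(data, TARGET_RATE // g, rate // g)
        # Convert back to 16-bit integers and append as headerless PCM.
        pcm = np.clip(data * 32768.0, -32768, 32767).astype(np.int16)
        pcm.tofile(out)
```

Running this once per category (clean speech and noise) yields one long RAW file for each, which can then be fed to the training step as before.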


Zadagu commented Aug 9, 2022

Sorry, but I think your behavior in the GitHub issues is somewhat inappropriate.
You spammed the very same question three times across multiple issues:
#208
#201 (comment)
#196
You can answer your question yourself by reading the rnnoise paper and newer speech enhancement papers.
They all report how much data they use.
