
Memory problems when loading audio dict #20

Open
axelstram opened this issue Mar 4, 2021 · 3 comments

axelstram commented Mar 4, 2021

Hello,

As part of my M.Sc. thesis, I'm trying to train the model on EPIC-KITCHENS 100 from scratch, but when the script starts loading the audio dict, the process stalls and eventually dies. We figured out that it was filling all of our RAM (64 GB). Is there any way around this? Loading the .wav files directly from disk is extremely slow. How much RAM and which GPUs did you use when you trained the model?

Thanks!

ekazakos (Owner) commented Mar 5, 2021

Hi,
Thanks for your interest in my code! I trained the models on 8 Tesla V100 GPUs, and I had 500 GB of RAM. One workaround for your problem is to save the extracted audio in HDF5 format using h5py instead of a Python dictionary. HDF5 reads from disk instead of loading everything into memory, so you will not get memory issues using it.
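For illustration, here is a minimal sketch of that workaround (the filenames are made up, and the random arrays stand in for waveforms that would come from librosa):

```python
import numpy as np
import h5py

# Write each waveform to its own dataset in a single HDF5 file.
with h5py.File("audio.h5", "w") as f:
    for name in ["P01_01.wav", "P01_02.wav"]:
        f.create_dataset(name, data=np.random.randn(24000).astype(np.float32))

# Opening the file for reading does not load the datasets; only the
# slice you index is actually read from disk.
with h5py.File("audio.h5", "r") as f:
    chunk = f["P01_01.wav"][:1000]
```

This is why the pickle approach exhausts RAM while the HDF5 one does not: `pickle.load` materialises the entire dict at once, whereas h5py defers all reads until a dataset is indexed.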

I hope this helps.

axelstram (Author) commented

Hi, thanks for the response!

If I understand correctly, what you are saying is that after saving the audio dict in h5py format, I have to replace line 27 in dataset.py (`self.audio_path = pickle.load(open(audio_path, 'rb'))`) with its equivalent in h5py? In that case, I did that using hickle, a pickle replacement, to save the dict, but the script still tries to load everything into memory.
The other option I thought of was saving every .wav file separately in h5py, setting use_audio_dict to False, and then grabbing the compressed files instead of the .wavs and passing them to librosa, but librosa apparently cannot load a file that is already in memory.

ekazakos (Owner) commented Mar 6, 2021

That is correct. You would have to replace line 27 with something like `self.audio = h5py.File(audio_path, 'r')`. I haven't tried hickle, but I'm quite sure that h5py won't try to load everything into memory; it will read from disk. In h5py you should save the audio numpy vector loaded, for example, with `librosa.core.load` as done in

`samples, sample_rate = librosa.core.load(os.path.join(root, file),`

but instead of saving to a dict you save it to h5py, and you create one h5py dataset per audio file, as the audio from different videos has variable length. So it should be something like:

```python
import h5py
import librosa

f = h5py.File(audio_path, 'w')
for audio_file_name in audio_file_names:
    samples, sample_rate = librosa.core.load(audio_file_name, sr=None, mono=False)
    f.create_dataset(audio_file_name, data=samples)
f.close()
```

This assumes you extracted the audio from the videos with my script, which resamples to 24 kHz and converts the audio to mono; otherwise, use `samples, sample_rate = librosa.core.load(audio_file_name, sr=24000, mono=True)` in the above for-loop.
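On the reading side, the replacement for the pickle load described above might look like this (a sketch; the class and method names here are illustrative, not taken from dataset.py):

```python
import h5py

class AudioStore:
    """Lazily reads per-file audio arrays from an HDF5 file."""

    def __init__(self, audio_path):
        # Replaces: self.audio_dict = pickle.load(open(audio_path, 'rb'))
        # Keep an open read-only handle; nothing is loaded yet.
        self.audio = h5py.File(audio_path, "r")

    def get_samples(self, audio_file_name):
        # [()] reads just this one clip into a numpy array; clips can
        # have different lengths because each is its own dataset.
        return self.audio[audio_file_name][()]
```

Only the clip requested in `get_samples` is read from disk, so peak memory stays proportional to one waveform rather than the whole dataset.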
