Support audio #582

AlekseySh · 2024-06-07T16:08:38Z

Let's add audio to the already supported modalities, such as images and texts.

The plan:

Add an analogue of get_mock_images_dataset
Add the corresponding datasets inherited from the corresponding interfaces
Add a sound-processing NN model to the Zoo, or add a wrapper for third-party sound models (like HFWrapper)
Based on the objects above, create a train / val example.
Evaluate the example on a larger-scale dataset to check that it can be applied to real data.
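
To make the first step of the plan concrete, here is a rough sketch of what an audio analogue of get_mock_images_dataset could look like. It is only an illustration under assumptions, not the library's implementation: the function name `make_mock_audio_dataset`, its parameters, and the file layout are hypothetical, and the index columns (`label`, `path`, `split`, `is_query`, `is_gallery`) are assumed to mirror the ones used for the image datasets. It uses only the Python standard library and writes short sine-wave WAV files, so it has no audio dependencies.

```python
import csv
import math
import struct
import wave
from pathlib import Path


def make_mock_audio_dataset(root, n_labels=2, n_per_label=4,
                            sample_rate=8000, duration_s=0.25):
    """Hypothetical helper: write tiny sine-wave WAVs plus a CSV index.

    Returns the index rows as a list of dicts. Column names are an
    assumption modelled on the image datasets, not a confirmed API.
    """
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    n_samples = int(sample_rate * duration_s)
    rows = []
    for label in range(n_labels):
        for i in range(n_per_label):
            # Each label gets its own base frequency; items differ slightly.
            freq = 220.0 * (label + 1) + 5.0 * i
            frames = b"".join(
                struct.pack(
                    "<h",
                    int(0.5 * 32767 * math.sin(2 * math.pi * freq * t / sample_rate)),
                )
                for t in range(n_samples)
            )
            path = root / f"label{label}_{i}.wav"
            with wave.open(str(path), "wb") as f:
                f.setnchannels(1)   # mono
                f.setsampwidth(2)   # 16-bit PCM
                f.setframerate(sample_rate)
                f.writeframes(frames)
            # Odd-indexed items go to validation and act as query and gallery.
            is_val = i % 2 == 1
            rows.append({
                "label": label,
                "path": str(path),
                "split": "validation" if is_val else "train",
                "is_query": is_val,
                "is_gallery": is_val,
            })
    with open(root / "df.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Calling `make_mock_audio_dataset("mock_audio")` would create the WAV files and a `df.csv` in that folder. A real version would probably download a small fixed set of recordings instead of synthesising tones, so that tests of the datasets and models see realistic input.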

amanteur · 2024-06-07T19:11:39Z

Hi, I'll do it!

AlekseySh · 2024-06-07T19:29:14Z

@amanteur What a coincidence!

You are welcome! :)

AlekseySh added the new feature label Jun 7, 2024

AlekseySh added this to To do in OML-planning via automation Jun 7, 2024

AlekseySh changed the title ~~Support sounds~~ Support audio Jun 7, 2024

AlekseySh assigned amanteur Jun 7, 2024

AlekseySh moved this from To do to In progress in OML-planning Jun 8, 2024

amanteur linked a pull request Jun 17, 2024 that will close this issue

Add audio datasets #598

Open

AlekseySh linked a pull request Jun 17, 2024 that will close this issue

Add audio datasets #598

Open
