Support audio #582

Open
AlekseySh opened this issue Jun 7, 2024 · 2 comments · May be fixed by #598

@AlekseySh
Contributor

AlekseySh commented Jun 7, 2024

Let's add audio to the already supported modalities, such as images and texts.

The plan:

  • Add an audio analogue of get_mock_images_dataset
  • Add the corresponding audio datasets, inherited from the existing dataset interfaces
  • Add a sound-processing NN model to the Zoo, or add a wrapper around 3rd-party sound models (like HFWrapper)
  • Based on the objects above, create a train / val example
  • Evaluate the example on a larger-scale dataset to check that it can be applied to real data
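
For illustration, the first item of the plan could look roughly like the sketch below. It is only an assumption of what a mock audio dataset might be: it uses the Python standard library to write a few short sine-wave WAV clips and returns train / val records. The names `write_sine_wav` and `make_mock_audio_dataset`, and the `path` / `label` keys, are hypothetical and are not part of OML.

```python
import math
import struct
import tempfile
import wave
from pathlib import Path


def write_sine_wav(path, freq_hz, duration_s=0.1, sample_rate=16000):
    """Write a mono 16-bit PCM sine wave to `path` and return the frame count."""
    n_frames = int(duration_s * sample_rate)
    samples = (
        int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n_frames)
    )
    frames = b"".join(struct.pack("<h", s) for s in samples)
    with wave.open(str(path), "wb") as f:
        f.setnchannels(1)   # mono
        f.setsampwidth(2)   # 16-bit samples
        f.setframerate(sample_rate)
        f.writeframes(frames)
    return n_frames


def make_mock_audio_dataset(root, n_labels=2, n_per_label=3):
    """Hypothetical helper (not an OML function).

    Creates n_labels * n_per_label tiny WAV files under `root` and returns
    (train_rows, val_rows), where each row is {"path": ..., "label": ...}.
    The last clip of every label goes to the validation split.
    """
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    train_rows, val_rows = [], []
    for label in range(n_labels):
        for k in range(n_per_label):
            path = root / f"label{label}_{k}.wav"
            # Each label gets its own base pitch; clips differ slightly within a label.
            write_sine_wav(path, freq_hz=220.0 * (label + 1) + 5.0 * k)
            row = {"path": str(path), "label": label}
            (val_rows if k == n_per_label - 1 else train_rows).append(row)
    return train_rows, val_rows


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        train_rows, val_rows = make_mock_audio_dataset(tmp_dir)
        print(len(train_rows), "train clips,", len(val_rows), "val clips")
```

The real helper would probably mirror whatever get_mock_images_dataset returns so that the audio datasets can consume it directly; the list-of-dicts format here is just a placeholder.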
@AlekseySh AlekseySh added this to To do in OML-planning via automation Jun 7, 2024
@AlekseySh AlekseySh changed the title from "Support sounds" to "Support audio" Jun 7, 2024
@amanteur
Contributor

amanteur commented Jun 7, 2024

Hi, I'll do it!

@AlekseySh
Contributor Author

@amanteur What a coincidence!

You are welcome! :)

@AlekseySh AlekseySh moved this from To do to In progress in OML-planning Jun 8, 2024
@amanteur amanteur linked a pull request Jun 17, 2024 that will close this issue
@AlekseySh AlekseySh linked a pull request Jun 17, 2024 that will close this issue
Projects
OML-planning
In progress