Hi @hbredin! It's me again, the guy with the hypothetical rap music dataset. :D I have a few questions about annotating my own dataset. I am interested in speaker diarization, not speaker tracking. Consider the following scenarios:
Replies: 1 comment
For training speaker segmentation models (like voice activity detection or overlapped speech detection), using labels unique at clip level is fine. However, for training speaker embedding models, you should make sure those labels are unique at dataset level.
As long as you do not plan to later distinguish speakers within a class, it should be OK.
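To illustrate the difference between clip-level and dataset-level labels, here is a minimal sketch in plain Python. It does not use any pyannote API; `Segment`, `to_dataset_labels`, `identity_map`, and the clip and artist names are all hypothetical. It assumes you keep a hand-made mapping from (clip, local label) to a real identity. The fallback of prefixing the clip name is only safe if that speaker appears in no other clip.

```python
from typing import Dict, List, NamedTuple, Tuple


class Segment(NamedTuple):
    """One annotated speech turn (hypothetical structure for this sketch)."""
    clip: str     # clip identifier
    start: float  # start time in seconds
    end: float    # end time in seconds
    label: str    # speaker label, unique only within the clip


def to_dataset_labels(
    segments: List[Segment],
    identity_map: Dict[Tuple[str, str], str],
) -> List[Segment]:
    """Replace clip-level labels with labels that are unique across the dataset."""
    relabeled = []
    for seg in segments:
        # Use the known identity when available. Otherwise prefix the clip
        # name, which ASSUMES this speaker occurs in no other clip.
        fallback = f"{seg.clip}/{seg.label}"
        global_label = identity_map.get((seg.clip, seg.label), fallback)
        relabeled.append(seg._replace(label=global_label))
    return relabeled


# "A" in clip1 and "A" in clip2 are different people here.
segments = [
    Segment("clip1", 0.0, 4.2, "A"),
    Segment("clip1", 4.2, 9.0, "B"),
    Segment("clip2", 0.0, 3.5, "A"),
]
identity_map = {
    ("clip1", "A"): "artist_x",
    ("clip2", "A"): "artist_y",
}
relabeled = to_dataset_labels(segments, identity_map)
```

The original clip-level labels are enough for segmentation training; the relabeled version is what you would want before training speaker embeddings.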