Hello, thank you for sharing this valuable code! I have several questions about it.

The default training parameters differ from those of the pre-trained model in the repo. The default setting uses 229 mel bins (the same as the paper), but the pre-trained model uses 300, and the f_min and f_max values also differ. I also found that the pre-trained model has one extra conv layer in PreConvSpec. Do these changes have a meaningful effect on performance?
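To make the difference concrete, here is a minimal sketch of how the bin count and frequency range shift the spacing of a mel filterbank, using the standard HTK mel formula. The f_min/f_max values below are placeholders for illustration, not the repo's actual settings:

```python
import math

def hz_to_mel(f):
    # HTK mel scale
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_center_freqs(n_mels, f_min, f_max):
    """Center frequencies (Hz) of an n_mels-filter mel filterbank
    spanning [f_min, f_max]."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    # n_mels triangular filters need n_mels + 2 boundary points
    points = [m_min + i * (m_max - m_min) / (n_mels + 1)
              for i in range(n_mels + 2)]
    return [mel_to_hz(m) for m in points[1:-1]]

# Placeholder configs: 229 bins over one range vs. 300 bins over a
# wider one -- more bins and a higher f_max give finer coverage of
# the top octaves, where high-note harmonics live.
default_centers = mel_center_freqs(229, 30.0, 8000.0)
pretrained_centers = mel_center_freqs(300, 20.0, 11025.0)
print(len(default_centers), len(pretrained_centers))
```

Comparing the two center-frequency lists shows where the extra bins and the wider range actually land, which is most of what changes between the two configs.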
Also, when I tried training (once with the default parameters and once with the pre-trained model's parameters), both runs showed much lower performance than the pre-trained model (0.7403 valid F1) and the score reported in the paper. The only remaining difference I can see is the batch size, which is 12 in the paper but 2 in the defaults. Have you ever trained the model with batch size 2, or with the default parameters in this repo?

Again, thank you very much for sharing your code! 😁
As far as I know, those parameters should not matter much, though this view may be biased: a larger frequency range can accommodate more harmonics for seldom-played high notes.
a. The curve looks like under-fitting, so the number of training steps should be increased accordingly. The code uses an aggressive learning-rate scheduler (OneCycle), which requires the total number of iterations to be known beforehand. In practice, if we don't know how many steps are enough, it is probably better to switch to a constant learning rate (e.g. the famous default of 1e-4) and train longer.
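The contrast above can be sketched in a few lines. This is a simplified cosine one-cycle schedule with the same overall shape as PyTorch's `OneCycleLR` (not the repo's exact scheduler); the point is that it needs `total_steps` up front, whereas a constant schedule does not:

```python
import math

def one_cycle_lr(step, total_steps, max_lr, pct_start=0.3, div_factor=25.0):
    """Simplified cosine one-cycle schedule: ramp from max_lr/div_factor
    up to max_lr, then anneal back toward zero. Note that total_steps
    must be known before training starts."""
    initial_lr = max_lr / div_factor
    warmup = int(total_steps * pct_start)
    if step < warmup:
        t = step / max(warmup, 1)
        # cosine ramp from initial_lr up to max_lr
        return initial_lr + (max_lr - initial_lr) * (1 - math.cos(math.pi * t)) / 2
    t = (step - warmup) / max(total_steps - warmup, 1)
    # cosine anneal from max_lr down toward ~0
    return max_lr * (1 + math.cos(math.pi * t)) / 2

def constant_lr(step, lr=1e-4):
    """Constant schedule: no need to know the run length in advance."""
    return lr
```

If you guess `total_steps` too small, the one-cycle schedule drives the learning rate to near zero while the model is still under-fit, and simply training longer past that point does nothing; the constant schedule avoids that failure mode at the cost of a less aggressive peak rate.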
b. The architecture uses batch norm, which is also notorious for misbehaving with small-batch training. That's why I have never even tried a batch size that small.
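A quick stdlib-only sketch of why small batches hurt batch norm (illustrative only, not the repo's model): batch norm normalizes with per-batch mean and variance, and with only 2 samples those statistics are very noisy estimates of the true activation statistics:

```python
import random
import statistics

random.seed(0)

def batch_mean_noise(batch_size, n_trials=2000):
    """How noisy is the per-batch mean that batch norm would use,
    for standard-normal activations whose true mean is 0?"""
    means = [
        statistics.fmean(random.gauss(0.0, 1.0) for _ in range(batch_size))
        for _ in range(n_trials)
    ]
    # Spread of the per-batch mean across many batches.
    return statistics.pstdev(means)

print(f"batch=2:  {batch_mean_noise(2):.3f}")   # ~ 1/sqrt(2)
print(f"batch=12: {batch_mean_noise(12):.3f}")  # ~ 1/sqrt(12)
```

The per-batch mean fluctuates roughly like 1/sqrt(batch_size), so going from batch 12 to batch 2 more than doubles the normalization noise injected into every batch-norm layer at every step.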
BTW, the spikes in train_f1 and the other metrics suggest there may be an issue with your dataset. Have you checked the sampling rate of the data?
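A quick stdlib way to run that check over a directory of WAV files (the directory path and the 16 kHz expectation below are placeholders; use whatever rate the repo's preprocessing assumes):

```python
import wave
from pathlib import Path

def check_sample_rates(data_dir, expected_sr=16000):
    """Return (path, sample_rate) for every .wav file under data_dir
    whose sample rate differs from expected_sr."""
    mismatched = []
    for path in Path(data_dir).rglob("*.wav"):
        with wave.open(str(path), "rb") as f:
            sr = f.getframerate()
        if sr != expected_sr:
            mismatched.append((str(path), sr))
    return mismatched

# Hypothetical usage -- "data/train" is a placeholder path:
# print(check_sample_rates("data/train"))
```

A handful of files at the wrong rate would show up as time-stretched audio whose labels no longer line up, which is exactly the kind of problem that produces isolated metric spikes.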