
Question about SSTtransformer #7

Open · return-sleep opened this issue Aug 20, 2023 · 1 comment

Comments

@return-sleep

Thank you for the released code. I wonder which predictive model achieves the best performance?

When I run SSTtransformer, I find that there is a big difference between the predictions of model(train=False) and model(train=True). May I ask how you solved this problem? Also, how did you adjust the ratio for scheduled sampling?

@jerrywn121
Owner

Thanks for the question. SAConvLSTM achieves the best performance, while STTransformer is the most efficient in terms of training speed. model(train=True) means the decoder is conditioned on the ground truth, which is possible in training mode but not at inference time, so a large gap between train=True and train=False is expected; it is exactly why we need scheduled sampling. To adjust the scheduled sampling decay rate, look at the training loss curve and choose a decay rate such that the sampling ratio (ssr) is not too low during the epochs when the training loss is decreasing fast. Use that as a starting point and tune the decay rate against the offline test set.
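For concreteness, here is a minimal sketch of how scheduled sampling is often wired up (the function names, the exponential decay form, the single-step `decoder(inp)` call, and the `(batch, time, ...)` tensor layout are illustrative assumptions, not this repo's actual API):

```python
import random

import torch
import torch.nn as nn


def scheduled_sampling_ratio(epoch: int, decay_rate: float) -> float:
    """Exponentially decay the probability of feeding ground truth.

    A ratio near 1.0 behaves like train=True (decoder conditioned on
    ground truth); near 0.0 it behaves like train=False (decoder
    conditioned on its own predictions, as at inference time).
    """
    return decay_rate ** epoch


def decode_with_scheduled_sampling(decoder: nn.Module,
                                   targets: torch.Tensor,
                                   first_input: torch.Tensor,
                                   ssr: float) -> torch.Tensor:
    """Roll the decoder out one step at a time, choosing per step between
    the ground-truth frame and the model's own last prediction."""
    steps = targets.size(1)  # assumes a (batch, time, ...) layout
    inp = first_input
    outputs = []
    for t in range(steps):
        pred = decoder(inp)  # hypothetical single-step decoder call
        outputs.append(pred)
        # With probability ssr, condition the next step on ground truth;
        # otherwise condition on the (detached) prediction.
        inp = targets[:, t] if random.random() < ssr else pred.detach()
    return torch.stack(outputs, dim=1)
```

With this form, picking the decay rate amounts to the advice above: plot the training loss and check that `scheduled_sampling_ratio(epoch, decay_rate)` has not already collapsed toward zero during the epochs where the loss is still falling fast.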
