
Question about SSTtransformer #7

Open · return-sleep opened this issue Aug 20, 2023 · 1 comment

Comments

@return-sleep

Thank you for the released code. I wonder which predictive model achieves the best performance?

When I run SSTtransformer, I find that there is a big difference between the predictions of model(train=False) and model(train=True). May I ask how you solved this problem? Also, how did you adjust the ratio for scheduled sampling?

@jerrywn121
Owner

Thanks for the question. SAConvLSTM achieves the best performance, while STTransformer is the most efficient in terms of training speed. model(train=True) means the decoder is conditioned on the ground truth, which is possible in training mode but not at inference time, so a large gap between train=True and train=False is expected; it is exactly why we need scheduled sampling. To adjust the scheduled sampling decay rate, look at the training loss curve and choose a decay rate such that the sampling ratio (ssr) is not too low during the epochs when the training loss is decreasing fast. Use that as a starting point and tune the decay rate against the offline test set.
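For concreteness, here is a minimal sketch of how scheduled sampling is often wired up (the function names, the exponential decay form, the single-step `decoder(inp)` call, and the `(batch, time, ...)` tensor layout are illustrative assumptions, not this repo's actual API):

```python
import random

import torch
import torch.nn as nn


def scheduled_sampling_ratio(epoch: int, decay_rate: float) -> float:
    """Exponentially decay the probability of feeding ground truth.

    A ratio near 1.0 behaves like train=True (decoder conditioned on
    ground truth); near 0.0 it behaves like train=False (decoder
    conditioned on its own predictions, as at inference time).
    """
    return decay_rate ** epoch


def decode_with_scheduled_sampling(decoder: nn.Module,
                                   targets: torch.Tensor,
                                   first_input: torch.Tensor,
                                   ssr: float) -> torch.Tensor:
    """Roll the decoder out one step at a time, choosing per step between
    the ground-truth frame and the model's own last prediction."""
    steps = targets.size(1)  # assumes a (batch, time, ...) layout
    inp = first_input
    outputs = []
    for t in range(steps):
        pred = decoder(inp)  # hypothetical single-step decoder call
        outputs.append(pred)
        # With probability ssr, condition the next step on ground truth;
        # otherwise condition on the (detached) prediction.
        inp = targets[:, t] if random.random() < ssr else pred.detach()
    return torch.stack(outputs, dim=1)
```

With this form, picking the decay rate amounts to the advice above: plot the training loss and check that `scheduled_sampling_ratio(epoch, decay_rate)` has not already collapsed toward zero during the epochs where the loss is still falling fast.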
