
There is a problem training a Conformer + RNN-T model #38

Open
scufan1990 opened this issue Dec 15, 2021 · 6 comments

Comments

@scufan1990

Hi,
There is a problem training a conformer+RNN-T model.
What CER and WER should I expect with one GPU?

I'm training the model on one RTX TITAN GPU. The conformer has 16 encoder layers, encoder dim 144, 1 decoder layer, and decoder dim 320. After 50 epochs of training the CER is about 27 and doesn't decrease anymore.

@wszyy

wszyy commented Jan 18, 2022

Hello, I've met the same problem as you, but I use the Conformer encoder with a Transformer decoder. By the way, have you solved the problem with the output of DecoderRNNT? It has 4 dimensions; how can it be used to recognize speech?
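Not this repo's exact code, but for orientation: an RNN-T joint network typically outputs logits of shape (batch, T, U, vocab), and recognition searches that lattice rather than applying a softmax directly. A minimal greedy-decoding sketch in plain Python with toy numbers (the `BLANK` index, the helper name, and the shapes are all assumptions, not this repository's API):

```python
# Minimal sketch of RNN-T greedy decoding, assuming the joint network
# produced logits of shape (T, U, V) for one utterance: T encoder frames,
# U label positions, V vocabulary entries with index 0 = blank.
# Illustrative only; not the exact API of any specific implementation.

BLANK = 0

def greedy_rnnt_decode(logits, max_symbols_per_frame=10):
    """Walk the (T, U, V) lattice: emit the argmax symbol at each step,
    advancing to the next frame on blank, otherwise appending the symbol."""
    hyp = []   # emitted label sequence
    u = 0      # current label position in the lattice
    for t in range(len(logits)):
        emitted = 0
        while u < len(logits[t]) - 1 and emitted < max_symbols_per_frame:
            scores = logits[t][u]
            best = max(range(len(scores)), key=lambda v: scores[v])
            if best == BLANK:
                break          # blank: move on to the next frame
            hyp.append(best)   # non-blank: emit and stay on this frame
            u += 1
            emitted += 1
    return hyp

# Toy example: T=2 frames, U=3 label slots, V=3 symbols (0 = blank).
toy = [
    [[0.1, 0.9, 0.0], [0.8, 0.1, 0.1], [0.9, 0.0, 0.1]],   # frame 0
    [[0.2, 0.1, 0.7], [0.9, 0.05, 0.05], [0.9, 0.0, 0.1]], # frame 1
]
print(greedy_rnnt_decode(toy))  # emits symbol 1, then blanks out
```

So the 4-D tensor is not fed to recognition directly; a per-utterance decoder like the sketch above (or beam search) turns it into a label sequence.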

@jingzhang0909

jingzhang0909 commented Jan 19, 2022

Could you tell me what dataset you used in your training? How long does it take to train a checkpoint? I see the paper uses LibriSpeech with 970 hours; it seems training will take a lot of time.

@wszyy

wszyy commented Jan 19, 2022

Um, I use AISHELL-1 and have trained for more than 10 hours, but the results are not very good. Actually, I use Google Colab to train the model; it really takes a lot of time.
By the way, do you understand the 4-dimensional results? The author just uses torch.cat to join the encoder_output and decoder_output matrices, and it seems the network cannot be used to recognize speech as-is.
So, I built two networks:
1. Conformer encoder with a Transformer decoder
2. Conformer encoder with an LSTM decoder with an attention mechanism
I have been training the two networks for several days now.
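For what it's worth, the 4-dimensional output usually comes from expanding the encoder output (B, T, D) and the prediction-network output (B, U, D) to a common (B, T, U, ·) lattice; whether the two are then added or concatenated along the feature dimension (the torch.cat variant mentioned above) differs between implementations. A shape-only sketch in plain Python, with made-up toy dimensions:

```python
# Sketch of how an RNN-T joint network combines encoder and decoder
# outputs into a 4-D lattice. Shapes only, using plain nested lists;
# real code would use tensor broadcasting. All dimensions are toy values:
# B=1 utterance, T=4 frames, U=3 label slots, D=2 features.

B, T, U, D = 1, 4, 3, 2
enc = [[[0.5] * D for _ in range(T)] for _ in range(B)]   # (B, T, D)
dec = [[[0.25] * D for _ in range(U)] for _ in range(B)]  # (B, U, D)

# Broadcast both to (B, T, U, D) and combine. Adding keeps feature
# size D; concatenating on the last dim would give 2*D instead.
joint = [[[[enc[b][t][d] + dec[b][u][d] for d in range(D)]
           for u in range(U)]
          for t in range(T)]
         for b in range(B)]

print(len(joint), len(joint[0]), len(joint[0][0]), len(joint[0][0][0]))
# A linear layer + softmax over the last dimension would then give a
# per-(t, u) distribution over the vocabulary: shape (B, T, U, V).
```

Either way the network can still recognize speech; the (B, T, U, V) lattice just has to go through an RNN-T decoder (greedy or beam search) instead of a plain argmax over time.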

@jingzhang0909

Thanks for your reply! I haven't decided which model and dataset to use yet. I'd like to share with you if there is any further info.

@wszyy

wszyy commented Jan 19, 2022

That will be OK. I also need to communicate with others to learn more about the network. Are you from China? Maybe we can exchange contact details.

@wanglongR

Hello wszyy, I'm from China. I have been learning about the Conformer model recently and would like to discuss it with you. If you are willing, you can add me on WeChat, ID: scrushy518


4 participants