The warmup model for the doc dataset #18

Open
staoxiao opened this issue Aug 26, 2021 · 9 comments

@staoxiao

Hi Jingtao,

Could you share the warmup model for the doc ranking data?

@jingtaozhan
Owner

You can use the provided STAR checkpoint trained on the passage dataset as a starting point.
Use the BM25 negatives as static hard negatives and train for several epochs; you should get an MRR@100 of about 0.36.
Then retrieve static hard negatives and train for several more epochs; you should get about 0.39.
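
For readers unfamiliar with the term, here is a minimal sketch of what picking "static hard negatives" from a ranked list could look like. It is not the code from this repository: the function name `sample_static_negatives`, its parameters, and the dictionary-based input format are all assumptions made for illustration. The ranked list could come from BM25 in the first round or from a trained retriever in the second.

```python
import random
from typing import Dict, List, Set


def sample_static_negatives(
    ranked: Dict[str, List[str]],
    positives: Dict[str, Set[str]],
    num_neg: int = 1,
    top_k: int = 100,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Pick a fixed set of hard negatives per query (hypothetical helper).

    ranked:    query id -> doc ids ordered by retrieval score (best first)
    positives: query id -> doc ids labeled relevant for that query
    """
    rng = random.Random(seed)  # fixed seed so the negatives stay static
    negatives: Dict[str, List[str]] = {}
    for qid, doc_ids in ranked.items():
        relevant = positives.get(qid, set())
        # Candidates are the top-k retrieved docs that are not labeled relevant.
        pool = [d for d in doc_ids[:top_k] if d not in relevant]
        if not pool:
            continue  # no usable negatives for this query
        negatives[qid] = rng.sample(pool, min(num_neg, len(pool)))
    return negatives


if __name__ == "__main__":
    ranked = {"q1": ["d1", "d2", "d3"], "q2": ["d4"]}
    positives = {"q1": {"d2"}, "q2": {"d4"}}
    print(sample_static_negatives(ranked, positives, num_neg=2))
```

The negatives are sampled once and reused for every epoch, which is what distinguishes them from negatives retrieved on the fly during training.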

@ikuyamada

Hi @jingtaozhan,

Thank you for sharing the code!
I would like to know the exact procedure used in the experiments with the MS MARCO Doc dataset. The paper says the following:

STAR uses the BM25 Neg model as the warm-up model, which is the same as ANCE and hence their results are directly comparable

How can I obtain the BM25 Neg model used in the experiments using the MS MARCO Doc dataset?

Thank you in advance!

@jingtaozhan
Owner

Thank you for your great question @ikuyamada
In fact, this also puzzled me when I was running experiments with ANCE. I consulted ANCE's authors, and they told me that the warmup model for the doc dataset is the ANCE model fine-tuned on the passage ranking dataset.
Therefore, I did the same when I trained my models on the doc dataset. I used the STAR model trained on the passage dataset as initialization. First, I trained with BM25 top negatives and got about 0.36 MRR@100, if I recall correctly. Then I used our proposed STAR and ADORE to train the models; the detailed hyperparameters are listed in the paper.
Hope this can answer your question.

@ikuyamada

ikuyamada commented Sep 7, 2021

Thank you very much for your prompt answer!

I found the train data on the left of the STAR row in the Doc Retrieval table. Is it the data used to generate negatives in the STAR experiment?
If not, do you provide the training data or the checkpoint file used to generate negatives for the STAR and ADORE models?
I think it would be very helpful for further research based on MS MARCO Doc dataset.

@jingtaozhan
Owner

The train data is the retrieval results of STAR on training queries. It is not the one I used to generate negatives.
ADORE retrieves negatives during training and does not sample negatives in advance.
It is a good idea to also release the BM25 Neg model on doc dataset. It's been a long time, so I need to find it and run some tests. I expect to add it by the end of this week and will let you know when it's ready, @ikuyamada @staoxiao.

@ikuyamada

ikuyamada commented Sep 7, 2021

It is a good idea to also release the BM25 Neg model on doc dataset. It's been a long time, so I need to find it and run some tests.

Thanks so much! Looking forward to the release of the model!

@jingtaozhan
Owner

@ikuyamada
Hi, I have been quite busy these days and am very sorry for the delay. I found the model but didn't have time to evaluate it. I have uploaded it, and you can download it from this link: https://www.dropbox.com/s/s7naszn5figf706/pytorch_model.bin?dl=0 . The evaluation procedure should be exactly the same as for STAR. Could you please tell me the MRR@100 score once you evaluate it, so that I can check whether the checkpoint is the correct one? It should be about 0.36.
Thank you!
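
For reference when comparing against the 0.36 figure, MRR@100 can be computed from a ranking and relevance labels roughly as sketched below. This is an illustrative sketch, not the repository's evaluation script: the name `mrr_at_k` and the dictionary inputs are assumptions, and it averages over every query that has relevance labels, so a query with no relevant document in the top k contributes zero.

```python
from typing import Dict, List, Set


def mrr_at_k(
    ranked: Dict[str, List[str]],
    qrels: Dict[str, Set[str]],
    k: int = 100,
) -> float:
    """Mean reciprocal rank of the first relevant doc within the top k.

    ranked: query id -> doc ids ordered by score (best first)
    qrels:  query id -> doc ids labeled relevant
    """
    if not qrels:
        return 0.0
    total = 0.0
    for qid, relevant in qrels.items():
        # Ranks are 1-based; only the first relevant hit counts.
        for rank, doc_id in enumerate(ranked.get(qid, [])[:k], start=1):
            if doc_id in relevant:
                total += 1.0 / rank
                break
    return total / len(qrels)


if __name__ == "__main__":
    ranked = {"q1": ["a", "b"], "q2": ["c", "d"]}
    qrels = {"q1": {"b"}, "q2": {"x"}}
    print(mrr_at_k(ranked, qrels, k=100))  # (1/2 + 0) / 2
```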

@ikuyamada

@jingtaozhan Thank you for making the model available! I will evaluate the model and get back to you soon!

@isuco

isuco commented Apr 8, 2022

@ikuyamada Hi, I have been quite busy these days and am very sorry for the delay. I found the model but didn't have time to evaluate it. I have uploaded it, and you can download it from this link: https://www.dropbox.com/s/s7naszn5figf706/pytorch_model.bin?dl=0 . The evaluation procedure should be exactly the same as for STAR. Could you please tell me the MRR@100 score once you evaluate it, so that I can check whether the checkpoint is the correct one? It should be about 0.36. Thank you!

Hi @jingtaozhan, thanks for the code and the checkpoints. I hope to apply it in my future work, but I ran into trouble when replicating the BM25 Neg model on the doc dataset. I fine-tuned the provided passage STAR checkpoint with the official BM25 top-100 dataset, but I only got 0.33 MRR@100 after 50K training steps. Could you share the doc BM25 negatives and the script to replicate the BM25 Neg model? Many thanks!
