Confused about iterations #25

Open
junkangwu opened this issue Mar 13, 2024 · 4 comments
Comments

@junkangwu

Hi there, great job on the project!

I'm looking to clarify whether the UCLA-AGI/zephyr-7b-sft-full-SPIN-iter1 model was fine-tuned on top of UCLA-AGI/zephyr-7b-sft-full-SPIN-iter0 or alignment-handbook/zephyr-7b-sft-full. The paper suggests that training progresses from $\theta_t$ to $\theta_{t+1}$. However, the description provided at https://huggingface.co/UCLA-AGI/zephyr-7b-sft-full-SPIN-iter1 seems to indicate otherwise.

I would appreciate any clarification on this matter.

Thank you!

@angelahzyuan
Collaborator

Hi, thank you!

We have released the full training pipeline in our repo; see the "Reproducing our results" section. Iter1 is trained with iter0 as the base model. The description at https://huggingface.co/UCLA-AGI/zephyr-7b-sft-full-SPIN-iter1 is meant to indicate that it is fine-tuned by starting from zephyr-7b-sft-full and running 2 iterations.
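
To make the chaining concrete, here is a minimal illustrative sketch (the dictionary and loop are not part of the repo; only the model names come from this thread):

```python
# Illustrative sketch of how the SPIN iterations chain base models.
# Only the model names come from this thread; the structure is hypothetical.

base_models = {
    "iter0": "alignment-handbook/zephyr-7b-sft-full",   # original SFT model
    "iter1": "UCLA-AGI/zephyr-7b-sft-full-SPIN-iter0",  # iter0's output is the base for iter1
}

for iteration, base in base_models.items():
    print(f"{iteration}: fine-tuned starting from {base}")
```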

Feel free to let us know if there are any other questions.

@junkangwu
Author

@angelahzyuan
The training-epoch configuration appears inconsistent. In configs/config_iter1.yaml, the num_train_epochs parameter is set to 3, yet in this conversation you mentioned 2, and in scripts/finetune_iter1.sh it is explicitly set to 6. This discrepancy warrants clarification.

To keep the experimental protocol precise, the epoch setting should be consistent across the configs, scripts, and documentation. Could you specify which value is definitive? A consistent setup is essential for reproducing the experiments and for any follow-up analysis.

@angelahzyuan
Collaborator

angelahzyuan commented Apr 28, 2024

Hi @junkangwu, thanks for your follow-up question. In all iterations, the num_train_epochs parameter is set to 6; this is enforced explicitly in the training scripts. For instance, in scripts/finetune_iter2.sh you'll find num_train_epochs set to 6 as well. To clarify our previous conversation: when I mentioned that iter1 is obtained "by starting from zephyr-7b-sft-full and running 2 iterations," I was referring to iterations, not epochs. Finally, note that num_train_epochs mainly influences the learning rate schedule; the checkpoint that proceeds to the next iteration is determined by model selection, and in our experience 2 epochs is typically a safe choice for this. We'll update the configuration file to make this clearer.
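
To illustrate the distinction, here is a minimal sketch (the checkpoint paths and variable names are hypothetical; only num_train_epochs = 6 and the epoch-2 selection come from this thread):

```python
# Sketch of the epochs-vs-model-selection distinction described above.
# num_train_epochs fixes the length of the learning-rate schedule, while a
# separate model-selection step decides which epoch's checkpoint is carried
# into the next SPIN iteration. Checkpoint paths below are hypothetical.

num_train_epochs = 6   # as set in the finetune scripts; shapes the LR schedule
selected_epoch = 2     # checkpoint chosen by model selection ("a safe choice")

checkpoints = {
    epoch: f"outputs/iter1/checkpoint-epoch-{epoch}"
    for epoch in range(1, num_train_epochs + 1)
}

# Only the selected checkpoint proceeds to the next iteration.
next_iteration_base = checkpoints[selected_epoch]
print(f"Base model for the next iteration: {next_iteration_base}")
```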

@junkangwu
Author

@angelahzyuan Thank you for your detailed response. So, to ensure I understand correctly, num_train_epochs is set to 6, but the checkpoint selected to proceed to the next iteration is at epoch 2. Is my understanding accurate? If so, why not set num_train_epochs directly to 2 instead of 6? I have two interpretations, and I would appreciate it if you could clarify where my understanding may be incorrect:

  • num_train_epochs is set to 6, but the model used for generating negative samples for the next iteration is the one from the checkpoint at epoch=2.
  • num_train_epochs is set to 6, but the model used for initializing the reference for the next iteration is the one from the checkpoint at epoch=2.

Could you please clarify which interpretation, if any, is correct, or point out where I might be misunderstanding?
