Thank you for your work. We are re-running the experiments with the updated SFT checkpoint from https://huggingface.co/alignment-handbook/zephyr-7b-sft-full and evaluating with lm-evaluation-harness v0.4.0. We see a significant performance drop on GSM8K. We trained the model for 6 epochs in each iteration. Have you observed this issue, or do you have any insight into potential causes?
It could be related to the version of lm-evaluation-harness. For more details, see #12 (comment).
Additionally, after switching to the updated SFT checkpoint from https://huggingface.co/alignment-handbook/zephyr-7b-sft-full, the relative improvement between iteration 0 and iteration 1 appears to be marginal. Are there any newly recommended hyperparameter settings?
I use lm-evaluation-harness v0.4.0 for evaluation, the same version the authors used. The results shown above were obtained with `num_train_epochs=6`.
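Since the harness version keeps coming up in this thread, here is a minimal sketch for confirming which version is installed and for writing down the exact GSM8K command, so results can be compared like for like. The thread does not state several of these details, so they are assumptions: the distribution name `lm_eval`, the 5-shot setting, the batch size, and the helper names `installed_harness_version` and `build_gsm8k_command` (both hypothetical). The script only builds the command; it does not run it.

```python
import shlex
from importlib import metadata

# Harness version named in this thread.
EXPECTED_VERSION = "0.4.0"


def installed_harness_version(dist_name="lm_eval"):
    """Return the installed harness version, or None if it is not installed.

    The distribution name "lm_eval" is an assumption; adjust it if the
    package was installed under a different name.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def build_gsm8k_command(model_path, num_fewshot=5, batch_size=8):
    """Build (but do not run) an lm_eval CLI invocation for GSM8K.

    num_fewshot=5 and batch_size=8 are assumed defaults, not values
    confirmed anywhere in this thread.
    """
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", f"pretrained={model_path}",
        "--tasks", "gsm8k",
        "--num_fewshot", str(num_fewshot),
        "--batch_size", str(batch_size),
    ]


if __name__ == "__main__":
    version = installed_harness_version()
    if version != EXPECTED_VERSION:
        print(f"Warning: found lm_eval {version}, expected {EXPECTED_VERSION}")
    # Print the command so it can be pasted into a bug report or shell.
    print(shlex.join(build_gsm8k_command("alignment-handbook/zephyr-7b-sft-full")))
```

Posting the printed command together with the detected version makes it easier to tell whether a score gap comes from the checkpoint or from the evaluation setup.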
![image](https://private-user-images.githubusercontent.com/34408423/312257333-eb0c2031-a887-4378-af39-bf71f1f96fb5.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjEyNjkzODAsIm5iZiI6MTcyMTI2OTA4MCwicGF0aCI6Ii8zNDQwODQyMy8zMTIyNTczMzMtZWIwYzIwMzEtYTg4Ny00Mzc4LWFmMzktYmY3MWYxZjk2ZmI1LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA3MTglMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNzE4VDAyMTgwMFomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTZlZGMwZTQ1OGRiNTA4MWMyMjc2MDUyYTk1NDBlMTY5Y2I3OWEyYzU2NmU4NWY1OTg1NTg2Y2ZlOTg1Y2NjMzcmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.-C0-4bVa9r81dtjjybgx49CO2IIY3Ay_CZ7Forekz88)