[Feature] Support DPO, ORPO and Reward Model #743

Merged 46 commits on Jun 13, 2024

Commits on Jun 11, 2024

  1. Support reward model and dpo

    RangiLyu committed Jun 11, 2024
    9f2e35e
  2. support train reward model

    RangiLyu committed Jun 11, 2024
    988fcaa
  3. fix config

    RangiLyu committed Jun 11, 2024
    28663c1
  4. fix lint

    RangiLyu committed Jun 11, 2024
    96d6b00
  5. fix lint

    RangiLyu committed Jun 11, 2024
    0015b42
  6. support jsonl dataset

    RangiLyu committed Jun 11, 2024
    f8353d8
  7. feat: support ORPO

    RangiLyu committed Jun 11, 2024
    f03dbc5
  8. reorg configs

    RangiLyu committed Jun 11, 2024
    e5c52a6
  9. rename collate function

    RangiLyu committed Jun 11, 2024
    805bd5a
  10. rename collate function

    RangiLyu committed Jun 11, 2024
    830cab5
  11. 1212f19
  12. fix lint

    RangiLyu committed Jun 11, 2024
    adee459
  13. fix lint

    RangiLyu committed Jun 11, 2024
    c042c55
  14. rebase main

    RangiLyu committed Jun 11, 2024
    08483c7
  15. update

    RangiLyu committed Jun 11, 2024
    6d3f1ec
  16. b2589e8
  17. inherit sft

    RangiLyu committed Jun 11, 2024
    00a8d82
  18. fix broadcast

    RangiLyu committed Jun 11, 2024
    6c43a43
  19. fix nan loss skip

    RangiLyu committed Jun 11, 2024
    4d0c96d
  20. support reward model sp

    HIT-cwh authored and RangiLyu committed Jun 11, 2024
    bfead3b
  21. support dpo sp

    HIT-cwh authored and RangiLyu committed Jun 11, 2024
    5aafdb9
  22. support orpo sp

    HIT-cwh authored and RangiLyu committed Jun 11, 2024
    c571a70
  23. fix bugs

    HIT-cwh authored and RangiLyu committed Jun 11, 2024
    4a79d2b
  24. fix rebase

    RangiLyu committed Jun 11, 2024
    3d26ad2
  25. convert script

    RangiLyu committed Jun 11, 2024
    9afed8c
  26. fix precommit

    RangiLyu committed Jun 11, 2024
    aba3646
  27. mv convert script to model

    RangiLyu committed Jun 11, 2024
    776037d
  28. fix version check

    RangiLyu committed Jun 11, 2024
    06004fd
  29. fix import

    RangiLyu committed Jun 11, 2024
    e990be3
  30. add comments of reward token

    RangiLyu committed Jun 11, 2024
    2952593
  31. fix orpo cfg

    RangiLyu committed Jun 11, 2024
    036e7f7
  32. fix lint

    RangiLyu committed Jun 11, 2024
    aeaa98c
  33. fix lint

    RangiLyu committed Jun 11, 2024
    114e5e3
  34. remove seed

    RangiLyu committed Jun 11, 2024
    a885ade
  35. remove seed

    RangiLyu committed Jun 11, 2024
    582e02f
  36. add sp config

    RangiLyu committed Jun 11, 2024
    a26dcd9
  37. add reward sp config

    RangiLyu committed Jun 11, 2024
    e9000b0

Commits on Jun 12, 2024

  1. fix convert

    RangiLyu committed Jun 12, 2024
    fb656d7
  2. 6ca9ad9
  3. fix qlora reward merge

    RangiLyu committed Jun 12, 2024
    1a28acd
  4. update dpo loss

    RangiLyu committed Jun 12, 2024
    7a978fc
  5. c74721c
  6. update logits mask

    RangiLyu committed Jun 12, 2024
    5c711ae
  7. unpack logits first

    RangiLyu committed Jun 12, 2024
    88ec30c
  8. ddf5fa4
  9. 9d60565