
[Draft][PyTorch] Add context parallel support for packed dataset in THD format #9540

Open
wants to merge 1 commit into base: main
Conversation


@tomlifu tomlifu commented Jun 25, 2024

What does this PR do?

This PR adds context parallel support for packed datasets in THD format to NeMo, in response to TE PR NVIDIA/TransformerEngine#641. Currently, that TE PR requires that each individual sequence length be divisible by 2 * context_parallel_size.
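To make the divisibility requirement concrete, here is a minimal sketch of padding each sequence in a THD-packed batch up to a multiple of 2 * cp_size. The helper name, the `pad_token_id` default, and the flat `(input_ids, seq_lens)` layout are assumptions for illustration, not code from this PR:

```python
import torch

def pad_thd_sequences(input_ids: torch.Tensor,
                      seq_lens: torch.Tensor,
                      cp_size: int,
                      pad_token_id: int = 0):
    """Pad each packed sequence so its length is a multiple of 2 * cp_size.

    input_ids: 1-D tensor holding all sequences back to back (THD layout).
    seq_lens:  1-D tensor of per-sequence lengths.
    """
    multiple = 2 * cp_size
    padded_chunks, padded_lens = [], []
    offset = 0
    for length in seq_lens.tolist():
        seq = input_ids[offset:offset + length]
        offset += length
        # Round the length up to the next multiple of 2 * cp_size.
        pad = (-length) % multiple
        if pad:
            seq = torch.cat([seq, seq.new_full((pad,), pad_token_id)])
        padded_chunks.append(seq)
        padded_lens.append(length + pad)
    return torch.cat(padded_chunks), torch.tensor(padded_lens)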

Changes

  • Add support for splitting a packed dataset across different CP ranks in a load-balanced way (see the sketch after this list)
  • Add the necessary padding during the packing stage so that each individual sequence length is a multiple of 2 * cp_size (as sketched above)
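
Below is a minimal sketch of the load-balanced split from the first bullet, following the scheme TE uses for causal attention with context parallelism: each (already padded) sequence is cut into 2 * cp_size equal chunks, and rank r keeps chunks r and 2 * cp_size - 1 - r, pairing a cheap early chunk with an expensive late chunk. The helper names are hypothetical:

```python
import torch

def split_sequence_for_cp(seq: torch.Tensor, cp_size: int, cp_rank: int) -> torch.Tensor:
    """Return this CP rank's shard of one padded sequence.

    The sequence is cut into 2 * cp_size equal chunks; rank r keeps chunks
    r and (2 * cp_size - 1 - r), which balances the causal-attention
    workload across ranks.
    """
    chunks = seq.chunk(2 * cp_size)
    return torch.cat([chunks[cp_rank], chunks[2 * cp_size - 1 - cp_rank]])

def split_packed_batch_for_cp(input_ids, seq_lens, cp_size, cp_rank):
    """Apply the per-sequence split to every sequence in a THD-packed batch."""
    shards, offset = [], 0
    for length in seq_lens.tolist():
        shards.append(split_sequence_for_cp(input_ids[offset:offset + length],
                                            cp_size, cp_rank))
        offset += length
    return torch.cat(shards)
```

With this pairing, every rank ends up holding 1/cp_size of each sequence, and the causal mask gives each rank a roughly equal number of key/value tokens to attend over.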

PR Type:

  • New Feature

@github-actions github-actions bot added the NLP label Jun 25, 2024
@tomlifu tomlifu changed the title [PyTorch] Add context parallel support for packed dataset in THD format [Draft][PyTorch] Add context parallel support for packed dataset in THD format Jun 26, 2024
@xrennvidia xrennvidia self-requested a review June 29, 2024 02:25