[Shardformer] change qwen2 modeling into gradient checkpointing style #7821
