Implement DualPipe training for MoE model #9

@skydoorkai

Is your feature request related to a problem? Please describe.
DualPipe is an efficient pipeline-parallel implementation for MoE models. This feature request is to implement MoE model training using DualPipe.

Describe the solution you'd like
Provide an example of MoE DualPipe training and put it under examples/moe_dualpipe/.
Useful modules or APIs can be placed in a suitable atorch path, such as atorch/modules/moe/.

Use All2All for MoE token communication first, then consider switching to deep_ep dispatch/combine as an optimization.
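To make the All2All step concrete, here is a minimal single-process sketch of the dispatch, expert compute, and combine round trip. It is not atorch code: the names `all_to_all`, `moe_all2all_forward`, and `experts_per_rank` are hypothetical, ranks are simulated with plain Python lists, routing is assumed to be top-1, and experts are assumed to be placed contiguously across ranks. In a real multi-GPU implementation the exchange would presumably go through a collective such as `torch.distributed.all_to_all_single`.

```python
def all_to_all(send_buffers):
    """Simulated collective (hypothetical helper).

    send_buffers[src][dst] is what rank `src` sends to rank `dst`;
    the result is indexed as recv[dst][src].
    """
    world = len(send_buffers)
    return [[send_buffers[src][dst] for src in range(world)] for dst in range(world)]


def moe_all2all_forward(tokens_per_rank, expert_ids_per_rank, experts, experts_per_rank):
    """Top-1 MoE forward over simulated ranks using two all-to-all exchanges."""
    world = len(tokens_per_rank)

    # 1. Dispatch: each rank buckets its tokens by the rank that owns the expert,
    #    remembering the original position so outputs can be restored in order.
    send, positions = [], []
    for rank in range(world):
        buckets = [[] for _ in range(world)]
        pos = [[] for _ in range(world)]
        for i, (tok, eid) in enumerate(zip(tokens_per_rank[rank], expert_ids_per_rank[rank])):
            dst = eid // experts_per_rank  # assumes contiguous expert placement
            buckets[dst].append((eid, tok))
            pos[dst].append(i)
        send.append(buckets)
        positions.append(pos)
    recv = all_to_all(send)

    # 2. Expert compute: each rank applies its local experts to the received tokens.
    processed = [
        [[experts[eid](tok) for eid, tok in chunk] for chunk in recv[rank]]
        for rank in range(world)
    ]

    # 3. Combine: send results back and scatter them to their original positions.
    returned = all_to_all(processed)
    outputs = []
    for rank in range(world):
        out = [None] * len(tokens_per_rank[rank])
        for dst in range(world):
            for i, val in zip(positions[rank][dst], returned[rank][dst]):
                out[i] = val
        outputs.append(out)
    return outputs


# Toy setup: 2 ranks, 4 experts (2 per rank); expert k maps x to 10 * x + k.
experts = [lambda x, k=k: 10 * x + k for k in range(4)]
outputs = moe_all2all_forward(
    tokens_per_rank=[[1, 2, 3], [4, 5]],
    expert_ids_per_rank=[[3, 0, 2], [1, 2]],
    experts=experts,
    experts_per_rank=2,
)
print(outputs)  # [[13, 20, 32], [41, 52]]
```

Keeping dispatch and combine as two separate exchanges around the expert computation mirrors the dispatch/combine split mentioned above, so a later swap to deep_ep would only need to replace the two `all_to_all` calls in this sketch.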
