
ScatterMoE to support Quantized PEFT #101

Open · fabianlim opened this issue on Nov 6, 2024 · 0 comments
Labels: help wanted (Extra attention is needed) · question (Further information is requested)

PR #99 adds the ScatterMoE drop-in module that enables expert parallelism for mixture-of-experts models, but the class currently supports only full fine-tuning and LoRA.

This issue is to extend it to also support quantized PEFT. Currently the module is incompatible with the accelerated-peft plugin on several levels:

  • Firstly, MoE models may store their experts as 3D tensors, and it is unclear whether bitsandbytes supports quantizing those (see the sketch after this list).
  • Secondly, since we perform a complete model swap without inspecting the base layers, any quantized modules are silently ignored during the swap (see the second sketch below).
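To make the first point concrete, here is a minimal sketch of quantizing a 3D expert tensor one 2D slice at a time. quantize_4bit and dequantize_4bit are existing bitsandbytes APIs, but the shapes, the per-expert loop, and the assumption that 3D inputs need slicing at all are illustrative, not code from the plugin:

```python
# A minimal sketch, assuming bitsandbytes cannot consume a 3D
# (num_experts, out_features, in_features) expert tensor directly and each
# 2D expert slice must be quantized on its own. Illustrative only.
import torch
from bitsandbytes.functional import quantize_4bit, dequantize_4bit

# Hypothetical expert weights: 8 experts, each a 4096 x 1024 linear layer.
# bitsandbytes 4-bit kernels require a CUDA tensor.
experts = torch.randn(8, 4096, 1024, dtype=torch.float16, device="cuda")

# Quantize each expert separately, keeping its quant_state so the
# original values can be reconstructed later.
quantized = [quantize_4bit(w, quant_type="nf4") for w in experts]

# Dequantize and re-stack to recover the original 3D layout.
restored = torch.stack(
    [dequantize_4bit(packed, state) for packed, state in quantized]
)
assert restored.shape == experts.shape
```

If bitsandbytes turns out to handle 3D tensors natively, the per-slice loop collapses to a single call; otherwise the swap would need to carry this per-expert quant_state bookkeeping.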

This requires some thinking through; the best outcome would be full compatibility with quantized_peft.
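For the second point, here is a hedged sketch of what a quantization-aware swap might check before copying base-layer weights into the ScatterMoE module. Linear4bit and dequantize_4bit are real bitsandbytes APIs; materialize_weight and its place in the swap are hypothetical:

```python
# A hedged sketch, not the plugin's actual swap logic: detect a
# bitsandbytes-quantized base layer and dequantize it instead of
# silently skipping (or blindly copying) its packed uint8 storage.
import torch
import bitsandbytes as bnb
from bitsandbytes.functional import dequantize_4bit

def materialize_weight(module: torch.nn.Module) -> torch.Tensor:
    """Return a dense weight for a base layer, dequantizing if needed."""
    if isinstance(module, bnb.nn.Linear4bit):
        # Params4bit carries the quant_state needed to reconstruct the
        # tensor once the model has been quantized on device.
        return dequantize_4bit(module.weight.data, module.weight.quant_state)
    return module.weight.data
```

The swap could then copy materialize_weight(base_layer) into the ScatterMoE expert tensors (and ideally re-quantize afterwards) rather than ignoring quantized modules outright.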

@fabianlim added the help wanted and question labels on Nov 6, 2024
@fabianlim changed the title from Allow ScatterMoE to support Quantized PEFT to ScatterMoE to support Quantized PEFT on Nov 8, 2024