PR #99 includes the `ScatterMoE` module drop-in to enable expert parallelism for mixture-of-experts models, but the class currently only supports full fine-tuning and LoRA.
This issue is to extend it to also support quantized PEFT. Currently it is incompatible with the `accelerated-peft` plugin on several levels:

- MoE models may keep their expert weights as 3D tensors, and it is not clear whether `bitsandbytes` supports quantizing those (see the first sketch below).
- Since we perform a complete model swap without inspecting the base layers, any quantized modules are simply ignored during the swap (see the second sketch below).

This requires thinking through. The best outcome would be compatibility with `quantized_peft`.
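To illustrate the first point, here is a minimal sketch (not the actual plugin code) of why the 3D expert layout is awkward for `bitsandbytes`, which is built around 2D `nn.Linear` weights, together with one possible per-expert workaround. It assumes a CUDA device, `bitsandbytes` installed, and made-up shapes:

```python
# Hypothetical sketch: ScatterMoE-style experts live in one 3D parameter,
# so bnb's Linear4bit drop-in does not apply directly.
import torch
import bitsandbytes.functional as F

num_experts, in_features, out_features = 8, 1024, 4096

# All experts in a single 3D parameter rather than a ModuleList of nn.Linear.
expert_weights = torch.randn(
    num_experts, out_features, in_features, device="cuda", dtype=torch.float16
)

# One possible workaround: quantize each 2D expert slice independently and
# keep the per-expert quant states around for dequantization at forward time.
quantized, states = [], []
for e in range(num_experts):
    q, state = F.quantize_4bit(expert_weights[e], quant_type="nf4")
    quantized.append(q)
    states.append(state)

# Dequantize one expert slice to check the round-trip error.
w0 = F.dequantize_4bit(quantized[0], states[0])
print((w0 - expert_weights[0]).abs().max())
```

Whether per-expert quantization like this is acceptable (extra quant states, extra dequant overhead in the expert kernel) is part of what needs to be thought through.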
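For the second point, a minimal sketch of a guard the swap logic could grow so it at least fails loudly instead of silently dropping quantized weights. The helper names (`contains_bnb_layers`, `swap_moe_block`) are illustrative, not part of the plugin:

```python
# Hypothetical guard: before swapping an MoE block for the ScatterMoE drop-in,
# check whether the block already contains bitsandbytes-quantized layers.
import torch.nn as nn
import bitsandbytes as bnb

def contains_bnb_layers(module: nn.Module) -> bool:
    """Return True if any submodule is a bitsandbytes quantized linear."""
    return any(
        isinstance(m, (bnb.nn.Linear4bit, bnb.nn.Linear8bitLt))
        for m in module.modules()
    )

def swap_moe_block(parent: nn.Module, name: str, scatter_moe_block: nn.Module):
    """Replace parent.<name> with the ScatterMoE drop-in, refusing to
    silently discard quantized weights (illustrative behaviour only)."""
    old = getattr(parent, name)
    if contains_bnb_layers(old):
        raise NotImplementedError(
            "MoE block holds bnb-quantized layers; the ScatterMoE swap "
            "does not yet carry quantized weights over (this issue)."
        )
    setattr(parent, name, scatter_moe_block)
```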