This GitHub repository collects recent papers and resources on model fusion / model merging.
If you have any suggestions for this repository, please feel free to open an issue or a pull request.
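A few minimal code sketches of representative merging techniques (uniform weight averaging, task arithmetic, and DARE-style drop-and-rescale) are provided after the paper list below.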
- Fisher-Merging: Merging Models with Fisher-Weighted Averaging [Paper]
- [ICLR23] RegMean: Dataless Knowledge Fusion by Merging Weights of Language Models [Paper] [Code]
- [ICML22] Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time [Paper] [Code]
- [ICLR23] Git Re-Basin: Merging Models modulo Permutation Symmetries [Paper]
- [ICLR24] ZipIt! Merging Models from Different Tasks without Training [Paper] [Code]
- REPAIR: REnormalizing Permuted Activations for Interpolation Repair [Paper] [Code]
- [ICLR23] Editing Models with Task Arithmetic [Paper] [Code]
- [NeurIPS23] Composing Parameter-Efficient Modules with Arithmetic Operations [Paper] [Code]
- [NeurIPS23] Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models [Paper] [Code]
- [ICLR24] Parameter Efficient Multi-task Model Fusion with Partial Linearization [Paper]
- Fine-Tuning Linear Layers Only Is a Simple yet Effective Way for Task Arithmetic
- [ICLR24] AdaMerging: Adaptive Model Merging for Multi-Task Learning [Paper] [Code]
- MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation [Paper]
- [ICML24] DARE: Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch [Paper] [Code]
- [ICML24] Representation Surgery for Multi-Task Model Merging [Paper] [Code]
- [ICML24] Localizing Task Information for Improved Model Merging and Compression [Paper] [Code]
- EMR-Merging: Tuning-Free High-Performance Model Merging [Paper] [Code]
- [ICML23] Exploring the Benefits of Training Expert Language Models over Instruction Tuning [Paper] [Code]
- [ICLR24] FOE: Fusing Models with Complementary Expertise [Paper] [Code]
- [ICLR24] FuseLLM: Knowledge Fusion of Large Language Models [Paper] [Code]
- [ICML24] Merging Multi-Task Models via Weight-Ensembling Mixture of Experts [Paper] [Code]
- [ICML24] Learning to Route Among Specialized Experts for Zero-Shot Generalization [Paper] [Code]
- Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM [Paper]
- Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization [Paper] [Code]
- [ICLR24] Controlled Text Generation via Language Model Arithmetic [Paper] [Code]
- [ICLR24] An Emulator for Fine-Tuning Large Language Models using Small Language Models [Paper]
- [NeurIPS23] ForkMerge: Mitigating Negative Transfer in Auxiliary-Task Learning [Paper]
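
Several entries above (e.g. Model soups and Fisher-weighted averaging) build on the idea of averaging the weights of independently fine-tuned checkpoints. Below is a minimal PyTorch sketch of uniform weight averaging (a "uniform soup"); the function names are illustrative and not taken from any paper's released code.

```python
import torch


def uniform_soup(state_dicts):
    """Element-wise average of parameters from several fine-tuned checkpoints."""
    keys = state_dicts[0].keys()
    return {
        k: torch.mean(torch.stack([sd[k].float() for sd in state_dicts]), dim=0)
        for k in keys
    }


if __name__ == "__main__":
    # Toy usage: average two "fine-tuned" linear layers into one soup model.
    m1, m2 = torch.nn.Linear(4, 2), torch.nn.Linear(4, 2)
    soup = torch.nn.Linear(4, 2)
    soup.load_state_dict(uniform_soup([m1.state_dict(), m2.state_dict()]))
```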
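The task-arithmetic line of work (Editing Models with Task Arithmetic and its follow-ups) edits a pre-trained model by adding or subtracting task vectors, i.e. the differences between fine-tuned and pre-trained weights. A minimal sketch, assuming plain state_dicts and an illustrative scaling coefficient:

```python
import torch


def task_vector(pretrained_sd, finetuned_sd):
    """Task vector = fine-tuned weights minus pre-trained weights."""
    return {k: finetuned_sd[k].float() - pretrained_sd[k].float() for k in pretrained_sd}


def apply_task_vectors(pretrained_sd, task_vectors, scaling=0.3):
    """Add (or, with negative scaling, subtract) scaled task vectors to the pre-trained weights."""
    merged = {k: v.clone().float() for k, v in pretrained_sd.items()}
    for tv in task_vectors:
        for k in merged:
            merged[k] = merged[k] + scaling * tv[k]
    return merged
```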
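DARE drops a large random fraction of each model's delta parameters (fine-tuned minus pre-trained) and rescales the surviving entries before merging. The sketch below follows that high-level recipe; the default drop rate and the simple additive merge are assumptions for illustration, not the official implementation.

```python
import torch


def dare_delta(delta, drop_rate=0.9):
    """Randomly zero out `drop_rate` of the delta entries and rescale survivors by 1/(1 - drop_rate)."""
    mask = torch.bernoulli(torch.full_like(delta, 1.0 - drop_rate))
    return delta * mask / (1.0 - drop_rate)


def dare_merge(pretrained_sd, finetuned_sds, drop_rate=0.9):
    """Merge several fine-tuned models into the pre-trained one using drop-and-rescale deltas."""
    merged = {k: v.clone().float() for k, v in pretrained_sd.items()}
    for ft_sd in finetuned_sds:
        for k in merged:
            merged[k] = merged[k] + dare_delta(ft_sd[k].float() - pretrained_sd[k].float(), drop_rate)
    return merged
```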