This is the first release of FMS Model Optimizer. It provides the following core functionality:
- Python API to enable model quantization: With the addition of a few lines of code, module-level and/or function-level operation replacement is performed.
- Robust: Verified for INT8 and INT4 quantization on important vision, speech, NLP, object detection, and LLM models.
- Flexible: Options to analyze the network using PyTorch Dynamo and to apply best practices during quantization, such as clip_val initialization, layer-level precision setting, and optimizer param group setting.
- State-of-the-art INT and FP quantization techniques for weights and activations, such as SmoothQuant, SAWB+ and PACT+.
- Supports key compute-intensive operations such as Conv2d, Linear, LSTM, MM, and BMM.
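
To give a feel for one of the techniques named above, here is a minimal, library-independent sketch of the SmoothQuant idea: per-channel scales move activation outliers into the weights without changing the layer output. It does not use the FMS Model Optimizer API. The function names (`smoothing_scales`, `matmul`), the toy tensors, and the choice `alpha=0.5` are illustrative assumptions, and the scale formula is the commonly published form `s_j = max|X_j|^alpha / max|W_j|^(1 - alpha)`.

```python
# Illustrative sketch only: plain Python lists, not the package's implementation.

def smoothing_scales(act_absmax, weight_absmax, alpha=0.5):
    """Per-input-channel scale s_j = a_j**alpha / w_j**(1 - alpha).

    Assumes every entry of act_absmax and weight_absmax is non-zero.
    """
    return [
        (a ** alpha) / (w ** (1.0 - alpha))
        for a, w in zip(act_absmax, weight_absmax)
    ]


def matmul(x, w):
    """Matrix product of x (n x k) and w (k x m), both nested lists."""
    return [
        [sum(row[j] * w[j][c] for j in range(len(w))) for c in range(len(w[0]))]
        for row in x
    ]


# Toy activations (2 tokens x 3 channels); channel 1 carries an outlier.
X = [[1.0, -16.0, 2.0], [0.5, 8.0, -1.0]]
# Toy weights (3 input channels x 2 output channels).
W = [[0.5, -1.0], [0.25, 0.125], [1.0, -0.5]]

# Per-channel absolute maxima of activations and weights.
act_absmax = [max(abs(row[j]) for row in X) for j in range(3)]
weight_absmax = [max(abs(v) for v in W[j]) for j in range(3)]
scales = smoothing_scales(act_absmax, weight_absmax, alpha=0.5)

# Divide activations by the scales and multiply the matching weight rows,
# so the product X_smooth @ W_smooth equals X @ W up to rounding.
X_smooth = [[v / s for v, s in zip(row, scales)] for row in X]
W_smooth = [[v * scales[j] for v in W[j]] for j in range(3)]

reference = matmul(X, W)
smoothed = matmul(X_smooth, W_smooth)
```

After smoothing, the largest activation magnitude is smaller than in `X`, which is what makes the activations easier to quantize to a low-bit integer grid; the cost is a wider weight range in the affected channel, which `alpha` trades off.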
What's Changed
- Initial setup by @tharapalanivel in #1
- Initial commit for optimization techniques by @tharapalanivel in #9
- Add dynamic build versioning by @hickeyma in #12
- [ci]: Restructure GitHub workflows by @hickeyma in #13
- Clear notebook output by @tharapalanivel in #15
- Improve README readability by @tharapalanivel in #19
- Change project name to correspond to pypi package name by @hickeyma in #18
- Set smoothq_alpha as buffer by @andrea-fasoli in #20
- Fix device for smoothquant activation scales by @andrea-fasoli in #21
- test: Add checks for unit tests that require Nvidia GPU by @hickeyma in #14
- tox: Add base Python version to tox environment by @hickeyma in #24
- Fix symmetric behavior (issue #22) by @andrea-fasoli in #26
- ci: Add Ruff for lint and code formatting by @hickeyma in #30
- Update pre-commit requirement from <4.0,>=3.0.4 to >=3.0.4,<5.0 by @dependabot in #16
- doc: Update dev env section of the contributing guide by @hickeyma in #29
New Contributors
- @tharapalanivel made their first contribution in #1
- @hickeyma made their first contribution in #12
- @andrea-fasoli made their first contribution in #20
- @dependabot made their first contribution in #16
Full Changelog: https://github.com/foundation-model-stack/fms-model-optimizer/commits/v0.2.0