Avoid random weights initialization when quantizing #291

Merged on Aug 24, 2024 (2 commits)

dacorvo
Collaborator

@dacorvo dacorvo commented Aug 23, 2024

What does this PR do?

As raised by @latentCall145, quantizing a module triggers a needless random weight initialization.
The solution suggested in #290 is correct, but it makes the low-level quantization API depend on accelerate, which is only an optional dependency used by the high-level model API.

This PR implements essentially the same approach, but uses the meta device explicitly. Note that the scales must be preserved explicitly because, unlike accelerate, PyTorch does not distinguish between parameters and buffers when skipping initialization.

The device parameter is added to qcreate, and scale buffers are created on the same device as the weights.
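
For readers unfamiliar with the meta-device pattern, here is a minimal sketch of the idea in plain PyTorch (2.0 or later). It is not the code from this PR: `ToyQLinear`, `input_scale` and `from_module` are hypothetical names used only for illustration. It shows a module being built on the meta device so that no random initialization touches real memory, then materialized with `to_empty()`, which leaves both parameters and buffers uninitialized, which is why the scale buffer has to be restored by hand.

```python
import torch
from torch import nn


class ToyQLinear(nn.Linear):
    """Hypothetical stand-in for a quantized linear layer (not the library's class)."""

    def __init__(self, in_features, out_features, bias=True, device=None):
        super().__init__(in_features, out_features, bias=bias, device=device)
        # A buffer whose value must survive materialization, like the scales in this PR.
        self.register_buffer("input_scale", torch.ones((), device=device))


def from_module(module: nn.Linear) -> ToyQLinear:
    device = module.weight.device
    # Build on the meta device: only shapes and dtypes exist, so the usual
    # random initialization allocates and fills no real storage.
    with torch.device("meta"):
        qmodule = ToyQLinear(
            module.in_features, module.out_features, bias=module.bias is not None
        )
    # to_empty() allocates uninitialized storage for parameters AND buffers,
    # so the buffer value set in __init__ is lost at this point.
    qmodule = qmodule.to_empty(device=device)
    # Restore the scale explicitly, on the same device as the weights.
    qmodule.input_scale = torch.ones_like(qmodule.input_scale)
    # Reuse the source tensors instead of freshly initialized ones.
    qmodule.weight = module.weight
    if module.bias is not None:
        qmodule.bias = module.bias
    return qmodule
```

In this sketch the weights are shared with the source module rather than copied, which is one possible design choice; the actual quantization step that would follow is left out.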
@dacorvo dacorvo requested review from sayakpaul and SunMarc August 23, 2024 15:55
@dacorvo dacorvo merged commit a1c310b into main Aug 24, 2024
16 checks passed
@dacorvo dacorvo deleted the avoid_random_weights_quantize branch August 24, 2024 12:36