
Quantized op implementation: quantized_decomposed::dequantize_per_channel_group #7676

Open
sheetalarkadam opened this issue Jan 15, 2025 · 4 comments
Assignees
Labels
module: kernels Issues related to kernel libraries, e.g. portable kernels and optimized kernels module: quantization rfc Request for comment and feedback on a post, proposal, etc.

Comments

@sheetalarkadam

🚀 The feature, motivation and pitch

Implementing the op quantized_decomposed::dequantize_per_channel_group would enable using ExecuTorch's quantized CPU ops in models like Llama 3.

I came across this issue while trying to build Llama 3 without XNNPACK. More details in #6975.
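For context, here is a hedged reference sketch (not the actual ExecuTorch kernel, whose signature and dtype handling may differ) of what per-channel-group dequantization computes: each row (output channel) is split into groups of `group_size` columns, and each group has its own scale and zero point, so `x_fp[c, i] = (x_q[c, i] - zero_point[c, i // group_size]) * scale[c, i // group_size]`.

```python
# Reference sketch of per-channel-group dequantization semantics.
# Assumption: this mirrors the decomposed-op behavior; the real kernel's
# signature, layouts, and dtypes may differ.
import numpy as np

def dequantize_per_channel_group(x_q, scales, zero_points, group_size):
    """x_q: (channels, columns) integer array.
    scales, zero_points: (channels, columns // group_size)."""
    # Broadcast each group's scale/zero_point across its group_size columns.
    s = np.repeat(scales, group_size, axis=1)
    zp = np.repeat(zero_points, group_size, axis=1)
    return (x_q.astype(np.float32) - zp) * s

# Example: 1 channel, 4 columns, 2 groups of size 2
x_q = np.array([[1, 3, 2, 4]], dtype=np.int8)
scales = np.array([[0.5, 0.25]])
zero_points = np.array([[0, 0]])
out = dequantize_per_channel_group(x_q, scales, zero_points, group_size=2)
print(out)  # group 0 scaled by 0.5, group 1 by 0.25
```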

Alternatives

No response

Additional context

No response

RFC (Optional)

No response

@mcr229 mcr229 added rfc Request for comment and feedback on a post, proposal, etc. module: kernels Issues related to kernel libraries, e.g. portable kernels and optimized kernels module: quantization labels Jan 15, 2025
@mcr229
Contributor

mcr229 commented Jan 15, 2025

@kimishpatel @manuelcandales is this something we are planning on supporting eventually?

@kimishpatel
Contributor

Can you say more about the use case? I agree in general that this is useful to add, but it is likely going to be slower without XNNPACK, so I'm just trying to get more context to figure out how to prioritize this.

@sheetalarkadam
Author

I can't think of production use cases as of now. Currently, we are trying to measure how much of a boost XNNPACK gives. If the performance boost is significant, it makes sense to migrate workflows that don't use XNNPACK over to it. But I understand if this will be a lower-priority task. My use case is mainly gathering performance stats.

@mcr229
Contributor

mcr229 commented Jan 16, 2025

@sheetalarkadam generally, performance without XNNPACK will be bad. Comparing no-XNNPACK to XNNPACK isn't really a fair comparison, because XNNPACK is the primary CPU accelerator.
