
int8*int8 -> float? #203

Open
XapaJIaMnu opened this issue Dec 19, 2020 · 3 comments

@XapaJIaMnu

Hey,

I'm looking to perform int8 * int8 -> fp32, where at the output stage I dequantise the int32_t result into float (and then potentially add a bias). I was following the example from https://github.com/google/gemmlowp/blob/master/doc/quantization_example.cc#L305, but it seems that in order to dequantise to float, you compute the quantisation parameters from the fp32 result that you had already computed beforehand, which in practice I wouldn't know. I can compute it with a compensation factor, but that becomes incredibly complicated and computationally (and memory) expensive. Are there any alternatives?

If I can assume quantisation into int8 (as opposed to uint8, as in the example), I can quantise without the zero_point parameter (assuming a zero-centred distribution), which would massively simplify dequantisation. Do you support this? Do you have any examples in the codebase where something like this is done?

@bjacob
Contributor

bjacob commented Dec 21, 2020

For such use cases, we typically have the matmul output raw int32 accumulators, then we do a pass outside of the matmul library converting those to float.

In gemmlowp, you get raw int32 accumulators simply by passing an empty output_pipeline, as in this part of the test:

gemmlowp/test/test.cc, lines 1211 to 1230 at fda83bd:

```cpp
// Test an empty pipeline, i.e. returning raw int32 accumulators.
auto empty_pipeline = std::make_tuple();
GemmContext context;
GemmWithOutputPipeline<std::uint8_t, std::int32_t, DefaultL8R8BitDepthParams>(
    &context, lhs.const_map(), rhs.const_map(), &result_raw_int32, lhs_offset,
    rhs_offset, empty_pipeline);
for (int r = 0; r < rows; r++) {
  for (int c = 0; c < cols; c++) {
    std::int32_t expected = 0;
    for (int d = 0; d < depth; d++) {
      std::int32_t lhs_val =
          static_cast<std::int32_t>(lhs(r, d)) + lhs_offset;
      std::int32_t rhs_val =
          static_cast<std::int32_t>(rhs(d, c)) + rhs_offset;
      expected += lhs_val * rhs_val;
    }
    Check(expected == result_raw_int32(r, c));
  }
}
```

May I suggest taking a look at the ruy library instead of gemmlowp? It's essentially gemmlowp's successor: it's what TFLite has used by default on ARM for the last 18 months, and it supports both float and quantized matmuls, any combination of int8 and uint8, with or without zero points, plus more quantization flavour variations. I've added an example of getting raw int32 accumulators:
https://github.com/google/ruy/blob/878283640de7946a43053e8ebf4f15114fbc9156/example/example.cc#L129-L152

@XapaJIaMnu
Author

@bjacob thank you, that will do nicely. I think I'll use ruy.

Looking at the test, as far as I can see only i8_i8_i32_i32 is supported, not i8_i8_i32_f32, so I'd have to do the float conversion outside of the multiply, correct?

@bjacob
Copy link
Contributor

bjacob commented Feb 2, 2021

Yes, exactly.
