Add blog for sqfp16 #2971
Signed-off-by: Naveen Tatikonda <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
Signed-off-by: Fanit Kolchina <[email protected]>
@natebower The blog is ready for your review. Thank you!
_posts/2024-06-19-optimizing-opensearch-with-fp16-quantization.md
Signed-off-by: Naveen Tatikonda <[email protected]>
@kolchfa-aws @naveentatikonda Please see my comments and changes and let me know if you have any questions. Thanks!
@pajuric This will be ready to publish once my comments/changes have been addressed.
`1.1 * (2 * 256 + 8 * 16) * 1,000,000 ~= 0.656 GB`

For more information about memory estimation for scalar quantization with IVF, refer to [this documentation](https://opensearch.org/docs/latest/search-plugins/knn/knn-vector-quantization/#memory-estimation-1).
We should probably define IVF here, given that it has a much more widely used definition.
I believe HNSW will be more widely used than IVF because of the drop in recall with IVF, so we intentionally added estimates for HNSW here.
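As a quick sanity check of the HNSW estimate quoted in the diff hunk above, here is a minimal Python sketch. It assumes that 256 is the vector dimension, 16 is the HNSW `m` parameter, 2 is the number of bytes per FP16 value, and that GB means 1024^3 bytes; the function name `hnsw_fp16_memory_gb` is hypothetical and not part of the k-NN plugin.

```python
def hnsw_fp16_memory_gb(dimension: int, m: int, num_vectors: int,
                        overhead: float = 1.1) -> float:
    """Estimate native memory (in GB) following the formula in the hunk:
    overhead * (2 * dimension + 8 * m) * num_vectors bytes.
    """
    # 2 bytes per FP16 dimension, plus 8 bytes per graph link slot (assumed reading of 8 * 16)
    bytes_per_vector = 2 * dimension + 8 * m
    total_bytes = overhead * bytes_per_vector * num_vectors
    # Convert bytes to GB, assuming 1 GB = 1024^3 bytes
    return total_bytes / (1024 ** 3)


# Values taken from the expression in the blog draft
estimate = hnsw_fp16_memory_gb(dimension=256, m=16, num_vectors=1_000_000)
print(f"{estimate:.3f} GB")  # prints "0.656 GB"
```

Under these assumptions, the result agrees with the `~= 0.656 GB` figure in the draft.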
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
ff2ce70 to 1df6805
Looks good. There are a few small cleanups to be finished.
Signed-off-by: Naveen Tatikonda <[email protected]>
2a311a9 to 9077afd
Thanks @smacrakis @natebower. Can you please do a final review so that we can ship it?
@naveentatikonda - I will have updated meta for you later today. It will need to be implemented before it can push live tomorrow.
date: 2024-06-19 00:00:00 -0700
categories:
- technical-posts
meta_keywords: FP16 quantization, OpenSearch k-NN plugin, memory optimization, cost-effectiveness, performance, search latency, indexing throughput
@naveentatikonda - Please update your meta with the following:
meta_keywords: faiss scalar quantization, OpenSearch k-NN plugin, FP16 scalar quantization, vector embeddings
meta_description: Learn how FP16 scalar quantization in OpenSearch helps you generate vector embeddings while reducing memory requirements and minimizing quality loss at a lower cost.
@pajuric Regarding the meta description: we are not generating vector embeddings with this feature.
@naveentatikonda - I'm fine with your meta changes, but the phrase I pulled was from your blog. You may want to reevaluate that line. If you are good, I will push this to publish.
@pajuric I fixed it. We are good to go; please publish it. Thanks!
Signed-off-by: Naveen Tatikonda <[email protected]>
If all of my comments have been addressed, then you should be good to go 😄.
Sounds good. Thanks @natebower, they are all addressed.
Signed-off-by: Naveen Tatikonda <[email protected]>
5562fa7 to ce24fbb
@nateynateynate @krisfreedain - New blog ready to publish today.
Looks good locally.
Description
Blog for SQFP16 Quantization with k-NN Plugin
Issues Resolved
#2950
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.