Add blog for sqfp16 #2971

naveentatikonda · 2024-06-17T20:09:28Z

Description

Blog for SQFP16 Quantization with k-NN Plugin

Issues Resolved

#2950

Check List

Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the BSD-3-Clause License.

Signed-off-by: Naveen Tatikonda <[email protected]>

Signed-off-by: Fanit Kolchina <[email protected]>

kolchfa-aws · 2024-06-25T21:58:14Z

@natebower The blog is ready for your review. Thank you!

_posts/2024-06-19-optimizing-opensearch-with-fp16-quantization.md

Signed-off-by: Naveen Tatikonda <[email protected]>

natebower

@kolchfa-aws @naveentatikonda Please see my comments and changes and let me know if you have any questions. Thanks!

@pajuric This will be ready to publish once my comments/changes have been addressed.


natebower · 2024-06-26T10:54:27Z

_posts/2024-06-19-optimizing-opensearch-with-fp16-quantization.md

+
+`1.1 * (2 * 256 + 8 * 16) * 1,000,000 ~= 0.656 GB`
+
+For more information about memory estimation for scalar quantization with IVF, refer to [this documentation](https://opensearch.org/docs/latest/search-plugins/knn/knn-vector-quantization/#memory-estimation-1).


We should probably define IVF here, given that it has a much more widely used definition.

I believe HNSW will be more widely used than IVF because of the drop in recall with IVF, so we intentionally added estimates for HNSW here.
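
For reference, the arithmetic in the quoted estimate can be checked with a short sketch. It assumes the terms are read as the usual HNSW estimate for FP16 vectors: 2 bytes per dimension, 8 bytes per graph link with `M = 16`, a 1.1 overhead factor, 256 dimensions, 1 million vectors, and 1 GB taken as 1024^3 bytes. The function and parameter names are hypothetical and are not part of the k-NN plugin.

```python
def hnsw_fp16_memory_gb(dimension, m, num_vectors, overhead=1.1):
    """Rough native-memory estimate for an HNSW index storing FP16 vectors.

    Assumed reading of the quoted formula:
    overhead * (2 * dimension + 8 * m) bytes per vector.
    """
    bytes_per_vector = 2 * dimension + 8 * m
    total_bytes = overhead * bytes_per_vector * num_vectors
    # Assumption: GB here means 1024**3 bytes.
    return total_bytes / 1024**3


# Values taken from the quoted line: 256 dimensions, M = 16, 1,000,000 vectors.
estimate = hnsw_fp16_memory_gb(dimension=256, m=16, num_vectors=1_000_000)
print(f"{estimate:.3f} GB")  # prints "0.656 GB"
```

With these inputs the result matches the `~= 0.656 GB` figure in the hunk, which is consistent with the estimate being for HNSW rather than IVF.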


Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>


Signed-off-by: kolchfa-aws <[email protected]>


smacrakis

Looks good. There are a few small cleanups to be finished.

Signed-off-by: Naveen Tatikonda <[email protected]>

naveentatikonda · 2024-06-26T21:00:04Z

Thanks @smacrakis

@natebower Can you please do a final review so that we can ship it?

pajuric · 2024-06-26T23:27:53Z

@naveentatikonda - I will have updated meta for you later today. It will need to be implemented before the post can go live tomorrow.

pajuric · 2024-06-26T23:56:08Z

_posts/2024-06-19-optimizing-opensearch-with-fp16-quantization.md

+date: 2024-06-19 00:00:00 -0700
+categories:
+  - technical-posts
+meta_keywords: FP16 quantization, OpenSearch k-NN plugin, memory optimization, cost-effectiveness, performance, search latency, indexing throughput


@naveentatikonda - Please update your meta with the following:

meta_keywords: faiss scalar quantization, OpenSearch k-NN plugin, FP16 scalar quantization, vector embeddings

meta_description: Learn how FP16 scalar quantization in OpenSearch helps you generate vector embeddings while reducing memory requirements and minimizing quality loss at a lower cost.

@pajuric in meta description, we are not generating vector embeddings with this feature

@naveentatikonda - I'm fine with your meta changes, but the phrase I pulled was from your blog. You may want to reevaluate that line. If you are good, I will push this to publish.

@pajuric I fixed it. We are good to go, please publish it. Thanks!


Signed-off-by: Naveen Tatikonda <[email protected]>

natebower · 2024-06-27T10:55:57Z

> Thanks @smacrakis
>
> @natebower Can you please do a final review so that we can ship it?

If all of my comments have been addressed, then you should be good to go 😄.

naveentatikonda · 2024-06-27T15:46:02Z

> If all of my comments have been addressed, then you should be good to go 😄.

Sounds good. Thanks @natebower, they have all been addressed.

Signed-off-by: Naveen Tatikonda <[email protected]>

pajuric · 2024-06-27T19:22:32Z

@nateynateynate @krisfreedain - New blog ready to publish today.

nateynateynate

looks good locally

Add blog for sqfp16

01a673d

Signed-off-by: Naveen Tatikonda <[email protected]>

naveentatikonda requested review from elfisher, AMoo-Miki, nknize, krisfreedain, peterzhuamazon, CEHENKLE, dtaivpp, kolchfa-aws, nateynateynate and natebower as code owners June 17, 2024 20:09

kolchfa-aws self-assigned this Jun 18, 2024

kolchfa-aws added 2 commits June 25, 2024 17:52

Doc review

ce48905

Signed-off-by: Fanit Kolchina <[email protected]>

Formatting

44ba0d4

Signed-off-by: Fanit Kolchina <[email protected]>
