Add documentation for pruning neural sparse vectors #8984

Merged
merged 14 commits into from
Jan 2, 2025

Conversation

zhichao-aws
Member

@zhichao-aws zhichao-aws commented Dec 24, 2024

Description

Add documentation for pruning neural sparse vectors.
Issue: opensearch-project/neural-search#946
PR: opensearch-project/neural-search#988

Issues Resolved

Version

2.19

Frontend features

N/A

Checklist

  • By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developer Certificate of Origin.
    For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: zhichao-aws <[email protected]>

Thank you for submitting your PR. The PR states are In progress (or Draft) -> Tech review -> Doc review -> Editorial review -> Merged.

Before you submit your PR for doc review, make sure the content is technically accurate. If you need help finding a tech reviewer, tag a maintainer.

When you're ready for doc review, tag the assignee of this PR. The doc reviewer may push edits to the PR directly or leave comments and editorial suggestions for you to address (let us know in a comment if you have a preference). The doc reviewer will arrange for an editorial review.

@zhichao-aws
Member Author

The PR is ready for review now

Collaborator

@kolchfa-aws kolchfa-aws left a comment

Thank you, @zhichao-aws! Some suggestions for you.

_ingest-pipelines/processors/sparse-encoding.md (8 outdated, resolved review comments)
zhichao-aws and others added 11 commits December 25, 2024 10:14
Co-authored-by: kolchfa-aws <[email protected]>
Signed-off-by: zhichao-aws <[email protected]>
Co-authored-by: kolchfa-aws <[email protected]>
Signed-off-by: zhichao-aws <[email protected]>
Co-authored-by: kolchfa-aws <[email protected]>
Signed-off-by: zhichao-aws <[email protected]>
Co-authored-by: kolchfa-aws <[email protected]>
Signed-off-by: zhichao-aws <[email protected]>
Co-authored-by: kolchfa-aws <[email protected]>
Signed-off-by: zhichao-aws <[email protected]>
Co-authored-by: kolchfa-aws <[email protected]>
Signed-off-by: zhichao-aws <[email protected]>
Co-authored-by: kolchfa-aws <[email protected]>
Signed-off-by: zhichao-aws <[email protected]>
Co-authored-by: kolchfa-aws <[email protected]>
Signed-off-by: zhichao-aws <[email protected]>
Co-authored-by: kolchfa-aws <[email protected]>
Signed-off-by: zhichao-aws <[email protected]>
…-processor.md

Co-authored-by: kolchfa-aws <[email protected]>
Signed-off-by: zhichao-aws <[email protected]>
…-processor.md

Co-authored-by: kolchfa-aws <[email protected]>
Signed-off-by: zhichao-aws <[email protected]>
@zhichao-aws
Member Author

Thanks, @kolchfa-aws! All suggested changes have been applied :)

@kolchfa-aws kolchfa-aws added the labels 5 - Editorial review (PR: Editorial review in progress), release-notes (PR: Include this PR in the automated release notes), and v2.19.0 on Dec 27, 2024
Collaborator

@natebower natebower left a comment

@zhichao-aws @kolchfa-aws Please see my comments and changes and let me know if you have any questions. Thanks!

_ingest-pipelines/processors/sparse-encoding.md (5 outdated, resolved review comments)
| Prune type | Valid prune ratio | Description |
|:---|:---|:---|
| `max_ratio` | Float [0, 1) | Prunes a sparse vector by keeping only elements whose values are within the `prune_ratio` of the largest value in the vector. |
| `abs_value` | Float in (0, +∞) | Prunes a sparse vector by removing elements with values below the `prune_ratio`. |
Collaborator

"below" => "lower than"?

| `max_ratio` | Float [0, 1) | Prunes a sparse vector by keeping only elements whose values are within the `prune_ratio` of the largest value in the vector. |
| `abs_value` | Float in (0, +∞) | Prunes a sparse vector by removing elements with values below the `prune_ratio`. |
| `alpha_mass` | Float in [0, 1) | Prunes a sparse vector by keeping only elements whose cumulative sum of values is within the `prune_ratio` of the total sum. |
| `top_k` | Integer (0, +∞) | Prunes a sparse vector by keeping only the top `prune_ratio` elements with the highest values. |
Collaborator

Can "top" be safely removed here (are "top" and "highest values" redundant)?

Collaborator

It's better to remove "highest values".
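
For readers following this thread, the four prune types in the table rows under review can be illustrated with a minimal Python sketch. This is not the neural-search plugin's implementation. The function names are hypothetical, and the boundary handling (an element exactly equal to the threshold is kept, and `alpha_mass` keeps the element that crosses the threshold) is an assumption, since the table wording does not specify it.

```python
from typing import Dict

SparseVector = Dict[str, float]  # token -> weight


def prune_max_ratio(vec: SparseVector, ratio: float) -> SparseVector:
    """Keep elements whose value is at least ratio * (largest value)."""
    if not vec:
        return {}
    threshold = ratio * max(vec.values())
    # Assumption: values exactly equal to the threshold are kept.
    return {t: v for t, v in vec.items() if v >= threshold}


def prune_abs_value(vec: SparseVector, ratio: float) -> SparseVector:
    """Remove elements whose value is below the given absolute threshold."""
    return {t: v for t, v in vec.items() if v >= ratio}


def prune_alpha_mass(vec: SparseVector, ratio: float) -> SparseVector:
    """Keep the largest elements until their cumulative sum reaches
    ratio * (total sum). Assumption: the crossing element is kept."""
    total = sum(vec.values())
    kept: SparseVector = {}
    running = 0.0
    for token, value in sorted(vec.items(), key=lambda kv: kv[1], reverse=True):
        kept[token] = value
        running += value
        if running >= ratio * total:
            break
    return kept


def prune_top_k(vec: SparseVector, k: int) -> SparseVector:
    """Keep the k elements with the largest values."""
    ranked = sorted(vec.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:k])


# Illustrative vector; the tokens and weights are made up.
example = {"hello": 4.0, "world": 3.0, "foo": 2.0, "bar": 1.0}

print(prune_max_ratio(example, 0.5))    # threshold 2.0
print(prune_abs_value(example, 2.5))    # threshold 2.5
print(prune_alpha_mass(example, 0.75))  # target mass 7.5 of 10.0
print(prune_top_k(example, 1))
```

With these assumptions, `max_ratio` and `alpha_mass` scale with the vector's own values, while `abs_value` and `top_k` apply a fixed cutoff regardless of the vector's magnitude.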

_ingest-pipelines/processors/sparse-encoding.md (outdated, resolved review comment)
Co-authored-by: Nathan Bower <[email protected]>
Signed-off-by: kolchfa-aws <[email protected]>
Collaborator

@kolchfa-aws kolchfa-aws left a comment

LGTM

@kolchfa-aws kolchfa-aws merged commit 0848ba6 into opensearch-project:main Jan 2, 2025
5 checks passed
Labels
5 - Editorial review (PR: Editorial review in progress), release-notes (PR: Include this PR in the automated release notes), v2.19.0
3 participants