-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Regression] Request decompression was broken in 2.11 #10802
Comments
Thanks for filing @kinseii - can you see if there are warnings or errors in the log at the time that occurs and post them here? Can you provide more information about the how to reproduce the issue (request + response with full headers)? With the change that @reta mentioned OpenSearch will skip a decompressing requests that are unauthenticated to preserve system resources. You can DM me @peternied on our slack instance if you'd prefer not to share the request data on this public issue. |
I think this would be the one suspect as well #10261, thanks @peternied |
I want to clarify that the problem occurs when
|
For some reason, the body of the log field contains binary information (most likely uncompressed). |
Here is an example of a normal recording, compression on the fluent-bit side is enabled. The
|
And here's an example of a record with binary data. Compression on fluent-bit is enabled:
|
log
field
I am willing to provide any information you need, please specify what you will need, as I myself have no idea how compression works in OpenSearch. |
Thank you for providing the configuration @kinseii. I was able to reproduce the issue and found the likely culprit. In 2.11 a change was made to keep unauthenticated request bodies compressed to avoid a tax of decompressing the request bodies. In opensearch-project/security#3418, the decompressor is replaced with a subclass of the Netty Thank you for reporting this issue. Link to PR to address the issue: opensearch-project/security#3583 |
Linking #10747 which looks like the same problem. |
@cwperks Is there a workaround for users or do they have to wait for 2.12? |
@dblock The workaround in 2.11.0 for the python client is to set |
log
field### Description Resolves an issue with decompression that can lead to concurrent gzipped requests failing. This removes the `@Sharable` annotation from the `Netty4ConditionalDecompressor` and creates a new instance of the decompressor on channel initialization. `Netty4ConditionalDecompressor` is an `HttpContentDecompressor` which is a subclass of `HttpContentDecoder` - a stateful handler. Netty docs on `@Sharable` annotation: https://netty.io/4.0/api/io/netty/channel/ChannelHandler.Sharable.html * Category (Enhancement, New feature, Bug fix, Test fix, Refactoring, Maintenance, Documentation) Bug fix ### Issues Resolved - opensearch-project/OpenSearch#10802 ### Testing Tested by running OpenSearch w fluentbit and Merge_Log on. See files below which can reproduce the issue from the linked error. I opened this PR as draft pending an integration test to validate the behavior. `docker-compose.yml` ``` version: '3' services: opensearch: # This is also the hostname of the container within the Docker network (i.e. https://opensearch-node1/) image: opensearchproject/opensearch:latest # Specifying the latest available image - modify if you want a specific version container_name: opensearch environment: - cluster.name=opensearch-cluster # Name the cluster - node.name=opensearch # Name the node that will run in this container - discovery.type=single-node - bootstrap.memory_lock=true # Disable JVM heap memory swapping - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # Set min and max JVM heap sizes to at least 50% of system RAM ulimits: memlock: soft: -1 # Set memlock to unlimited (no soft or hard limit) hard: -1 nofile: soft: 65536 # Maximum number of open files for the opensearch user - set to at least 65536 hard: 65536 volumes: - opensearch-data1:/usr/share/opensearch/data # Creates volume called opensearch-data1 and mounts it to the container # - /Users/craigperkins/Projects/OpenSearch/security/build/distributions/opensearch-security-2.11.0.0-SNAPSHOT.jar:/usr/share/opensearch/plugins/opensearch-security/opensearch-security-2.11.0.0.jar ports: - 9200:9200 # REST API - 9600:9600 # Performance Analyzer networks: - opensearch-net # All of the containers will join the same Docker bridge network fluent-bit: image: fluent/fluent-bit volumes: - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf depends_on: - opensearch networks: - opensearch-net volumes: opensearch-data1: opensearch-data2: networks: opensearch-net: ``` `fluent-bit.conf` ``` [INPUT] Name dummy Dummy {"top": {".dotted": "value"}} [OUTPUT] Name es Host opensearch Port 9200 HTTP_User admin HTTP_Passwd admin Replace_Dots On Suppress_Type_Name On Compress gzip tls On tls.verify Off net.keepalive Off [FILTER] Name kubernetes Match kube.* Buffer_Size 256KB Merge_Log On Keep_Log On ``` ### Check List - [ ] New functionality includes testing - [ ] New functionality has been documented - [x] Commits are signed per the DCO using --signoff By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check [here](https://github.com/opensearch-project/OpenSearch/blob/main/CONTRIBUTING.md#developer-certificate-of-origin). --------- Signed-off-by: Craig Perkins <[email protected]> Signed-off-by: Craig Perkins <[email protected]> Signed-off-by: Peter Nied <[email protected]> Co-authored-by: Peter Nied <[email protected]>
### Description Resolves an issue with decompression that can lead to concurrent gzipped requests failing. This removes the `@Sharable` annotation from the `Netty4ConditionalDecompressor` and creates a new instance of the decompressor on channel initialization. `Netty4ConditionalDecompressor` is an `HttpContentDecompressor` which is a subclass of `HttpContentDecoder` - a stateful handler. Netty docs on `@Sharable` annotation: https://netty.io/4.0/api/io/netty/channel/ChannelHandler.Sharable.html * Category (Enhancement, New feature, Bug fix, Test fix, Refactoring, Maintenance, Documentation) Bug fix ### Issues Resolved - opensearch-project/OpenSearch#10802 ### Testing Tested by running OpenSearch w fluentbit and Merge_Log on. See files below which can reproduce the issue from the linked error. I opened this PR as draft pending an integration test to validate the behavior. `docker-compose.yml` ``` version: '3' services: opensearch: # This is also the hostname of the container within the Docker network (i.e. https://opensearch-node1/) image: opensearchproject/opensearch:latest # Specifying the latest available image - modify if you want a specific version container_name: opensearch environment: - cluster.name=opensearch-cluster # Name the cluster - node.name=opensearch # Name the node that will run in this container - discovery.type=single-node - bootstrap.memory_lock=true # Disable JVM heap memory swapping - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # Set min and max JVM heap sizes to at least 50% of system RAM ulimits: memlock: soft: -1 # Set memlock to unlimited (no soft or hard limit) hard: -1 nofile: soft: 65536 # Maximum number of open files for the opensearch user - set to at least 65536 hard: 65536 volumes: - opensearch-data1:/usr/share/opensearch/data # Creates volume called opensearch-data1 and mounts it to the container # - /Users/craigperkins/Projects/OpenSearch/security/build/distributions/opensearch-security-2.11.0.0-SNAPSHOT.jar:/usr/share/opensearch/plugins/opensearch-security/opensearch-security-2.11.0.0.jar ports: - 9200:9200 # REST API - 9600:9600 # Performance Analyzer networks: - opensearch-net # All of the containers will join the same Docker bridge network fluent-bit: image: fluent/fluent-bit volumes: - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf depends_on: - opensearch networks: - opensearch-net volumes: opensearch-data1: opensearch-data2: networks: opensearch-net: ``` `fluent-bit.conf` ``` [INPUT] Name dummy Dummy {"top": {".dotted": "value"}} [OUTPUT] Name es Host opensearch Port 9200 HTTP_User admin HTTP_Passwd admin Replace_Dots On Suppress_Type_Name On Compress gzip tls On tls.verify Off net.keepalive Off [FILTER] Name kubernetes Match kube.* Buffer_Size 256KB Merge_Log On Keep_Log On ``` ### Check List - [ ] New functionality includes testing - [ ] New functionality has been documented - [x] Commits are signed per the DCO using --signoff By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check [here](https://github.com/opensearch-project/OpenSearch/blob/main/CONTRIBUTING.md#developer-certificate-of-origin). --------- Signed-off-by: Craig Perkins <[email protected]> Signed-off-by: Craig Perkins <[email protected]> Signed-off-by: Peter Nied <[email protected]> Co-authored-by: Peter Nied <[email protected]> (cherry picked from commit 499db78) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
### Description Resolves an issue with decompression that can lead to concurrent gzipped requests failing. This removes the `@Sharable` annotation from the `Netty4ConditionalDecompressor` and creates a new instance of the decompressor on channel initialization. `Netty4ConditionalDecompressor` is an `HttpContentDecompressor` which is a subclass of `HttpContentDecoder` - a stateful handler. Netty docs on `@Sharable` annotation: https://netty.io/4.0/api/io/netty/channel/ChannelHandler.Sharable.html * Category (Enhancement, New feature, Bug fix, Test fix, Refactoring, Maintenance, Documentation) Bug fix ### Issues Resolved - opensearch-project/OpenSearch#10802 ### Testing Tested by running OpenSearch w fluentbit and Merge_Log on. See files below which can reproduce the issue from the linked error. I opened this PR as draft pending an integration test to validate the behavior. `docker-compose.yml` ``` version: '3' services: opensearch: # This is also the hostname of the container within the Docker network (i.e. https://opensearch-node1/) image: opensearchproject/opensearch:latest # Specifying the latest available image - modify if you want a specific version container_name: opensearch environment: - cluster.name=opensearch-cluster # Name the cluster - node.name=opensearch # Name the node that will run in this container - discovery.type=single-node - bootstrap.memory_lock=true # Disable JVM heap memory swapping - "OPENSEARCH_JAVA_OPTS=-Xms512m -Xmx512m" # Set min and max JVM heap sizes to at least 50% of system RAM ulimits: memlock: soft: -1 # Set memlock to unlimited (no soft or hard limit) hard: -1 nofile: soft: 65536 # Maximum number of open files for the opensearch user - set to at least 65536 hard: 65536 volumes: - opensearch-data1:/usr/share/opensearch/data # Creates volume called opensearch-data1 and mounts it to the container # - /Users/craigperkins/Projects/OpenSearch/security/build/distributions/opensearch-security-2.11.0.0-SNAPSHOT.jar:/usr/share/opensearch/plugins/opensearch-security/opensearch-security-2.11.0.0.jar ports: - 9200:9200 # REST API - 9600:9600 # Performance Analyzer networks: - opensearch-net # All of the containers will join the same Docker bridge network fluent-bit: image: fluent/fluent-bit volumes: - ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf depends_on: - opensearch networks: - opensearch-net volumes: opensearch-data1: opensearch-data2: networks: opensearch-net: ``` `fluent-bit.conf` ``` [INPUT] Name dummy Dummy {"top": {".dotted": "value"}} [OUTPUT] Name es Host opensearch Port 9200 HTTP_User admin HTTP_Passwd admin Replace_Dots On Suppress_Type_Name On Compress gzip tls On tls.verify Off net.keepalive Off [FILTER] Name kubernetes Match kube.* Buffer_Size 256KB Merge_Log On Keep_Log On ``` ### Check List - [ ] New functionality includes testing - [ ] New functionality has been documented - [x] Commits are signed per the DCO using --signoff By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license. For more information on following Developer Certificate of Origin and signing off your commits, please check [here](https://github.com/opensearch-project/OpenSearch/blob/main/CONTRIBUTING.md#developer-certificate-of-origin). --------- Signed-off-by: Craig Perkins <[email protected]> Signed-off-by: Craig Perkins <[email protected]> Signed-off-by: Peter Nied <[email protected]> Co-authored-by: Peter Nied <[email protected]> (cherry picked from commit 499db78) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
This issue has been fixed and is waiting on the next release to be available. |
I'll let @bbarani comment on timeline feasibility. Quick clarifying question for me: is compression on by default? |
@peternied @cwperks Can you please confirm if |
The compression setting is from the external http clients - I don't know if our OpenSearch clients have this enabled or not, but OpenSearch is called by the wealth of http clients out there with different default configurations and settings - I would recommend everyone enable compression on their clients by default (sans this issue).
All clients are impacted which includes 3rd party clients. The way to enable/disable compression will vary, but the basic change will be required on all clients that call the OpenSearch cluster. This general approach is the only way to work around the problem.
Yes, disabling compression is a work around but this creates many classes of problems;
|
I have to admit that I am amazed at how complex, confusing and most importantly time consuming the workflow is and this coupled with a critical issue. I'm very grateful to the team for the product, but I would like to see critical issues resolved a little faster in the future) |
Describe the bug
I updated OS from 2.9.0 to 2.11.0 and binary information appeared in the indexes:
We use fluent-bit and it has a compression option:
Compress gzip
. Turning it off solved the problem. However, we can't permanently disable it because we have a lot of traffic and need to reduce its cost.Expected behavior
Compression should work on version 2.11.0.
Plugins
Host/Environment (please complete the following information):
The text was updated successfully, but these errors were encountered: