Avoid unnecessary decompression from buffer #72

Open
khuongduybui opened this issue Nov 28, 2023 · 0 comments
Problem

I use Fluentd's built-in buffer compression (compress gzip, no extra plugin) together with compress_request true in this HTTP output plugin.
Fluentd gunzips each buffer chunk when reading it from disk, and this plugin then compresses the same data again before sending it.

Steps to replicate

# Upload configuration for Syslog events
<match syslog.events>
  @type http
  endpoint_url "https://<redacted>"
  http_method post

  # send compressed events
  compress_request true
  serializer json
  buffered true
  bulk_request true
  # specify recoverable/repeatable status codes
  recoverable_status_codes 404, 500, 502, 503, 504

  # every 5 minutes or every 10 MBs
  <buffer tag,time>
    @type file
    path /shared/logs_5/buffer/syslog
    timekey 5m
    timekey_wait 0m
    timekey_use_utc true
    chunk_limit_size 10MB
    compress gzip
    total_limit_size 50GB
    overflow_action drop_oldest_chunk
    retry_timeout 7d
    retry_max_interval 3600
  </buffer>
  <format>
    @type json
    add_newline true
  </format>
</match>

Expected Behavior or What you need to ask

According to the Fluentd documentation (https://docs.fluentd.org/configuration/buffer-section#:~:text=Fluentd%20will%20decompress,plugin%20as%20is):

Fluentd will decompress these compressed chunks automatically before passing them to the output plugin (The exceptional case is when the output plugin can transfer data in compressed form. In this case, the data will be passed to the plugin as is).

Can we somehow let Fluentd know that this output plugin can transfer data in compressed form, so that it skips the decompression and recompression?
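
To make the request concrete, here is a minimal Ruby sketch of what a pass-through could look like. It assumes, as I understand Fluentd's buffer chunk API, that chunk.read(compressed: :gzip) returns the stored gzip bytes without decompressing them. StubChunk and build_body are hypothetical names written for this example; they are not code from fluent-plugin-out-http, and the stub holds a single gzip member for simplicity.

```ruby
require "zlib"

# Hypothetical stand-in for a Fluentd buffer chunk stored with `compress gzip`.
class StubChunk
  def initialize(payload)
    @stored = Zlib.gzip(payload) # bytes as they would sit in the buffer file
  end

  # Mirrors the shape of chunk.read: raw gzip bytes when compressed: :gzip
  # is requested, decompressed text otherwise (assumption, see lead-in).
  def read(compressed: nil)
    compressed == :gzip ? @stored : Zlib.gunzip(@stored)
  end
end

# Hypothetical helper: build the HTTP body and headers for one chunk.
def build_body(chunk, passthrough:)
  body =
    if passthrough
      chunk.read(compressed: :gzip) # no gunzip, no second gzip
    else
      Zlib.gzip(chunk.read)         # behaviour described in this issue
    end
  [body, { "Content-Encoding" => "gzip" }]
end

chunk = StubChunk.new(%({"message":"hello"}\n))
body, headers = build_body(chunk, passthrough: true)
puts headers["Content-Encoding"]
puts Zlib.gunzip(body)
```

In the pass-through branch the plugin never calls gunzip, so a chunk that fails to decompress locally would still be handed to the HTTP request unchanged.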

We noticed this because Fluentd sometimes fails to decompress the gzipped buffer chunks. When that happens, it chokes on the bad chunk and retries it under the same retry logic (up to one week) that we put in place for cases like network loss. We would rather have Fluentd pass the bad chunks to this plugin, which would send them as-is to our endpoint in the cloud, where we have the processing power to attempt to recover them or discard them without clogging the pipeline.
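
As an illustration of the receiving side, the following Ruby sketch shows how an endpoint could attempt recovery and report a failure instead of blocking. recover_chunk is a hypothetical helper invented for this example, and the "bad" input is simply a string that is not valid gzip; it does not reproduce the actual corruption we see.

```ruby
require "zlib"

# Hypothetical receiver-side handler: try to gunzip an uploaded body and
# return a failure result instead of raising, so one bad chunk does not
# hold up the chunks behind it.
def recover_chunk(body)
  { ok: true, records: Zlib.gunzip(body).lines(chomp: true) }
rescue Zlib::Error => e
  # Zlib::Error is the base class of the errors raised by Ruby's zlib.
  { ok: false, error: e.class.name, raw_bytes: body.bytesize }
end

good = Zlib.gzip(%({"message":"hello"}\n))
bad  = "definitely not a gzip stream"

p recover_chunk(good)
p recover_chunk(bad)
```

The failure branch keeps the byte count so the endpoint can decide whether to store the raw chunk for later inspection or discard it.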

Using Fluentd and out_http plugin versions

  • OS version: Debian 11
  • Bare Metal or Within Docker or Kubernetes or other: official Docker image
  • Fluentd version: 1.16.1
  • out_http plugin 1.3.4
abbrev (default: 0.1.0)
async (1.31.0)
async-http (0.60.1)
async-io (1.34.3)
async-pool (0.4.0)
base64 (default: 0.1.1)
benchmark (default: 0.2.0)
bigdecimal (default: 3.1.1)
bson (4.15.0)
bundler (default: 2.3.26)
cgi (default: 0.3.6)
concurrent-ruby (1.2.2)
console (1.16.2)
cool.io (1.7.1)
csv (default: 3.2.5)
date (default: 3.2.2)
debug (1.6.3)
delegate (default: 0.2.0)
did_you_mean (default: 1.6.1)
digest (default: 3.1.0)
drb (default: 2.1.0)
english (default: 0.7.1)
erb (default: 2.2.3)
error_highlight (default: 0.3.0)
etc (default: 1.3.0)
fcntl (default: 1.0.1)
fiber-local (1.0.0)
fiddle (default: 1.1.0)
fileutils (default: 1.6.0)
find (default: 0.1.1)
fluent-config-regexp-type (1.0.0)
fluent-plugin-mongo (1.6.0)
fluent-plugin-multi-format-parser (1.0.0)
fluent-plugin-out-http (1.3.4)
fluent-plugin-prometheus (2.1.0)
fluent-plugin-rewrite-tag-filter (2.4.0)
fluentd (1.16.1)
forwardable (default: 1.3.2)
getoptlong (default: 0.1.1)
http_parser.rb (0.8.0)
io-console (default: 0.5.11)
io-nonblock (default: 0.1.0)
io-wait (default: 0.2.1)
ipaddr (default: 1.2.4)
irb (default: 1.4.1)
json (2.6.3, default: 2.6.1)
logger (default: 1.5.0)
matrix (0.4.2)
minitest (5.15.0)
mongo (2.18.3)
msgpack (1.7.0)
mutex_m (default: 0.1.1)
net-ftp (0.1.3)
net-http (default: 0.3.0)
net-imap (0.2.3)
net-pop (0.1.1)
net-protocol (default: 0.1.2)
net-smtp (0.3.1)
nio4r (2.5.9)
nkf (default: 0.1.1)
observer (default: 0.1.1)
oj (3.14.3)
open-uri (default: 0.2.0)
open3 (default: 0.1.1)
openssl (default: 3.0.1)
optparse (default: 0.2.0)
ostruct (default: 0.5.2)
pathname (default: 0.2.0)
power_assert (2.0.1)
pp (default: 0.3.0)
prettyprint (default: 0.1.1)
prime (0.1.2)
prometheus-client (4.2.2)
protocol-hpack (1.4.2)
protocol-http (0.24.1)
protocol-http1 (0.15.0)
protocol-http2 (0.15.1)
pstore (default: 0.1.1)
psych (default: 4.0.4)
racc (default: 1.6.0)
rake (13.0.6)
rbs (2.7.0)
rdoc (default: 6.4.0)
readline (default: 0.0.3)
readline-ext (default: 0.1.4)
reline (default: 0.3.1)
resolv (default: 0.2.1)
resolv-replace (default: 0.1.0)
rexml (3.2.5)
rinda (default: 0.1.1)
rss (0.2.9)
ruby2_keywords (default: 0.0.5)
securerandom (default: 0.2.0)
serverengine (2.3.2)
set (default: 1.0.2)
shellwords (default: 0.1.0)
sigdump (0.2.4)
singleton (default: 0.1.1)
stringio (default: 3.0.1)
strptime (0.2.5)
strscan (default: 3.0.1)
syslog (default: 0.1.0)
tempfile (default: 0.1.2)
test-unit (3.5.3)
time (default: 0.2.2)
timeout (default: 0.2.0)
timers (4.3.5)
tmpdir (default: 0.1.2)
traces (0.9.1)
tsort (default: 0.1.0)
typeprof (0.21.3)
tzinfo (2.0.6)
tzinfo-data (1.2023.3)
un (default: 0.2.0)
uri (default: 0.12.1)
weakref (default: 0.1.1)
webrick (1.8.1)
yajl-ruby (1.4.3)
yaml (default: 0.2.0)
zlib (default: 2.1.1)