
Performance overhead of encoding/redacting #144

Open
PragTob opened this issue Nov 21, 2024 · 7 comments

@PragTob (Contributor) commented Nov 21, 2024

👋

Not sure if this is a known issue, but the performance overhead of the code in RedactorEncoder is quite significant, even without any redactors, as it traverses the entire data structure.

That is, I'd have expected the JSON encoding to be the most expensive operation in the formatter, but in 2 of 3 scenarios the encoding & redacting (without any redactors) is roughly twice as slow as the Jason encoding:

tobi@qiqi:~/github/elixir_playground(main)$ mix run bench/logger_json_overhead.exs 
Operating System: Linux
CPU Information: AMD Ryzen 9 5900X 12-Core Processor
Number of Available Cores: 24
Available memory: 31.26 GB
Elixir 1.17.3
Erlang 27.1
JIT enabled: true

Benchmark suite executing with the following configuration:
warmup: 100 ms
time: 1 s
memory time: 0 ns
reduction time: 0 ns
parallel: 1
inputs: just a msg, some map, bigger_map
Estimated total run time: 9 s 900 ms

# ...

##### With input just a msg #####
Name                          ips        average  deviation         median         99th %
just Jason                 2.74 M      365.26 ns ±10272.07%         260 ns         590 ns
logger_json encode         1.36 M      737.97 ns  ±3307.02%         630 ns        1080 ns
whole logger format        0.48 M     2082.71 ns   ±297.05%        1980 ns        3510 ns

Comparison: 
just Jason                 2.74 M
logger_json encode         1.36 M - 2.02x slower +372.71 ns
whole logger format        0.48 M - 5.70x slower +1717.45 ns

##### With input some map #####
Name                          ips        average  deviation         median         99th %
logger_json encode       766.24 K        1.31 μs   ±527.69%        1.21 μs        1.84 μs
just Jason               407.51 K        2.45 μs   ±483.13%        1.74 μs        5.87 μs
whole logger format      196.30 K        5.09 μs   ±162.16%        4.40 μs       14.59 μs

Comparison: 
logger_json encode       766.24 K
just Jason               407.51 K - 1.88x slower +1.15 μs
whole logger format      196.30 K - 3.90x slower +3.79 μs

##### With input bigger_map #####
Name                          ips        average  deviation         median         99th %
just Jason               271.29 K        3.69 μs   ±142.04%        3.45 μs        7.59 μs
logger_json encode        99.97 K       10.00 μs    ±13.77%        9.80 μs       12.89 μs
whole logger format       64.56 K       15.49 μs    ±32.81%       14.57 μs       34.33 μs

Comparison: 
just Jason               271.29 K
logger_json encode        99.97 K - 2.71x slower +6.32 μs
whole logger format       64.56 K - 4.20x slower +11.80 μs

Full benchmark (with some more logger stuff) here: https://github.com/PragTob/elixir_playground/blob/main/bench/logger_json_overhead.exs


It's still fast enough, I think; it just feels odd.

I haven't profiled it to see what actually eats the most time (my suspicion is that the String.valid? and String.printable? checks are expensive).

A solution might be a more lightweight, "use at your own risk" formatter that expects you to do the work beforehand if necessary (redact things yourself, don't convert tuples, skip the string-validity checks), but I suppose that might almost be its own project. I also don't know how often these situations occur :)
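For what it's worth, the redundancy suspicion is easy to check in isolation: printable? implies valid?, so any traversal that calls both is doing duplicate work on the happy path. A minimal sketch (not from the library, just illustrating how the two checks relate):

```elixir
# Illustrative only: how String.valid?/1 and String.printable?/1 relate.
# printable? implies valid?, so checking both on the same binary is redundant.
valid_printable = "hello ✨"
IO.inspect(String.valid?(valid_printable))      # true
IO.inspect(String.printable?(valid_printable))  # true

# An invalid UTF-8 byte fails both checks:
invalid = <<104, 101, 255>>
IO.inspect(String.valid?(invalid))      # false
IO.inspect(String.printable?(invalid))  # false

# A valid but non-printable binary (embedded NUL) passes valid? only:
non_printable = <<104, 0>>
IO.inspect(String.valid?(non_printable))      # true
IO.inspect(String.printable?(non_printable))  # false
```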

@AndrewDryga (Member) commented Nov 21, 2024

Hello @PragTob,

Indeed, this is a hard tradeoff. I think we could drop the String checks, but we would still need to iterate over the structure, because our goal is essentially to guarantee that we never crash (and even now occasional crashes are possible). This was a major pain in older versions of the library, because crash reports are then not handled by log aggregators and the issues are not visible to developers.

We have had pretty large projects using LoggerJSON without an issue (DB and IO are still the bottleneck in 99% of cases). I will take a look into it anyway.

BTW can you please share bench/logger_json_overhead.exs? :)

@PragTob (Contributor, Author) commented Nov 22, 2024

@AndrewDryga the benchmark is linked in the issue, right beneath the benchmark results. How would I ever post a benchmark without giving people the possibility to rerun it? 😁

And yeah, that's what I expected. To be fair, that benchmark shows it's still faster than sending the data through the default formatter (which surprised me... still wondering if I'm missing a config option there).

So yeah, holistically I don't think it matters for applications; we're talking about microseconds here. These things just interest me, and I look for ways to make them faster :)

@ruslandoga commented Nov 26, 2024

👋

One small but easy "win" might be to use Jason.Fragment to avoid re-visiting branches that have already been safety-checked and encoded.

@AndrewDryga (Member) commented

@ruslandoga Yeah, the older package version did this, but the performance benefit was negligible and, as I remember, you can't use it with anything that contains an unescaped value.

@ruslandoga commented Nov 27, 2024

The branch can be encoded before being wrapped in a fragment.

defmodule LoggerJSON do
  defmodule Log do
    @moduledoc false
    defstruct [:fields]

    defimpl Jason.Encoder do
      def encode(%{fields: fields}, opts) do
        Jason.Encode.keyword(fields, opts)
      end
    end
  end
  
  # ...
  
  # process some nested map
  def process_value(k, v) when is_map(v) do
    fields = process_kv(Map.to_list(v))

    nested =
      %Log{fields: fields}
      |> Jason.encode_to_iodata!()
      |> Jason.Fragment.new() # now Jason would skip this whole branch on "root" encode step

    {k, nested}
  end
  
  # ...
end

I'm exploring this approach in plausible/analytics#4855 and it's showing a 1.5x+ speedup over LoggerJSON. However, as of right now it's not a fair comparison, since I'm not escaping invalid UTF-8 and I'm skipping non-keyword lists, but I think there are still some ideas there that could improve performance if applied to LoggerJSON. I'd be happy to open a few small micro-optimisation PRs if there is interest :)
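For readers unfamiliar with fragments: Jason.Fragment.new/1 wraps already-encoded iodata so that the top-level encode emits it verbatim instead of traversing it again. A minimal sketch (assuming Jason is available):

```elixir
# A branch encoded once and wrapped in Jason.Fragment is not re-traversed
# when the enclosing structure is encoded.
inner = Jason.encode_to_iodata!(%{"status" => 500})
fragment = Jason.Fragment.new(inner)

IO.puts(Jason.encode!(%{"http" => fragment}))
# {"http":{"status":500}}
```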

Benchmark results
Code
# now = DateTime.utc_now()

inputs = [
  {"just a msg", {:string, "This is just some elaborate message"}},
  {"some map",
   {:report,
    %{
      message: "some other weirdo message",
      # time: DateTime.utc_now(),
      http_meta: %{
        status: 500,
        method: "GET"
        #  headers: [["what", "eva"], ["some-more", "stuff"]]
      }
    }}},
  {"bigger_map",
   {:report,
    %{
      "users" => %{
        "user_1" => %{
          "name" => "Alice",
          "age" => 30,
          "preferences" => %{
            "theme" => "dark",
            "language" => "English",
            "notifications" => %{
              "email" => true,
              "sms" => false,
              "push" => true
            }
          }
          #  "tags" => ["developer", "team_lead"]
        },
        "user_2" => %{
          "name" => "Bob",
          "age" => 25,
          "preferences" => %{
            "theme" => "light",
            "language" => "French",
            "notifications" => %{
              "email" => true,
              "sms" => true,
              "push" => false
            }
          }
          #  "tags" => ["designer", "remote"]
        }
      },
      "settings" => %{
        "global" => %{
          "timezone" => "UTC",
          "currency" => :usd,
          "support_contact" => "[email protected]"
        },
        "regional" => %{
          "US" => %{
            "timezone" => "America/New_York",
            "currency" => :usd
          },
          "EU" => %{
            "timezone" => "Europe/Berlin",
            "currency" => "EUR"
          }
        }
      },
      "analytics" => %{
        "page_views" => %{
          "home" => 1200,
          "about" => 450,
          "contact" => 300
        },
        "user_sessions" => %{
          "total" => 2000,
          "active" => 150
        }
      }
    }}}
]

redactors = []
{_, default_formatter_config} = Logger.Formatter.new(colors: [enabled?: false])

Benchee.run(
  %{
    "just Jason" => fn input ->
      Jason.encode_to_iodata!(elem(input, 1))
    end,
    "just :json" => fn input ->
      :json.encode(elem(input, 1))
    end,
    # "logger_json encode" => fn input ->
    #   %{message: LoggerJSON.Formatter.RedactorEncoder.encode(input, redactors)}
    # end,
    "whole LoggerJSON format" => fn input ->
      LoggerJSON.Formatters.Basic.format(%{level: :info, meta: %{}, msg: input}, [])
    end,
    "whole Plausible.Logger.JSONFormatter format" => fn input ->
      Plausible.Logger.JSONFormatter.format(
        %{level: :info, meta: %{}, msg: input},
        default_formatter_config
      )
    end,
    # odd that those 2 end up being the slowest - what additional work are they doing?
    "default formatter with report data (sanity check)" => fn input ->
      Logger.Formatter.format(
        %{level: :info, meta: %{}, msg: input},
        default_formatter_config
      )
    end
  },
  warmup: 0.1,
  time: 1,
  inputs: inputs
  # profile_after: true
)
##### With input just a msg #####
Name                                                        ips        average  deviation         median         99th %
just :json                                            8608.62 K       0.116 μs ±11742.55%      0.0840 μs       0.166 μs
just Jason                                            3493.35 K        0.29 μs  ±3949.86%        0.25 μs        0.46 μs
default formatter with report data (sanity check)      953.86 K        1.05 μs   ±767.04%           1 μs        1.29 μs
whole Plausible.Logger.JSONFormatter format            679.36 K        1.47 μs   ±460.89%        1.38 μs        1.75 μs
whole LoggerJSON format                                371.49 K        2.69 μs   ±133.99%        2.58 μs        3.83 μs

##### With input some map #####
Name                                                        ips        average  deviation         median         99th %
just :json                                            2211.28 K        0.45 μs  ±2681.92%        0.38 μs        0.67 μs
just Jason                                            1347.50 K        0.74 μs  ±1009.78%        0.71 μs        0.96 μs
whole Plausible.Logger.JSONFormatter format            468.56 K        2.13 μs   ±264.98%        2.04 μs        2.96 μs
whole LoggerJSON format                                321.20 K        3.11 μs    ±74.43%        2.96 μs        4.83 μs
default formatter with report data (sanity check)      160.58 K        6.23 μs    ±37.17%           6 μs       10.29 μs

##### With input bigger_map #####
Name                                                        ips        average  deviation         median         99th %
just :json                                             367.76 K        2.72 μs   ±137.50%        2.54 μs        4.71 μs
just Jason                                             224.45 K        4.46 μs   ±171.46%        4.25 μs        7.71 μs
whole Plausible.Logger.JSONFormatter format            113.60 K        8.80 μs    ±23.25%        8.46 μs       12.71 μs
whole LoggerJSON format                                 48.18 K       20.76 μs    ±16.55%       20.54 μs       25.43 μs
default formatter with report data (sanity check)       18.32 K       54.58 μs    ±11.71%       53.71 μs       78.02 μs

@PragTob (Contributor, Author) commented Nov 27, 2024

@ruslandoga I'd add a variant with just the UTF-8 checking removed; depending on the input, I expect that to be the biggest chunk. I haven't profiled it yet, although that would be trivial :)

@josevalim commented
Some notes:

  • You shouldn't need both valid? and printable?: everything that is printable? is also valid?

  • For performance, I would suggest using Erlang's new :json.encode_binary (you can backport it) wrapped in a try/catch. If it fails, you inspect. Then wrap it in a fragment.
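A hedged sketch of the second suggestion (the module and function names here are mine, not from LoggerJSON; this assumes OTP 27+ for the :json module and uses :json.encode/1 with IO.iodata_to_binary/1):

```elixir
defmodule SafeEncode do
  # Try a strict :json encode; if the term is not JSON-encodable
  # (tuples, pids, etc.), fall back to inspect/1 and encode the
  # resulting string instead, so logging never crashes.
  def encode_value(value) do
    try do
      value |> :json.encode() |> IO.iodata_to_binary()
    catch
      _, _ -> value |> inspect() |> :json.encode() |> IO.iodata_to_binary()
    end
  end
end

IO.puts(SafeEncode.encode_value(%{"a" => 1}))     # {"a":1}
IO.puts(SafeEncode.encode_value({:tuple, :oops})) # "{:tuple, :oops}"
```

The result could then be wrapped in Jason.Fragment.new/1, as suggested, so that the outer encode skips the branch entirely.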


4 participants