Improve dependency validator's performance #417

xronos-i-am · 2024-11-07T06:48:22Z

What are you trying to accomplish?

Packwerk::Graph has its own implementation of the topological sorting algorithm. Ruby's standard library provides the TSort module, which does the same thing (Rails uses it too). This PR replaces the custom implementation with TSort.
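For readers unfamiliar with TSort, here is a minimal sketch of how a class can mix it in. The class name DependencyGraph and the sample edges are hypothetical and are not part of Packwerk; only the TSort calls themselves come from the standard library.

```ruby
require "tsort"

# Hypothetical wrapper used for illustration only (not Packwerk's class).
class DependencyGraph
  include TSort

  # edges: Hash mapping each node to an Array of the nodes it depends on.
  def initialize(edges)
    @edges = edges
  end

  # TSort requires these two hooks: one to enumerate all nodes,
  # one to enumerate the children of a given node.
  def tsort_each_node(&block)
    @edges.each_key(&block)
  end

  def tsort_each_child(node, &block)
    @edges.fetch(node, []).each(&block)
  end
end

acyclic = DependencyGraph.new({ "a" => ["b", "c"], "b" => ["c"], "c" => [] })
order = acyclic.tsort # children come before parents: ["c", "b", "a"]

cyclic = DependencyGraph.new({ "a" => ["b"], "b" => ["a"] })
cycle_detected =
  begin
    cyclic.tsort
    false
  rescue TSort::Cyclic
    # TSort#tsort raises TSort::Cyclic when the graph contains a cycle.
    true
  end
```

The benchmark script in this PR uses strongly_connected_components rather than tsort, which reports the cyclic groups instead of raising on the first one.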

What approach did you choose and why?

TSort is much faster (benchmarks attached) and produces almost the same output. It also removes the need to maintain a custom implementation.

What should reviewers focus on?

The cycles reported in the test "#cycles returns overlapping cycles in a graph" differ from the original results, but the meaning is the same: all cyclic dependencies are still listed. Only the output format has changed, and only in some cases. I think that can be sacrificed in favour of performance.
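To illustrate why overlapping cycles can come out differently, here is a small sketch with a hypothetical graph (not the one from the test suite). TSort reports strongly connected components, so two cycles that share a node end up in a single group instead of being listed separately.

```ruby
require "tsort"

# Hypothetical edges: the cycles 1 -> 2 -> 3 -> 1 and 2 -> 4 -> 2 share node 2.
# Node 5 has no edges and is not part of any cycle.
EDGES = {
  1 => [2],
  2 => [3, 4],
  3 => [1],
  4 => [2],
  5 => [],
}.freeze

each_node = lambda { |&block| EDGES.each_key(&block) }
each_child = lambda { |node, &block| EDGES.fetch(node, []).each(&block) }

# Every node belongs to exactly one component; nodes outside any
# multi-node cycle appear as single-element components.
components = TSort.strongly_connected_components(each_node, each_child)

# Same filter as in the benchmark script: drop single-node components.
cyclic_groups = components.reject { |component| component.size == 1 }

# Both overlapping cycles are merged into one group containing 1, 2, 3 and 4.
merged_nodes = cyclic_groups.first.sort
```

An implementation that enumerates individual cycles would list two entries for this graph, while the component-based result has one entry covering the same nodes.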

Type of Change

Basic functionality is intact, but the dependency validator's output differs slightly.

Checklist

I have updated the documentation accordingly.
I have added tests to cover my changes.
It is safe to rollback this change.

Benchmarks

# frozen_string_literal: true

require "bundler/inline"
require "tsort"

gemfile do
  source "https://rubygems.org"

  gem "benchmark-ips", require: "benchmark/ips"
  gem "packwerk"
end

def arr_to_hsh(arr)
  arr.group_by(&:first).transform_values { |arr_list| arr_list.map(&:last) }
end

module Packwerk
  class GraphWithTsort
    include TSort

    extend T::Sig
    sig do
      params(
        # The edges of the graph, represented as a Hash of Arrays.
        edges: T::Hash[T.any(String, Integer, NilClass), T::Array[T.any(String, Integer, NilClass)]]
      ).void
    end
    def initialize(edges)
      @edges = edges
    end

    def cycles
      @cycles ||= strongly_connected_components.reject { _1.size == 1 }
    end

    def acyclic?
      cycles.empty?
    end

    private def tsort_each_node(&block)
      @edges.each_key(&block)
    end

    EMPTY_ARRAY = [].freeze
    private_constant :EMPTY_ARRAY

    private def tsort_each_child(node, &block)
      (@edges[node] || EMPTY_ARRAY).each(&block)
    end
  end

  private_constant :GraphWithTsort

  arr = [[1, 2], [1, 3], [2, 4], [3, 4]]
  hsh = arr_to_hsh(arr)

  Benchmark.ips do |x|
    x.report("[original] test acyclic graph") do
      Graph.new(arr).acyclic?
    end

    x.report("[tsort] test acyclic graph") do
      GraphWithTsort.new(hsh).acyclic?
    end

    x.compare!
  end

  # Warming up --------------------------------------
  # [original] test acyclic graph     3.167k i/100ms
  #    [tsort] test acyclic graph    21.181k i/100ms
  #
  # Calculating -------------------------------------
  # [original] test acyclic graph    30.041k (± 1.2%) i/s   (33.29 μs/i) -    152.016k in   5.060974s
  #    [tsort] test acyclic graph   197.479k (± 1.3%) i/s    (5.06 μs/i) -    995.507k in   5.041948s
  #
  # Comparison:
  #    [tsort] test acyclic graph:   197479.2 i/s
  # [original] test acyclic graph:    30041.5 i/s - 6.57x  slower

  arr = [[1, 2], [2, 3], [3, 1]]
  hsh = arr_to_hsh(arr)

  Benchmark.ips do |x|
    x.report("[original] test cyclic graph") do
      Graph.new(arr).acyclic?
    end

    x.report("[tsort] test cyclic graph") do
      GraphWithTsort.new(hsh).acyclic?
    end

    x.compare!
  end

  # Warming up --------------------------------------
  # [original] test cyclic graph
  #            3.076k i/100ms
  #    [tsort] test cyclic graph
  #            30.150k i/100ms
  # Calculating -------------------------------------
  # [original] test cyclic graph
  #            29.158k (± 4.3%) i/s   (34.30 μs/i) -    147.648k in   5.074331s
  #    [tsort] test cyclic graph
  #            251.707k (±12.1%) i/s   (3.97 μs/i) -      1.266M in   5.103329s
  #
  # Comparison:
  #    [tsort] test cyclic graph:   251707.4 i/s
  # [original] test cyclic graph:    29157.8 i/s - 8.63x  slower

  arr = [
    [1, 2], [2, 3], [3, 1],
    [4, 5], [4, 6], [5, 7], [6, 7],
    [8, 9], [9, 8], [8, 10], [10, 11], [8, 11],
  ]
  hsh = arr_to_hsh(arr)

  Benchmark.ips do |x|
    x.report("[original] test cycles in a graph with disjoint subgraphs") do
      Graph.new(arr).cycles
    end

    x.report("[tsort] test cycles in a graph with disjoint subgraphs") do
      GraphWithTsort.new(hsh).cycles
    end

    x.compare!
  end

  # Warming up --------------------------------------
  # [original] test cycles in a graph with disjoint subgraphs
  #            756.000 i/100ms
  #    [tsort] test cycles in a graph with disjoint subgraphs
  #            7.294k i/100ms
  # Calculating -------------------------------------
  # [original] test cycles in a graph with disjoint subgraphs
  #            7.031k (±11.0%) i/s  (142.24 μs/i) -   34.776k in   5.010366s
  #    [tsort] test cycles in a graph with disjoint subgraphs
  #            85.352k (±10.6%) i/s  (11.72 μs/i) -  423.052k in   5.016072s
  #
  # Comparison:
  #    [tsort] test cycles in a graph with disjoint subgraphs:    85351.8 i/s
  # [original] test cycles in a graph with disjoint subgraphs:     7030.5 i/s - 12.14x  slower
end

xronos-i-am · 2024-11-07T08:00:50Z

I have signed the CLA!

improve dependency validator's performance

6577a70

xronos-i-am requested a review from a team as a code owner November 7, 2024 06:48

github-actions bot added the cla-needed label Nov 7, 2024

github-actions bot removed the cla-needed label Nov 7, 2024
