optimize the frequency sketch
In an earlier analysis the block-based sketch was significantly faster
than the flat (uniform) one. This was independently confirmed by C# and
Go ports, which also observed a 2x speedup. However, when this benchmark
was recently added to the CI, the block-based sketch showed up as a
regression. Therefore some optimizations that were implicitly left to
the compiler are now explicit, which allows the block-based sketch to
match or exceed the flat sketch's performance.

- We no longer rely on escape analysis to optimize away the
method-scoped arrays (count, index). The compiler was expected to stack
allocate these and break them into their components.
- The arrays were meant to break a loop data dependency, but it is now
faster to keep that dependency. `Math.min` compiles to a single-cycle,
branch-free instruction that the out-of-order pipeline seems to prefer.
- `increment` is manually loop unrolled like the flat version, which
shows a similar speedup.
- Previously, the flat benchmark version implemented the scaffolding
interface directly, was pre-allocated, and the init guard was removed.
This gave it a large advantage as it improved inlining, branch
prediction, etc. The benchmark is now fair.
- On jdk11 the block-based sketch is always faster by at least 10M
ops/s. On jdk23 the speedup only appears as the table size increases,
matching the expected gains from better cache locality. It is marginally
slower at the small table size due to indexing differences.

The differences are very hardware and compiler dependent, with wide
variations across Intel and Arm, Java versions, and JIT compilers
(Graal vs C2). For users the effect will be lost in the noise, since the
sketch was not a performance bottleneck thanks to the cache's overall
design.
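
The array and `Math.min` points above can be illustrated with a minimal sketch. It is not the library's code: the class name `FrequencyMinSketch`, the helper `counterAt`, and passing the table as a parameter are assumptions made for illustration, while the index arithmetic follows the block-based `frequency` method in this diff. Both variants read the same four 4-bit counters; the first stores them in a temporary array, the second keeps a running minimum in a local variable.

```java
import java.util.Random;

// Illustrative only: names and structure here are assumptions, not the library's API.
final class FrequencyMinSketch {
  /** Reads the 4-bit counter at position index (0..15) inside a 64-bit word. */
  static int counterAt(long word, int index) {
    return (int) ((word >>> (index << 2)) & 0xfL);
  }

  /** Earlier style: gather the four counters in a temporary array, then reduce. */
  static int frequencyWithArray(long[] table, int block, int counterHash) {
    int[] count = new int[4];
    for (int i = 0; i < 4; i++) {
      int h = counterHash >>> (i << 3);  // one byte of the hash per counter
      int index = (h >>> 1) & 15;        // which 4-bit counter in the word
      int offset = h & 1;                // which of the two candidate words
      count[i] = counterAt(table[block + offset + (i << 1)], index);
    }
    return Math.min(Math.min(count[0], count[1]), Math.min(count[2], count[3]));
  }

  /** Newer style: carry a running minimum across iterations, no array. */
  static int frequencyWithRunningMin(long[] table, int block, int counterHash) {
    int frequency = Integer.MAX_VALUE;
    for (int i = 0; i < 4; i++) {
      int h = counterHash >>> (i << 3);
      int index = (h >>> 1) & 15;
      int offset = h & 1;
      int count = counterAt(table[block + offset + (i << 1)], index);
      frequency = Math.min(frequency, count);
    }
    return frequency;
  }

  public static void main(String[] args) {
    long[] table = new long[8];  // a single block of eight 64-bit words
    Random random = new Random(42);
    for (int i = 0; i < table.length; i++) {
      table[i] = random.nextLong();
    }
    int counterHash = 0x9E3779B9;
    // Both variants take the minimum of the same four counters.
    System.out.println(frequencyWithArray(table, 0, counterHash)
        == frequencyWithRunningMin(table, 0, counterHash));  // prints true
  }
}
```

Which variant is faster depends on the JVM and hardware, as the message notes; the sketch only shows that the two forms are equivalent in result.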
ben-manes committed Nov 16, 2024
1 parent 323e902 commit 91a36fb
Showing 30 changed files with 114 additions and 53 deletions.
2 changes: 1 addition & 1 deletion .github/actions/run-gradle/action.yml
Expand Up @@ -65,7 +65,7 @@ runs:
echo "JDK_CI=$JAVA_HOME" >> $GITHUB_ENV
echo "JDK_EA=${{ inputs.early-access == inputs.java }}" >> $GITHUB_ENV
- name: Setup Gradle
uses: gradle/actions/setup-gradle@d156388eb19639ec20ade50009f3d199ce1e2808 # v4.1.0
uses: gradle/actions/setup-gradle@473878a77f1b98e2b5ac4af93489d1656a80a5ed # v4.2.0
env:
ORG_GRADLE_PROJECT_org.gradle.java.installations.auto-download: 'false'
with:
2 changes: 1 addition & 1 deletion .github/workflows/build.yml
Expand Up @@ -250,7 +250,7 @@ jobs:
cache-encryption-key: ${{ secrets.GRADLE_ENCRYPTION_KEY }}
continue-on-error: true
- name: Publish to Codecov
uses: codecov/codecov-action@b9fd7d16f6d7d1b5d2bec1a2887e65ceed900238 # v4.6.0
uses: codecov/codecov-action@5c47607acb93fed5485fdbf7232e8a31425f672a # v5.0.2
with:
token: ${{ secrets.CODECOV_TOKEN }}
- name: Publish to Codacy
2 changes: 1 addition & 1 deletion .github/workflows/codacy.yml
Expand Up @@ -47,7 +47,7 @@ jobs:
if: steps.check_files.outputs.files_exists == 'true'
run: jq -c '.runs |= unique_by({tool, invocations, results})' < results.sarif > codacy.sarif
- name: Upload result to GitHub Code Scanning
uses: github/codeql-action/upload-sarif@4f3212b61783c3c68e8309a0f18a699764811cda # v3.27.1
uses: github/codeql-action/upload-sarif@ea9e4e37992a54ee68a9622e985e60c8e8f12d9f # v3.27.4
if: steps.check_files.outputs.files_exists == 'true'
continue-on-error: true
with:
6 changes: 3 additions & 3 deletions .github/workflows/codeql.yml
Expand Up @@ -57,10 +57,10 @@ jobs:
java: ${{ env.JAVA_VERSION }}
cache-encryption-key: ${{ secrets.GRADLE_ENCRYPTION_KEY }}
- name: Initialize CodeQL
uses: github/codeql-action/init@4f3212b61783c3c68e8309a0f18a699764811cda # v3.27.1
uses: github/codeql-action/init@ea9e4e37992a54ee68a9622e985e60c8e8f12d9f # v3.27.4
with:
languages: java
- name: Autobuild
uses: github/codeql-action/autobuild@4f3212b61783c3c68e8309a0f18a699764811cda # v3.27.1
uses: github/codeql-action/autobuild@ea9e4e37992a54ee68a9622e985e60c8e8f12d9f # v3.27.4
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@4f3212b61783c3c68e8309a0f18a699764811cda # v3.27.1
uses: github/codeql-action/analyze@ea9e4e37992a54ee68a9622e985e60c8e8f12d9f # v3.27.4
2 changes: 1 addition & 1 deletion .github/workflows/dependency-check.yml
Expand Up @@ -61,7 +61,7 @@ jobs:
with:
files: build/reports/dependency-check-report.sarif
- name: Upload result to GitHub Code Scanning
uses: github/codeql-action/upload-sarif@4f3212b61783c3c68e8309a0f18a699764811cda # v3.27.1
uses: github/codeql-action/upload-sarif@ea9e4e37992a54ee68a9622e985e60c8e8f12d9f # v3.27.4
if: steps.check_files.outputs.files_exists == 'true'
with:
sarif_file: build/reports/dependency-check-report.sarif
2 changes: 1 addition & 1 deletion .github/workflows/dependency-submission-pr-retreive.yml
Expand Up @@ -35,6 +35,6 @@ jobs:
repo1.maven.org:443
services.gradle.org:443
- name: Retrieve and submit dependency graph
uses: gradle/actions/dependency-submission@d156388eb19639ec20ade50009f3d199ce1e2808 # v4.1.0
uses: gradle/actions/dependency-submission@473878a77f1b98e2b5ac4af93489d1656a80a5ed # v4.2.0
with:
dependency-graph: download-and-submit
2 changes: 1 addition & 1 deletion .github/workflows/dependency-submission-pr-submit.yml
Expand Up @@ -38,7 +38,7 @@ jobs:
java-version: ${{ env.JAVA_VERSION }}
distribution: temurin
- name: Submit Dependency Graph
uses: gradle/actions/dependency-submission@d156388eb19639ec20ade50009f3d199ce1e2808 # v4.1.0
uses: gradle/actions/dependency-submission@473878a77f1b98e2b5ac4af93489d1656a80a5ed # v4.2.0
with:
cache-encryption-key: ${{ secrets.GRADLE_ENCRYPTION_KEY }}
dependency-graph: generate-and-upload
2 changes: 1 addition & 1 deletion .github/workflows/dependency-submission.yml
Expand Up @@ -38,6 +38,6 @@ jobs:
java-version: ${{ env.JAVA_VERSION }}
distribution: temurin
- name: Submit Dependency Graph
uses: gradle/actions/dependency-submission@d156388eb19639ec20ade50009f3d199ce1e2808 # v4.1.0
uses: gradle/actions/dependency-submission@473878a77f1b98e2b5ac4af93489d1656a80a5ed # v4.2.0
with:
cache-encryption-key: ${{ secrets.GRADLE_ENCRYPTION_KEY }}
2 changes: 1 addition & 1 deletion .github/workflows/devskim.yml
Expand Up @@ -31,6 +31,6 @@ jobs:
- name: Run DevSkim scanner
uses: microsoft/DevSkim-Action@914fa647b406c387000300b2f09bb28691be2b6d # v1.0.14
- name: Upload DevSkim scan results to GitHub Security tab
uses: github/codeql-action/upload-sarif@4f3212b61783c3c68e8309a0f18a699764811cda # v3.27.1
uses: github/codeql-action/upload-sarif@ea9e4e37992a54ee68a9622e985e60c8e8f12d9f # v3.27.4
with:
sarif_file: devskim-results.sarif
2 changes: 1 addition & 1 deletion .github/workflows/gradle-wrapper-validation.yml
Expand Up @@ -18,4 +18,4 @@ jobs:
github.com:443
services.gradle.org:443
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
- uses: gradle/actions/wrapper-validation@d156388eb19639ec20ade50009f3d199ce1e2808 # v4.1.0
- uses: gradle/actions/wrapper-validation@473878a77f1b98e2b5ac4af93489d1656a80a5ed # v4.2.0
2 changes: 1 addition & 1 deletion .github/workflows/qodana.yml
Expand Up @@ -70,6 +70,6 @@ jobs:
upload-result: true
github-token: ${{ secrets.GITHUB_TOKEN }}
- name: Upload SARIF file for GitHub Advanced Security Dashboard
uses: github/codeql-action/upload-sarif@4f3212b61783c3c68e8309a0f18a699764811cda # v3.27.1
uses: github/codeql-action/upload-sarif@ea9e4e37992a54ee68a9622e985e60c8e8f12d9f # v3.27.4
with:
sarif_file: ${{ runner.temp }}/qodana/results/qodana.sarif.json
2 changes: 1 addition & 1 deletion .github/workflows/scorecards-analysis.yml
Expand Up @@ -58,6 +58,6 @@ jobs:
path: results.sarif
retention-days: 5
- name: Upload to code-scanning
uses: github/codeql-action/upload-sarif@4f3212b61783c3c68e8309a0f18a699764811cda # v3.27.1
uses: github/codeql-action/upload-sarif@ea9e4e37992a54ee68a9622e985e60c8e8f12d9f # v3.27.4
with:
sarif_file: results.sarif
2 changes: 1 addition & 1 deletion .github/workflows/semgrep.yml
Expand Up @@ -34,7 +34,7 @@ jobs:
if: steps.check_files.outputs.files_exists == 'true'
run: jq -c '.runs[0].tool.driver.rules |= unique_by(.id)' < results.sarif > semgrep.sarif
- name: Upload SARIF file for GitHub Advanced Security Dashboard
uses: github/codeql-action/upload-sarif@4f3212b61783c3c68e8309a0f18a699764811cda # v3.27.1
uses: github/codeql-action/upload-sarif@ea9e4e37992a54ee68a9622e985e60c8e8f12d9f # v3.27.4
if: steps.check_files.outputs.files_exists == 'true'
continue-on-error: true
with:
2 changes: 1 addition & 1 deletion .github/workflows/snyk.yml
Expand Up @@ -42,7 +42,7 @@ jobs:
with:
files: snyk.sarif
- name: Upload result to GitHub Code Scanning
uses: github/codeql-action/upload-sarif@4f3212b61783c3c68e8309a0f18a699764811cda # v3.27.1
uses: github/codeql-action/upload-sarif@ea9e4e37992a54ee68a9622e985e60c8e8f12d9f # v3.27.4
if: steps.check_files.outputs.files_exists == 'true'
with:
sarif_file: snyk.sarif
2 changes: 1 addition & 1 deletion .github/workflows/trivy.yml
Expand Up @@ -36,7 +36,7 @@ jobs:
with:
files: results.sarif
- name: Upload result to GitHub Code Scanning
uses: github/codeql-action/upload-sarif@4f3212b61783c3c68e8309a0f18a699764811cda # v3.27.1
uses: github/codeql-action/upload-sarif@ea9e4e37992a54ee68a9622e985e60c8e8f12d9f # v3.27.4
if: steps.check_files.outputs.files_exists == 'true'
with:
sarif_file: results.sarif
Original file line number Diff line number Diff line change
Expand Up @@ -27,7 +27,19 @@
public enum SketchType {
Flat {
@Override public <E> TinyLfuSketch<E> create(long estimatedSize) {
return new CountMinSketch<>(estimatedSize);
var frequencySketch = new CountMinSketch<E>();
frequencySketch.ensureCapacity(estimatedSize);
return new TinyLfuSketch<>() {
@Override public int frequency(E e) {
return frequencySketch.frequency(e);
}
@Override public void increment(E e) {
frequencySketch.increment(e);
}
@Override public void reset() {
frequencySketch.reset();
}
};
}
},
Block {
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@
*
* @author [email protected] (Ben Manes)
*/
public final class CountMinSketch<E> implements TinyLfuSketch<E> {
public final class CountMinSketch<E> {

/*
* This class maintains a 4-bit CountMinSketch [1] with periodic aging to provide the popularity
Expand Down Expand Up @@ -63,9 +63,27 @@ public final class CountMinSketch<E> implements TinyLfuSketch<E> {
long[] table;
int size;

public CountMinSketch(@NonNegative long maximumSize) {
/**
* Creates a lazily initialized frequency sketch, requiring {@link #ensureCapacity} be called
* when the maximum size of the cache has been determined.
*/
@SuppressWarnings({"NullAway.Init", "PMD.UnnecessaryConstructor"})
public CountMinSketch() {}

/**
* Initializes and increases the capacity of this <tt>FrequencySketch</tt> instance, if necessary,
* to ensure that it can accurately estimate the popularity of elements given the maximum size of
* the cache. This operation forgets all previous counts when resizing.
*
* @param maximumSize the maximum size of the cache
*/
public void ensureCapacity(@NonNegative long maximumSize) {
checkArgument(maximumSize >= 0);
int maximum = (int) Math.min(maximumSize, Integer.MAX_VALUE >>> 1);
if ((table != null) && (table.length >= maximum)) {
return;
}

table = new long[(maximum == 0) ? 1 : IntMath.ceilingPowerOfTwo(maximum)];
tableMask = Math.max(0, table.length - 1);
sampleSize = (maximumSize == 0) ? 10 : (10 * maximum);
Expand All @@ -75,14 +93,25 @@ public CountMinSketch(@NonNegative long maximumSize) {
size = 0;
}

/**
* Returns if the sketch has not yet been initialized, requiring that {@link #ensureCapacity} is
* called before it begins to track frequencies.
*/
public boolean isNotInitialized() {
return (table == null);
}

/**
* Returns the estimated number of occurrences of an element, up to the maximum (15).
*
* @param e the element to count occurrences of
* @return the estimated number of occurrences of the element; possibly zero but never negative
*/
@Override
public @NonNegative int frequency(E e) {
if (isNotInitialized()) {
return 0;
}

int hash = spread(e.hashCode());
int start = (hash & 3) << 2;
int frequency = Integer.MAX_VALUE;
Expand All @@ -101,12 +130,15 @@ public CountMinSketch(@NonNegative long maximumSize) {
*
* @param e the element to add
*/
@Override
public void increment(E e) {
if (isNotInitialized()) {
return;
}

int hash = spread(e.hashCode());
int start = (hash & 3) << 2;

// Loop unrolling improves throughput
// Loop unrolling improves throughput by 5m ops/s
int index0 = indexOf(hash, 0);
int index1 = indexOf(hash, 1);
int index2 = indexOf(hash, 2);
Expand Down Expand Up @@ -140,7 +172,6 @@ boolean incrementAt(int i, int j) {
}

/** Reduces every counter by half of its original value. */
@Override
public void reset() {
int count = 0;
for (int i = 0; i < table.length; i++) {
Expand Down
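
The lazy initialization introduced in this file can be sketched in isolation. This is a simplified illustration rather than the class itself: `LazySketchTable` is a hypothetical name, the sample-size bookkeeping is omitted, and the `ceilingPowerOfTwo` helper is a JDK-only stand-in for Guava's `IntMath.ceilingPowerOfTwo` used in the diff. It shows the table staying `null` until `ensureCapacity` is called, the rounding up to a power of two, and the early return when the existing table is already large enough.

```java
// Illustrative only: a trimmed-down stand-in, not the library's CountMinSketch.
final class LazySketchTable {
  long[] table;   // null until ensureCapacity has been called
  int tableMask;  // table.length - 1, usable as a bit mask because length is a power of two

  /** Smallest power of two that is >= x, for 1 <= x <= 2^30 (assumed helper). */
  static int ceilingPowerOfTwo(int x) {
    return 1 << -Integer.numberOfLeadingZeros(x - 1);
  }

  /** Allocates or grows the table; a no-op when the table is already big enough. */
  void ensureCapacity(long maximumSize) {
    if (maximumSize < 0) {
      throw new IllegalArgumentException("maximumSize must be non-negative");
    }
    int maximum = (int) Math.min(maximumSize, Integer.MAX_VALUE >>> 1);
    if ((table != null) && (table.length >= maximum)) {
      return;
    }
    table = new long[(maximum == 0) ? 1 : ceilingPowerOfTwo(maximum)];
    tableMask = Math.max(0, table.length - 1);
  }

  boolean isNotInitialized() {
    return (table == null);
  }

  public static void main(String[] args) {
    LazySketchTable sketch = new LazySketchTable();
    System.out.println(sketch.isNotInitialized());  // prints true
    sketch.ensureCapacity(1000);
    System.out.println(sketch.table.length);        // prints 1024
    sketch.ensureCapacity(500);                     // smaller request, table is kept
    System.out.println(sketch.table.length);        // prints 1024
  }
}
```

The cost of this design is the `isNotInitialized()` guard on every read and write, which is why removing the guard from only one benchmark variant would skew a comparison.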
Original file line number Diff line number Diff line change
Expand Up @@ -118,22 +118,25 @@ public boolean isNotInitialized() {
* @param e the element to count occurrences of
* @return the estimated number of occurrences of the element; possibly zero but never negative
*/
@SuppressWarnings("Varifier")
public @NonNegative int frequency(E e) {
if (isNotInitialized()) {
return 0;
}

int[] count = new int[4];
@Var int frequency = Integer.MAX_VALUE;
int blockHash = spread(e.hashCode());
int counterHash = rehash(blockHash);
int block = (blockHash & blockMask) << 3;
for (int i = 0; i < 4; i++) {
int h = counterHash >>> (i << 3);
int index = (h >>> 1) & 15;
int offset = h & 1;
count[i] = (int) ((table[block + offset + (i << 1)] >>> (index << 2)) & 0xfL);
int slot = block + offset + (i << 1);
int count = (int) ((table[slot] >>> (index << 2)) & 0xfL);
frequency = Math.min(frequency, count);
}
return Math.min(Math.min(count[0], count[1]), Math.min(count[2], count[3]));
return frequency;
}

/**
Expand All @@ -143,27 +146,37 @@ public boolean isNotInitialized() {
*
* @param e the element to add
*/
@SuppressWarnings("ShortCircuitBoolean")
@SuppressWarnings({"ShortCircuitBoolean", "UnnecessaryLocalVariable"})
public void increment(E e) {
if (isNotInitialized()) {
return;
}

int[] index = new int[8];
int blockHash = spread(e.hashCode());
int counterHash = rehash(blockHash);
int block = (blockHash & blockMask) << 3;
for (int i = 0; i < 4; i++) {
int h = counterHash >>> (i << 3);
index[i] = (h >>> 1) & 15;
int offset = h & 1;
index[i + 4] = block + offset + (i << 1);
}

// Loop unrolling improves throughput by 10m ops/s
int h0 = counterHash;
int h1 = counterHash >>> 8;
int h2 = counterHash >>> 16;
int h3 = counterHash >>> 24;

int index0 = (h0 >>> 1) & 15;
int index1 = (h1 >>> 1) & 15;
int index2 = (h2 >>> 1) & 15;
int index3 = (h3 >>> 1) & 15;

int slot0 = block + (h0 & 1);
int slot1 = block + (h1 & 1) + 2;
int slot2 = block + (h2 & 1) + 4;
int slot3 = block + (h3 & 1) + 6;

boolean added =
incrementAt(index[4], index[0])
| incrementAt(index[5], index[1])
| incrementAt(index[6], index[2])
| incrementAt(index[7], index[3]);
incrementAt(slot0, index0)
| incrementAt(slot1, index1)
| incrementAt(slot2, index2)
| incrementAt(slot3, index3);

if (added && (++size == sampleSize)) {
reset();
Expand Down
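
The unrolled `increment` above computes the same counter positions as the loop it replaces. The following sketch checks that equivalence on its own; the class name `UnrolledIndexing` and both method names are assumptions for illustration, and the unrolled variant returns an array only so the two results can be compared, whereas the change in this diff keeps the values in local variables.

```java
import java.util.Arrays;

// Illustrative only: compares the loop and unrolled index arithmetic from the diff.
final class UnrolledIndexing {
  /** Loop form: positions 0..3 hold the counter indexes, 4..7 hold the table slots. */
  static int[] withLoop(int block, int counterHash) {
    int[] index = new int[8];
    for (int i = 0; i < 4; i++) {
      int h = counterHash >>> (i << 3);
      index[i] = (h >>> 1) & 15;
      int offset = h & 1;
      index[i + 4] = block + offset + (i << 1);
    }
    return index;
  }

  /** Unrolled form: each byte of the hash is handled by straight-line code. */
  static int[] unrolled(int block, int counterHash) {
    int h0 = counterHash;
    int h1 = counterHash >>> 8;
    int h2 = counterHash >>> 16;
    int h3 = counterHash >>> 24;

    int index0 = (h0 >>> 1) & 15;
    int index1 = (h1 >>> 1) & 15;
    int index2 = (h2 >>> 1) & 15;
    int index3 = (h3 >>> 1) & 15;

    int slot0 = block + (h0 & 1);
    int slot1 = block + (h1 & 1) + 2;
    int slot2 = block + (h2 & 1) + 4;
    int slot3 = block + (h3 & 1) + 6;

    // Packed into an array purely so the result can be compared with withLoop.
    return new int[] {index0, index1, index2, index3, slot0, slot1, slot2, slot3};
  }

  public static void main(String[] args) {
    int block = 16;  // start of the third 8-word block
    int counterHash = 0xCAFEBABE;
    System.out.println(
        Arrays.equals(withLoop(block, counterHash), unrolled(block, counterHash)));  // prints true
  }
}
```

Each pair of slots (`block + 0/1`, `block + 2/3`, and so on) stays inside the same 8-word block, which is what keeps all four counters for a key close together in memory.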
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
[versions]
caffeine = "3.1.8"
junit = "5.11.3"
reactor = "3.6.11"
reactor = "3.7.0"
truth = "1.4.4"
versions = "0.51.0"

2 changes: 1 addition & 1 deletion examples/coalescing-bulkloader-reactor/settings.gradle.kts
@@ -1,5 +1,5 @@
plugins {
id("com.gradle.develocity") version "3.18.1"
id("com.gradle.develocity") version "3.18.2"
id("com.gradle.common-custom-user-data-gradle-plugin") version "2.0.2"
id("org.gradle.toolchains.foojay-resolver-convention") version "0.8.0"
}
2 changes: 1 addition & 1 deletion examples/graal-native/settings.gradle.kts
Expand Up @@ -5,7 +5,7 @@ pluginManagement {
}
}
plugins {
id("com.gradle.develocity") version "3.18.1"
id("com.gradle.develocity") version "3.18.2"
id("com.gradle.common-custom-user-data-gradle-plugin") version "2.0.2"
id("org.gradle.toolchains.foojay-resolver-convention") version "0.8.0"
}
4 changes: 2 additions & 2 deletions examples/hibernate/gradle/libs.versions.toml
@@ -1,7 +1,7 @@
[versions]
caffeine = "3.1.8"
h2 = "2.3.232"
hibernate = "7.0.0.Beta1"
hibernate = "7.0.0.Beta2"
junit = "5.11.3"
log4j2 = "3.0.0-beta2"
truth = "1.4.4"
Expand All @@ -16,7 +16,7 @@ hibernate-jpamodelgen = { module = "org.hibernate.orm:hibernate-jpamodelgen", ve
hibernate-hikaricp = { module = "org.hibernate.orm:hibernate-hikaricp", version.ref = "hibernate" }
junit = { module = "org.junit.jupiter:junit-jupiter", version.ref = "junit" }
log4j2-core = { module = "org.apache.logging.log4j:log4j-core", version.ref = "log4j2" }
log4j2-slf4j = { module = "org.apache.logging.log4j:log4j-slf4j-impl", version.ref = "log4j2" }
log4j2-slf4j = { module = "org.apache.logging.log4j:log4j-slf4j2-impl", version.ref = "log4j2" }
truth = { module = "com.google.truth:truth", version.ref = "truth" }

[bundles]
2 changes: 1 addition & 1 deletion examples/hibernate/settings.gradle.kts
@@ -1,5 +1,5 @@
plugins {
id("com.gradle.develocity") version "3.18.1"
id("com.gradle.develocity") version "3.18.2"
id("com.gradle.common-custom-user-data-gradle-plugin") version "2.0.2"
id("org.gradle.toolchains.foojay-resolver-convention") version "0.8.0"
}
2 changes: 1 addition & 1 deletion examples/indexable/settings.gradle.kts
@@ -1,5 +1,5 @@
plugins {
id("com.gradle.develocity") version "3.18.1"
id("com.gradle.develocity") version "3.18.2"
id("com.gradle.common-custom-user-data-gradle-plugin") version "2.0.2"
id("org.gradle.toolchains.foojay-resolver-convention") version "0.8.0"
}
2 changes: 1 addition & 1 deletion examples/resilience-failsafe/settings.gradle.kts
@@ -1,5 +1,5 @@
plugins {
id("com.gradle.develocity") version "3.18.1"
id("com.gradle.develocity") version "3.18.2"
id("com.gradle.common-custom-user-data-gradle-plugin") version "2.0.2"
id("org.gradle.toolchains.foojay-resolver-convention") version "0.8.0"
}
2 changes: 1 addition & 1 deletion examples/write-behind-rxjava/settings.gradle.kts
@@ -1,5 +1,5 @@
plugins {
id("com.gradle.develocity") version "3.18.1"
id("com.gradle.develocity") version "3.18.2"
id("com.gradle.common-custom-user-data-gradle-plugin") version "2.0.2"
id("org.gradle.toolchains.foojay-resolver-convention") version "0.8.0"
}