This repository has been archived by the owner on Jul 7, 2020. It is now read-only.

p10 > p50, p90 > p99 for a simple digest #138

Open
rodrigofarnhamsc opened this issue Jul 14, 2017 · 9 comments

Comments

@rodrigofarnhamsc

        TDigest tdigest = new TDigest(100);
        tdigest.add(0.18615591526031494);
        tdigest.add(0.4241943657398224);
        tdigest.add(0.8813006281852722);

        System.out.println("p10: " + tdigest.quantile(0.1));
        System.out.println("p50: " + tdigest.quantile(0.5));
        System.out.println("p90: " + tdigest.quantile(0.9));
        System.out.println("p95: " + tdigest.quantile(0.95));
        System.out.println("p99: " + tdigest.quantile(0.99));

The output doesn't look right:

p10: 0.35278283059597015
p50: 0.30517514050006866
p90: 1.018432506918907
p95: 0.9498665675520899
p99: 0.8950138160586358
  • p10 > p50
  • p90 > p95 > p99

Should I use https://github.com/tdunning/t-digest/blob/master/src/main/java/com/tdunning/math/stats/MergingDigest.java instead?
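
The ordering violations listed above can be detected mechanically. The following is a minimal sketch that uses only the numbers reported in this thread; the class and method names are hypothetical and it does not call any t-digest API. It returns the index of the first estimate that is smaller than its predecessor when quantiles are listed in increasing order of q.

```java
public class QuantileOrderCheck {
    /**
     * Returns the index of the first value that is smaller than the one
     * before it, or -1 if the sequence is non-decreasing.
     */
    static int firstViolation(double[] estimates) {
        for (int i = 1; i < estimates.length; i++) {
            if (estimates[i] < estimates[i - 1]) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // p10, p50, p90, p95, p99 as reported in this issue.
        double[] reported = {
            0.35278283059597015, 0.30517514050006866,
            1.018432506918907, 0.9498665675520899, 0.8950138160586358
        };
        // 1: p50 is below p10
        System.out.println(firstViolation(reported));
    }
}
```

A check like this can be dropped into a unit test for any quantile estimator, since a valid quantile function must be non-decreasing in q.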

@tdunning
Contributor

tdunning commented Jul 14, 2017 via email

@rodrigofarnhamsc
Author

Hi @tdunning, thanks for your prompt reply.
I'm using stream-2.9.5.jar

I'll look into switching to MergingDigest, but it might be tricky. I'm currently persisting TDigest bytes into a database and retrieving them when querying.

@rodrigofarnhamsc
Author

rodrigofarnhamsc commented Jul 15, 2017

So luckily it seems that TDigest is byte compatible with AVLTreeDigest. I'm using the following hack to migrate in a "backwards compatible" way:

        byte value = bytes[bytes.length - 1];

        DigestType mode = Arrays.stream(DigestType.values())
                .filter(e -> e.value == value)
                .findFirst()
                .orElse(DigestType.UNKNOWN);

        try {
            ByteBuffer bb = ByteBuffer.wrap(bytes);
            switch (mode) {
                case AVL_DIGEST:
                    return AVLTreeDigest.fromBytes(bb);
                case MERGING_DIGEST:
                    return MergingDigest.fromBytes(bb);
                case UNKNOWN:
                default:
                    // Old serialization lacking trailing byte (TDigest from stream-lib, byte compatible with AVLTreeDigest).
                    return AVLTreeDigest.fromBytes(bb);
            }
        } catch (Exception e) {
            // Old serialization lacking trailing byte (TDigest from stream-lib, byte compatible with AVLTreeDigest).
            return AVLTreeDigest.fromBytes(ByteBuffer.wrap(bytes));
        }
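
The read path above implies a matching write path that appends one type byte after the serialized digest. A minimal, library-free sketch of that tagging scheme follows. The tag values and all names are assumptions made for illustration; the actual DigestType enum is not shown in this thread.

```java
import java.util.Arrays;

public class DigestTagging {
    // Hypothetical tag values, not taken from the original DigestType enum.
    static final byte AVL_TAG = (byte) 0xA1;
    static final byte MERGING_TAG = (byte) 0xA2;

    /** Returns a copy of the serialized digest with the type tag appended. */
    static byte[] withTag(byte[] payload, byte tag) {
        byte[] out = Arrays.copyOf(payload, payload.length + 1);
        out[payload.length] = tag;
        return out;
    }

    /** Classifies stored bytes by their last byte; untagged data is UNKNOWN. */
    static String detect(byte[] stored) {
        if (stored.length == 0) {
            return "UNKNOWN";
        }
        byte last = stored[stored.length - 1];
        if (last == AVL_TAG) {
            return "AVL_DIGEST";
        }
        if (last == MERGING_TAG) {
            return "MERGING_DIGEST";
        }
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        byte[] legacy = {0, 0, 0, 2, 1, 1, 1}; // stand-in for an untagged payload
        System.out.println(detect(legacy));
        System.out.println(detect(withTag(legacy, MERGING_TAG)));
    }
}
```

One caveat with this design: an untagged legacy payload can end in any byte value, so a tag can collide with it by chance. The try/catch fallback in the snippet above is what covers that case.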

@tdunning
Contributor

tdunning commented Jul 15, 2017 via email

@tdunning
Contributor

I just added TDigestTest.testThreePointExample and it looks better than what you have. I think you have a really old version. Here is what I get with trunk:

digest: class com.tdunning.math.stats.AVLTreeDigest
p10: 0.18615591526031494
p50: 0.4241943657398224
p90: 0.8813006281852722
p95: 0.8813006281852722
p99: 0.8813006281852722

digest: class com.tdunning.math.stats.MergingDigest
p10: 0.18615591526031494
p50: 0.4241943657398224
p90: 0.8813006281852722
p95: 0.8813006281852722
p99: 0.8813006281852722

@rodrigofarnhamsc
Author

My initial results were from package com.clearspring.analytics.stream.quantile

The TDigest therein does not extend com.tdunning.math.stats.TDigest (which is a skeleton abstract class).

From inspecting the com.clearspring.analytics.stream.quantile.TDigest implementation, I guessed it was similar to com.tdunning.math.stats.AVLTreeDigest. At the very least, their serializations are compatible :)

@rodrigofarnhamsc
Author

rodrigofarnhamsc commented Jul 15, 2017

For complete clarity, this is a code excerpt showing the problem, as well as the encoded TDigest:

        com.clearspring.analytics.stream.quantile.TDigest tdigest = new com.clearspring.analytics.stream.quantile.TDigest(100);
        tdigest.add(0.18615591526031494);
        tdigest.add(0.4241943657398224);
        tdigest.add(0.8813006281852722);

        ByteBuffer bb = ByteBuffer.allocate(tdigest.byteSize());
        tdigest.asSmallBytes(bb);
        byte[] enc = Base64.getEncoder().encode(Arrays.copyOf(bb.array(), bb.position()));

        System.out.println("p10: " + tdigest.quantile(0.1));
        System.out.println("p50: " + tdigest.quantile(0.5));
        System.out.println("p90: " + tdigest.quantile(0.9));
        System.out.println("p95: " + tdigest.quantile(0.95));
        System.out.println("p99: " + tdigest.quantile(0.99));

        System.out.println();
        System.out.println(new String(enc));
        System.out.println();

        tdigest = com.clearspring.analytics.stream.quantile.TDigest.fromBytes(ByteBuffer.wrap(Base64.getDecoder().decode(enc)));
        System.out.println("p10: " + tdigest.quantile(0.1));
        System.out.println("p50: " + tdigest.quantile(0.5));
        System.out.println("p90: " + tdigest.quantile(0.9));
        System.out.println("p95: " + tdigest.quantile(0.95));
        System.out.println("p99: " + tdigest.quantile(0.99));

Output:

p10: 0.35278283059597015
p50: 0.30517514050006866
p90: 1.018432506918907
p95: 0.9498665675520899
p99: 0.8950138160586358

AAAAAkBZAAAAAAAAAAAAAz4+n6g+c8BaPuoJ1QEBAQ==

p10: 0.35278283059597015
p50: 0.30517514050006866
p90: 1.018432506918907
p95: 0.9498665675520899
p99: 0.8950138160586358
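
The Base64 payload above can be inspected without either library, which shows that the stored centroids are fine and only the quantile estimates are off. The sketch below assumes a layout inferred from the bytes of that payload rather than from any documentation: a big-endian int tag (2), a double compression, an int centroid count, one float per centroid holding the delta from the previous mean, then one varint count per centroid. Class and field names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.Base64;

public class SmallEncodingDump {
    static class Decoded {
        double compression;
        double[] means;
        int[] counts;
    }

    // Reads a little-endian base-128 varint (7 payload bits per byte).
    static int readVarint(ByteBuffer buf) {
        int b = buf.get();
        int value = b & 0x7f;
        int shift = 7;
        while ((b & 0x80) != 0) {
            b = buf.get();
            value += (b & 0x7f) << shift;
            shift += 7;
        }
        return value;
    }

    static Decoded decode(String base64) {
        ByteBuffer buf = ByteBuffer.wrap(Base64.getDecoder().decode(base64));
        int tag = buf.getInt();
        if (tag != 2) {
            throw new IllegalArgumentException("unexpected encoding tag " + tag);
        }
        Decoded d = new Decoded();
        d.compression = buf.getDouble();
        int n = buf.getInt();
        d.means = new double[n];
        d.counts = new int[n];
        double mean = 0;
        for (int i = 0; i < n; i++) {
            mean += buf.getFloat(); // means are stored as deltas
            d.means[i] = mean;
        }
        for (int i = 0; i < n; i++) {
            d.counts[i] = readVarint(buf);
        }
        return d;
    }

    public static void main(String[] args) {
        Decoded d = decode("AAAAAkBZAAAAAAAAAAAAAz4+n6g+c8BaPuoJ1QEBAQ==");
        System.out.println("compression: " + d.compression);
        System.out.println("means:  " + Arrays.toString(d.means));
        System.out.println("counts: " + Arrays.toString(d.counts));
    }
}
```

Under these assumptions the payload decodes to three centroids of count 1 whose means match the three inserted samples up to float rounding.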

@tdunning
Contributor

tdunning commented Jul 15, 2017 via email

@rodrigofarnhamsc
Author

 AAAAAkBZAAAAAAAAAAAAAz3Qbs0+UeM2PvLtoAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz8b01s+HTH4PnDQgAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz5o2zY9NFXoPWP3kAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz1Jzuw+laGKPMdGqAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz77Rmc9hLDEPh7yYAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz4yDLE+YzKpPpeAdwEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz6UTDY+E8HGPVfEmAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAzjYrWA/RXwUPYNSkwEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz2B/Pk9x6AlPj9ErwEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz9gyDQ9eUAwPEM2gAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz0oKyI+gowYPtZ15AEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz8FwOQ56XAAPoH/qAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz5bFPU96p96Pv+WDwEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz0Zun4/Hr1dPXF1DgEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz4nq44+trq9POQ04AEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz5gwOk+UeXZPo67fwEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz8l+G0+CuOAPSSUYAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAzzBHd4/Al9sPpdWhAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz4Qmw4+o/tqPwDGyAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz4ftus+OdZHPicLMQEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz8kJs49YD9gPjX/FAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz33h/Y/J4B7Pcz/NgEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz4fqdU/QLVtPRqRkAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz9W7qQ9uzXAPW4IEAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz6Fpck8NoGAPx2JcAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz6Ey+Y+oYqgPqkndgEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz5EhJ4/EXvqPkQh2AEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz4IoD4+0oItPg5eeAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz4hUyo+xiClPthRZAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz7Wrmg92XY4PS7QAAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz12V4M/NJYMPHt4zAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz5ysJU9oYnWPxUVYAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAzxVuYI9Y4OOPwR7HwEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz4fi+U+vCC8PgbnFwEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz7onhk9EA+4PpHf1gEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz49KeQ+y2haPavuOAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz6Nauc/BHRkPVltcAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz8lk5Q+hj/yPRIdkAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz5I+Dk/HqDrPV4CdAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz8drbM+B/nMPgbWBAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz2YvpM927k7PRlmYgEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAT9NC04BAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAABT4bBrI+H6YIPaq1rz7vKw09SCmgAQEBAQEAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz3ePLI+p+IkPkhsZQEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz4OHsw+fEGmPo2eQQEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz6bgX09j65EPjGtgAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz6T3KA9qbGcPu0ePwEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAzyN2lg/AkcCPslzcgEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz8BuQY+DMGEPZlfQAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz82sDg90elYPiYMfAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz58x1c/BiLuPYGzCgEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz3mSFM+q9AIPozP3wEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz5KzCw+p0QsPjb0BAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz6A8Jw8wTLAPypFagEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz6cBmE+W6HyPmc8ZAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAzy3ScE+eVGoPzBzaAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz8FeTM9U/0gPr5slgEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz6nQGk+rjcfPpYwEAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz7Frg49g28QPhoKXAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz9epRc8qNuAPRp20AEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAzyZaj0+5uN5PogKPwEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz6wjCk8lCzwPsHOigEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz232+0/HloGPdDFMAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz7D4R08VX0APWLfAAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz3VpHg+yjdXPv/vtQEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz21aDM+4FFFPf8taQEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz7xA8I+Js3APLJCwAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz8ekz09/wCAO4pYgAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz6M+Bc+w0zFPnlLngEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz6wE1o+ZDqkPmKcEAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAzzTVgM+lHxNPeXliwEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz40FGA+CuoYPyhv+gEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz2G/k8+odFoPqghMgEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz3ay7k+3jp+PmLpIAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAzxXwrE/FNC+PdgUQAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz8Guis+gpfyPdMjqAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz4SKk8/GKfpPicEKAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz8M7K8+KOdMPOccAAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
 AAAAAkBZAAAAAAAAAAAAAz2nO5o/AK14PtRRxAEBAQAAAAAAAAAAAAAAAAAAAAAAAAAAAA==

(As you can see, these are all pretty small for now; we are still ramping up traffic.)

tdunning added a commit to tdunning/t-digest that referenced this issue Aug 2, 2017
lhgravendeel added a commit to ClutchWallet/stream-lib that referenced this issue May 21, 2018