[WiP] Bitwise versions of gist_cmp & common_ancestor functions for GiST index #40

AbelVM · 2020-08-14T14:19:59Z

Dropping the calls to high level H3 functions to avoid redundant operations

This is the first time in like, 20 years I play with C, so please take it more as pseudocode than actual code. These functions have been migrated from my own PL/pgSQL functions, so I am sure I've lost a lot of C optimizations on the way or the syntax is as ugly as it can be. I am not ashamed of that ¯\(ツ)/¯, just word to the wise.

I just took into account the corner cases to avoid wasting CPU cycles and turn all the high-level H3 functions calls into simple bitwise operations that lower the redundant operations

I have benchmarked the PL/pgSQL versions of these functions and they take ~ 0.006 ms per call, avg

So, the GiST related operations should be way faster with this approach

Rebased with tag 3.6.5

No conflicts, but....

Not tested at all!!

Friday. Lunchtime. No testing environment available right now. This is just a comment to this issue that looked better as a PR 😝

update

… feature-more-operators

zachasme · 2020-08-24T08:39:04Z

Thanks for working on this @AbelVM 😃

Using the bitwise versions is definitely the way to go!

I think the failing tests are mainly due to not importing h3index.h in opclass_gist.c, and also it seems PG_RETURN_H3INDEX didn't make the rebase.

zachasme · 2020-08-24T10:22:52Z

I've rebased the feature-more-operators branch on master, which fixes some of the errors you had.

AbelVM · 2020-08-24T10:46:04Z

Rebased with your rebase 😝

zachasme · 2020-08-24T14:18:07Z

I've fixed the warnings/errors and merged your code into #42.

I've split the old branch into two, so we can get GiST working and merged before moving on to SP-GiST. We still need tests and benchmarks.

Thank you for this PR 👍

AbelVM · 2020-08-24T15:28:18Z

For reference, for testing and benchmarking my sql function (no smoke or unit tests here :) ), I created synthetic datasets with randon locations and several depths kins upwards and downwards for each location ({number_of_places} and {number_of_depths}), and then checked my custom h3_contains function

CREATE TABLE random_cells AS
WITH pre AS (
    SELECT
        t.i,
        random() * 360 - 180 AS x,
        random() * 180 - 90 AS y,
        (round(random() * (15 - 1 -{number_of_depths})) + {number_of_depths}))::int AS r
    FROM
        generate_series(1, {number_of_places}) AS t (i)
),
a AS (
    SELECT
        h3_geo_to_h3 (st_setsrid (st_point (x, y), 4326), r) AS h3_id,
        i::text AS place,
        r
    FROM
        pre
)
SELECT
    *
FROM
    a
UNION
SELECT
    h3_to_parent (h3_id, r - {number_of_depths}) AS h3_id,
    place,
    r - {number_of_depths} AS r
FROM
    a
UNION
SELECT
    h3_to_children (h3_id, r + {number_of_depths}) AS h3_id,
    place,
    r + {number_of_depths} AS r
FROM
    a;

-- GiST-indexing stuff  should be here:)

-- Validity check and benchmark
SELECT
    c1.place || '-' || c2.place AS a_b,
    c1.r || '-' || c2.r AS resa_resb,
    h3_contains (c1.h3_id, c2.h3_id) AS a_contains_b
FROM
    b AS c1
CROSS JOIN 
    b AS c2
ORDER BY
    1,2;

My results, time in ms

But to test that the index actually kicks in, I'd go with a big dataset. Something like, using HD geometries

[dataset A] polyfilling a country at high resolution
indexing A and so :)
[dataset B] polyfilling a province of the same country at same resolution and then h3_compact the results
find the cells in A contained in the cells of B

Results validity can be easily verified with QGIS and the query time should be, imho, in the same or lower order than ST_Intersection(country, province)

zachasme and others added 5 commits October 15, 2019 12:52

WIP: GiST index

6c9acd7

Merge pull request #1 from bytesandbrains/master

fa61641

update

WIP: GiST index

78e3a1c

Merge branch 'feature-more-operators' of github.com:AbelVM/h3-pg into…

c610309

… feature-more-operators

bitwise versions of gist_cmp & common_ancestor

8adf782

AbelVM mentioned this pull request Aug 14, 2020

Add GiST and SP-GiST support #5

Closed

zachasme force-pushed the feature-more-operators branch from 6c9acd7 to 6db18d2 Compare August 24, 2020 09:43

rebased

d814c33

zachasme mentioned this pull request Aug 24, 2020

GiST support #42

Draft

zachasme closed this Aug 24, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WiP] Bitwise versions of gist_cmp & common_ancestor functions for GiST index #40

[WiP] Bitwise versions of gist_cmp & common_ancestor functions for GiST index #40

AbelVM commented Aug 14, 2020

zachasme commented Aug 24, 2020

zachasme commented Aug 24, 2020

AbelVM commented Aug 24, 2020

zachasme commented Aug 24, 2020

AbelVM commented Aug 24, 2020

[WiP] Bitwise versions of gist_cmp & common_ancestor functions for GiST index #40

[WiP] Bitwise versions of gist_cmp & common_ancestor functions for GiST index #40

Conversation

AbelVM commented Aug 14, 2020

Dropping the calls to high level H3 functions to avoid redundant operations

Rebased with tag 3.6.5

Not tested at all!!

zachasme commented Aug 24, 2020

zachasme commented Aug 24, 2020

AbelVM commented Aug 24, 2020

zachasme commented Aug 24, 2020

AbelVM commented Aug 24, 2020