Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Solve perf issue serializing large collections #239

Merged
merged 1 commit into from
Jan 29, 2024

Conversation

DavyLandman
Copy link
Member

We have a stackless structured visitor that doesn't require large call stacks to deal with deeply nested ASTs.

While this worked great for deep ASTs, it doesn't work great for big flat collections (aka wide).

Big collections would require the same amount of memory, just to prepare the stack with all the entries. Now we have iterating entries on the stack, that can be "returned to" multiple times.

Note, I had to change the tests, as they assumed reverse orders for sets & maps in the visitor, which was not an actual requirement.

@DavyLandman DavyLandman force-pushed the faster-serialize-large-collections branch from be73eff to ab905ed Compare January 27, 2024 09:00
Copy link

github-actions bot commented Jan 27, 2024

Test Results

     64 files   -      32       64 suites   - 32   4m 5s ⏱️ - 2m 28s
242 291 tests ±      0  242 290 ✅ ±      0  1 💤 ±0  0 ❌ ±0 
484 640 runs   - 242 320  484 638 ✅  - 242 319  2 💤  - 1  0 ❌ ±0 

Results for commit 3876e84. ± Comparison against base commit 8f94c94.

♻️ This comment has been updated with latest results.

@DavyLandman DavyLandman force-pushed the faster-serialize-large-collections branch from ab905ed to d1102a2 Compare January 27, 2024 11:17
We have a stackless structured visitor that doesn't require large call stacks to deal with deeply nested ASTs.

While this worked great for deep ASTs, it doesn't work great for big flat collections (aka wide).

Big collections would require the same amount of memory, just to prepare the stack with all the entries.
Now we have iterating entries on the stack, that can be "returned to" multiple times.
@DavyLandman DavyLandman force-pushed the faster-serialize-large-collections branch from d1102a2 to 3876e84 Compare January 28, 2024 10:01
@jurgenvinju
Copy link
Member

Ok nice. I was halfway something very similar; you beat me to it ♥️

@jurgenvinju
Copy link
Member

looks great

@jurgenvinju jurgenvinju merged commit 5b9c1db into main Jan 29, 2024
8 of 9 checks passed
@jurgenvinju jurgenvinju deleted the faster-serialize-large-collections branch January 29, 2024 08:58
@DavyLandman
Copy link
Member Author

thanks, the solution came to me the middle of brushing my teeth, then it took 30min to implement it.

I'll add synthetic benchmark results soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants