perf: Preallocate records #1185

Merged
merged 2 commits into from
Jan 8, 2025

Contributor

@zlangley zlangley commented Jan 7, 2025

No description provided.

@zlangley zlangley marked this pull request as ready for review January 7, 2025 18:52

 }

 impl Default for MemoryConfig {
     fn default() -> Self {
-        Self::new(29, 1, 29, 29, 17, 64)
+        Self::new(29, 1, 29, 29, 17, 64, 1 << 24)
Contributor

2^24 seems pretty high, maybe just 2^20?

Contributor Author

This is for memory. If a chip has up to 2^20 records and we have dozens of chips and each record does several accesses, then 2^24 might still be small?
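
For a rough sense of scale, the estimate in this comment can be worked through as in the sketch below. The chip count (24) and accesses per record (3) used in the note after the code are illustrative assumptions standing in for "dozens" and "several"; only the 2^20 and 2^24 figures come from this thread, and neither function exists in the codebase.

```rust
/// Upper bound on memory accesses if every chip fills its record buffer.
/// All inputs are hypothetical; none are read from the real configuration.
fn estimated_accesses(records_per_chip: u64, chips: u64, accesses_per_record: u64) -> u64 {
    records_per_chip * chips * accesses_per_record
}

/// How many times larger the estimate is than a given capacity (rounded down).
fn overshoot_factor(estimate: u64, capacity: u64) -> u64 {
    estimate / capacity
}
```

With those assumed inputs, `estimated_accesses(1 << 20, 24, 3)` is 75,497,472, which is about 4.5 times `1 << 24` (16,777,216), so a 2^24 default could plausibly still be exceeded on a large run.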

Contributor

can you lmk the execution performance diff (via criterion bench) between 2^20 and 2^24, just for a sense?

Contributor Author

from 1 << 20 to 1 << 24 (regex_execute bench):

regex/execute           time:   [377.90 ms 391.66 ms 405.92 ms]
                        change: [-11.309% -6.3323% -0.8434%] (p = 0.05 < 0.05)
                        Change within noise threshold.

on the border of statistical significance

@@ -201,7 +201,7 @@ where
         Self {
             adapter,
             core,
-            records: vec![],
+            records: Vec::with_capacity(1 << 20),
Contributor

this should use access_capacity

Contributor Author

@zlangley zlangley Jan 8, 2025

Do you have any suggestions for how to access config from here?

(Also I think this is a different parameter than access_capacity; access_capacity is the number of memory record IDs, which should be several factors larger than any one chip's records.len(), both because every chip contributes to memory records and because each chip record corresponds to several memory records.)

Contributor

I guess we could add it as a field in OfflineMemory...

Contributor Author

It seems weird because this is more about the number of instructions executed by any one chip rather than anything memory-related.

Contributor

ok we can address later then

Contributor

at least make it a private constant
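
A minimal sketch of what this suggestion could look like, assuming a simplified stand-in for the chip type. The names `DEFAULT_RECORDS_CAPACITY` and `ChipSketch` are hypothetical and do not come from the PR; only `Vec::with_capacity(1 << 20)` is taken from the diff above.

```rust
/// Hypothetical name; the thread does not say what the constant was called.
/// Kept private to the module, as the review suggests.
const DEFAULT_RECORDS_CAPACITY: usize = 1 << 20;

/// Stand-in for a chip that buffers one record per executed instruction.
struct ChipSketch<R> {
    records: Vec<R>,
}

impl<R> ChipSketch<R> {
    fn new() -> Self {
        Self {
            // Reserve space up front so pushes below this count do not reallocate.
            records: Vec::with_capacity(DEFAULT_RECORDS_CAPACITY),
        }
    }

    fn push(&mut self, record: R) {
        self.records.push(record);
    }
}
```

`Vec::with_capacity` guarantees a capacity of at least the requested size, so pushing fewer than `DEFAULT_RECORDS_CAPACITY` records leaves the buffer's capacity unchanged. Naming the value also gives a single place to attach a TODO about making it configurable later.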

Contributor

@jonathanpwang jonathanpwang left a comment

LGTM, but leave some TODOs and also move all unavoidable constants into consts

github-actions bot commented Jan 8, 2025

| group | app.proof_time_ms | app.cycles | app.cells_used | leaf.proof_time_ms | leaf.cycles | leaf.cells_used |
| --- | --- | --- | --- | --- | --- | --- |
| verify_fibair | (-2510 [-64.1%]) 1,405 | (-551287 [-73.9%]) 195,205 | (-21986042 [-73.3%]) 8,028,582 | - | - | - |
| fibonacci_program | 6,155 | 1,500,137 | 51,505,102 | - | - | - |
| regex_program | (+263 [+1.4%]) 18,965 | 4,190,904 | 165,028,173 | - | - | - |
| ecrecover_program | (-32 [-1.2%]) 2,592 | 285,169 | 15,074,875 | - | - | - |

Commit: 6471801

Benchmark Workflow

@zlangley zlangley merged commit d5b52a2 into main Jan 8, 2025
22 checks passed
@zlangley zlangley deleted the perf/preallocate branch January 8, 2025 19:28