
fix: universal compaction condition #97

Merged 3 commits into main on Nov 13, 2024

Conversation

@skyzh (Owner) commented on Oct 1, 2024

close #96

the original condition was not quite correct and would produce larger write amplification, so this PR makes it exactly the same as RocksDB's. TODO: fix docs and tests.
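
For reference, a rough sketch (not the code in this PR) of the space-amplification trigger as described in the RocksDB universal compaction docs; the option name max_size_amplification_percent follows RocksDB, while the function signature is illustrative only:

```rust
// `tier_sizes` are the sorted-run sizes ordered from newest to oldest, so the
// last element is the bottom-most (oldest, usually largest) sorted run.
// Sketch only; not the actual mini-lsm implementation.
fn space_amp_triggered(tier_sizes: &[u64], max_size_amplification_percent: u64) -> bool {
    let Some((&last, rest)) = tier_sizes.split_last() else {
        return false;
    };
    if last == 0 {
        return false;
    }
    let all_but_last: u64 = rest.iter().sum();
    // estimated space amplification is size(all but last) / size(last);
    // trigger a full compaction once it reaches the configured percentage
    all_but_last * 100 >= last * max_size_amplification_percent
}
```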

@skyzh mentioned this pull request on Oct 1, 2024
@silver-ymz

I have another question. In the Reduce Sorted Runs part, RocksDB compacts at most max_merge_width tiers into one tier (docs and example). This also seems to reduce write amplification significantly.
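
For illustration, a sketch (not RocksDB's or mini-lsm's actual code) of how the reduce-sorted-runs step can cap the number of tiers merged at once with max_merge_width while still respecting the size-ratio rule; all names here are illustrative:

```rust
// Decide how many tiers (from the newest) to compact into one in the
// "reduce sorted runs" step. `tier_sizes` is ordered from newest to oldest.
// Sketch only: extend the candidate set while the size-ratio rule holds, and
// never merge more than `max_merge_width` tiers in a single compaction.
fn tiers_to_merge(tier_sizes: &[u64], size_ratio: u64, max_merge_width: usize) -> usize {
    if tier_sizes.is_empty() {
        return 0;
    }
    let mut take = 1;
    let mut total = tier_sizes[0];
    while take < tier_sizes.len() && take < max_merge_width {
        let next = tier_sizes[take];
        // stop once the next tier is larger than (100 + size_ratio)% of the
        // accumulated candidate size
        if next * 100 > total * (100 + size_ratio) {
            break;
        }
        total += next;
        take += 1;
    }
    take
}
```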

cargo run --bin compaction-simulator tiered --dump-real-id --num-tiers 8 --iterations 200 (parameters same as RocksDB):

this PR:

=== Iteration 199 ===
--- After Compaction ---
L1716 (42): [...]
L808 (6): [808, 809, 810, 811, 812, 813]
L787 (10): [787, 788, 789, 790, 791, 792, 793, 794, 795, 796]
L753 (15): [...]
L703 (21): [...]
L634 (28): [...]
L501 (78): [...]
--- Statistics ---
Write Amplification: 1757/200=8.785x
Maximum Space Usage: 242/200=1.210x
Read Amplification: 7x

Full Output: pr.txt

my implementation (similar to RocksDB):

=== Iteration 199 ===
--- After Flush ---
L753 (1): [753]
L748 (5): [748, 749, 750, 751, 752]
L722 (21): [...]
L653 (28): [...]
L453 (145): [...]
--- Statistics ---
Write Amplification: 753/200=3.765x
Maximum Space Usage: 290/200=1.450x
Read Amplification: 5x

Full Output: self.txt

@skyzh (Owner, Author) commented on Nov 12, 2024

The reason I didn't merge this patch is that I don't fully understand why RocksDB produces that sequence -- I think we got the first few lines correct, but looking at the example,

https://github.com/facebook/rocksdb/wiki/Universal-Style-Compaction-Example

it allows > 8 tiers

1 2 3 4 5 6 7 55   (after flush)
1 1 2 3 4 5 6 7 55   (after flush)
29 55   (after compaction)
1 29 55   (after flush)

before doing the compaction, which gives 29 (1+1+2+3+4+5+6+7); while in my implementation, it gives 28 (1+2+3+4+5+6+7)

Looking at https://github.com/facebook/rocksdb/wiki/universal-compaction,

1 16  (no compaction triggered)
1 1 16  (no compaction triggered)
1 1 1 16  (no compaction triggered)
1 1 1 1 16  => 4 16
1 4 16  (no compaction triggered)
1 1 4 16  (no compaction triggered)
1 1 1 4 16 => 3 4 16
1 3 4 16  (no compaction triggered)
1 1 3 4 16 => 2 3 4 16
1 2 3 4 16  (no compaction triggered)
1 1 2 3 4 16 => 11 16

it also indicates the same behavior -- even if compaction triggers at 5 tiers, the number of tiers can stay at 5 under some conditions (for example, 1 2 3 4 16 above has 5 sorted runs with no compaction triggered).

so I'd like to reproduce their algorithm and check whether it reduces write amplification (my intuition is yes, because they do fewer compactions).
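
To make the last line of that example concrete, here is a small self-contained trace of the size-ratio rule as described in the RocksDB docs, with size_ratio = 1 (i.e. 1% tolerance); the code is an illustration, not the simulator's implementation:

```rust
fn main() {
    // sorted runs from the example line "1 1 2 3 4 16 => 11 16", newest first
    let runs = [1u64, 1, 2, 3, 4, 16];
    let size_ratio = 1u64; // RocksDB default, in percent

    let mut total = runs[0];
    let mut take = 1;
    while take < runs.len() {
        let next = runs[take];
        // include the next run only while it stays within (100 + size_ratio)%
        // of the candidate set's accumulated size
        if next * 100 > total * (100 + size_ratio) {
            break;
        }
        total += next;
        take += 1;
    }

    // 1+1+2+3+4 = 11, and 16 > 11 * 1.01, so only the first five runs are
    // merged, producing "11 16"
    println!("merge the first {take} runs into one run of size {total}");
}
```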

@skyzh (Owner, Author) commented on Nov 13, 2024

after looking at it for a while, I assume it's a typo; I suppose their example should have been generated with some tool, but I couldn't figure out a reasonable logic that produces it 🤪

@skyzh (Owner, Author) commented on Nov 13, 2024

I've updated my implementation to take max_merge_width into account when reducing sorted runs, and I'll merge the patch; I'll update the docs in another patch. Thanks for the suggestion anyway :)

Signed-off-by: Alex Chi <[email protected]>
@skyzh merged commit 9cca6ad into main on Nov 13, 2024
1 check passed
@skyzh deleted the skyzh/fix-universal-compaction branch on Nov 13, 2024 at 02:27
@silver-ymz

> The reason I didn't merge this patch is that I don't fully understand why RocksDB produces that sequence -- I think we got the first few lines correct, but looking at the example,

I also guess it's a typo. I asked about it on their mail channel, but got no response.

Merging this pull request may close: document error in week2 day3