optimize LFU #438

Merged
merged 8 commits into from
Nov 8, 2023

Conversation

bitfaster
Owner

@bitfaster bitfaster commented Oct 23, 2023

Optimizations:

  • Successive buffer reads/writes don't need two volatile ops. They can piggyback.
  • Closely match Caffeine volatile operations with regard to drain status.
  • Inline drain status
  • Inline EvictIterator

This is definitely faster, but less stable during the benchmark run, fluctuating between 23 ns/op and 33 ns/op. The gain is most consistent on evict throughput.
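
As a rough illustration of the first optimization listed above, here is a minimal sketch of how a plain read or write can piggyback on a neighbouring acquire/release operation instead of paying for two volatile ops. It is written in Java because the PR mirrors Caffeine, and it is not the code from this PR: the class `PiggybackBuffer` and its methods `tryAdd` and `tryTake` are hypothetical names, and the buffer is assumed to have exactly one producer thread and one consumer thread. `AtomicLong.getPlain`, `getAcquire` and `setRelease` require Java 9 or later.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical single-producer/single-consumer ring buffer (illustration only). */
final class PiggybackBuffer<T> {
    private final Object[] slots;
    private final int mask;
    private final AtomicLong head = new AtomicLong(); // next index to read
    private final AtomicLong tail = new AtomicLong(); // next index to write

    PiggybackBuffer(int capacity) {
        if (capacity <= 0 || Integer.bitCount(capacity) != 1) {
            throw new IllegalArgumentException("capacity must be a power of two");
        }
        slots = new Object[capacity];
        mask = capacity - 1;
    }

    /** Call from the producer thread only. Returns false when the buffer is full. */
    boolean tryAdd(T item) {
        long t = tail.getPlain();   // plain read: only the producer ever writes tail
        long h = head.getAcquire(); // acquire: observe slots the consumer has freed
        if (t - h == slots.length) {
            return false;
        }
        slots[(int) t & mask] = item; // plain store, no fence of its own
        tail.setRelease(t + 1);       // one release store publishes the slot write too
        return true;
    }

    /** Call from the consumer thread only. Returns null when the buffer is empty. */
    @SuppressWarnings("unchecked")
    T tryTake() {
        long h = head.getPlain();   // plain read: only the consumer ever writes head
        long t = tail.getAcquire(); // acquire: pairs with the producer's release store
        if (h == t) {
            return null;
        }
        int index = (int) h & mask;
        T item = (T) slots[index];  // plain load, ordered after the acquire above
        slots[index] = null;        // plain store, published by the release below
        head.setRelease(h + 1);
        return item;
    }
}
```

Each operation issues one acquire and one release. The slot access in between needs no fence of its own because it is ordered by the neighbouring operation on `head` or `tail`. In C#, the corresponding primitives would presumably be `Volatile.Read` and `Volatile.Write`.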

image
image

@coveralls

coveralls commented Oct 23, 2023

Coverage: 98.388% (remained the same) when pulling f638359 on users/alexpeck/lfuperf into 4d52e6f on main.

@bitfaster
Owner Author

| Method | Runtime | Mean | Error | StdDev | Ratio | Code Size | Allocated |
|---|---|---:|---:|---:|---:|---:|---:|
| ConcurrentDictionary | .NET 6.0 | 7.316 ns | 0.1773 ns | 0.2111 ns | 1.00 | 1,523 B | - |
| FastConcurrentLru | .NET 6.0 | 9.257 ns | 0.2104 ns | 0.2583 ns | 1.27 | 7,235 B | - |
| ConcurrentLru | .NET 6.0 | 15.503 ns | 0.3292 ns | 0.3522 ns | 2.12 | 7,482 B | - |
| AtomicFastLru | .NET 6.0 | 21.149 ns | 0.4414 ns | 0.4533 ns | 2.90 | NA | - |
| FastConcurrentTLru | .NET 6.0 | 11.973 ns | 0.1564 ns | 0.1387 ns | 1.63 | 6,394 B | - |
| ConcurrentTLru | .NET 6.0 | 16.881 ns | 0.3086 ns | 0.2577 ns | 2.31 | 7,912 B | - |
| ConcurrentLfu | .NET 6.0 | 26.261 ns | 0.5480 ns | 1.0818 ns | 3.56 | NA | - |
| ClassicLru | .NET 6.0 | 44.895 ns | 0.6735 ns | 0.6300 ns | 6.14 | NA | - |
| RuntimeMemoryCacheGet | .NET 6.0 | 130.571 ns | 2.6094 ns | 4.7715 ns | 17.89 | 49 B | 32 B |
| ExtensionsMemoryCacheGet | .NET 6.0 | 50.109 ns | 0.9074 ns | 0.8488 ns | 6.86 | 78 B | 24 B |
| ConcurrentDictionary | .NET Framework 4.8 | 14.423 ns | 0.3010 ns | 0.3091 ns | 1.00 | 4,127 B | - |
| FastConcurrentLru | .NET Framework 4.8 | 15.579 ns | 0.3166 ns | 0.2962 ns | 1.08 | 23,882 B | - |
| ConcurrentLru | .NET Framework 4.8 | 21.279 ns | 0.3536 ns | 0.3307 ns | 1.48 | 24,186 B | - |
| AtomicFastLru | .NET Framework 4.8 | 32.816 ns | 0.6822 ns | 0.7006 ns | 2.28 | 358 B | - |
| FastConcurrentTLru | .NET Framework 4.8 | 48.056 ns | 0.8575 ns | 0.9875 ns | 3.33 | 24,154 B | - |
| ConcurrentTLru | .NET Framework 4.8 | 49.802 ns | 0.5524 ns | 0.4897 ns | 3.46 | 24,506 B | - |
| ConcurrentLfu | .NET Framework 4.8 | 59.694 ns | 0.8527 ns | 0.7976 ns | 4.15 | NA | - |
| ClassicLru | .NET Framework 4.8 | 52.933 ns | 0.7984 ns | 0.7468 ns | 3.68 | NA | - |
| RuntimeMemoryCacheGet | .NET Framework 4.8 | 290.286 ns | 4.8405 ns | 4.5278 ns | 20.17 | 33 B | 32 B |
| ExtensionsMemoryCacheGet | .NET Framework 4.8 | 96.315 ns | 1.1317 ns | 1.0032 ns | 6.68 | 82 B | 24 B |

@bitfaster bitfaster marked this pull request as ready for review October 24, 2023 00:26
@bitfaster bitfaster changed the title from "remove Lfu half fence reads/writes" to "optimize LFU" Nov 2, 2023
@bitfaster
Owner Author

| Method | Runtime | Mean | Error | StdDev | Ratio | Code Size | Allocated |
|---|---|---:|---:|---:|---:|---:|---:|
| ConcurrentDictionary | .NET 6.0 | 7.069 ns | 0.0174 ns | 0.0154 ns | 1.00 | 1,523 B | - |
| FastConcurrentLru | .NET 6.0 | 8.107 ns | 0.0165 ns | 0.0154 ns | 1.15 | 7,235 B | - |
| ConcurrentLru | .NET 6.0 | 14.603 ns | 0.0474 ns | 0.0420 ns | 2.07 | 7,482 B | - |
| AtomicFastLru | .NET 6.0 | 20.145 ns | 0.0250 ns | 0.0222 ns | 2.85 | NA | - |
| FastConcurrentTLru | .NET 6.0 | 11.630 ns | 0.0530 ns | 0.0470 ns | 1.65 | 7,648 B | - |
| ConcurrentTLru | .NET 6.0 | 16.265 ns | 0.1540 ns | 0.1441 ns | 2.30 | 7,912 B | - |
| ConcurrentLfu | .NET 6.0 | 27.959 ns | 1.0704 ns | 3.1562 ns | 4.34 | NA | - |
| ClassicLru | .NET 6.0 | 43.313 ns | 0.1050 ns | 0.0982 ns | 6.13 | NA | - |
| RuntimeMemoryCacheGet | .NET 6.0 | 122.464 ns | 0.5272 ns | 0.4932 ns | 17.32 | 49 B | 32 B |
| ExtensionsMemoryCacheGet | .NET 6.0 | 53.327 ns | 0.5192 ns | 0.4857 ns | 7.54 | 78 B | 24 B |
| ConcurrentDictionary | .NET Framework 4.8 | 13.612 ns | 0.0637 ns | 0.0596 ns | 1.00 | 4,127 B | - |
| FastConcurrentLru | .NET Framework 4.8 | 14.711 ns | 0.0511 ns | 0.0426 ns | 1.08 | 23,882 B | - |
| ConcurrentLru | .NET Framework 4.8 | 18.763 ns | 0.0454 ns | 0.0379 ns | 1.38 | 24,186 B | - |
| AtomicFastLru | .NET Framework 4.8 | 31.132 ns | 0.2244 ns | 0.1990 ns | 2.29 | 358 B | - |
| FastConcurrentTLru | .NET Framework 4.8 | 46.537 ns | 0.9263 ns | 0.8665 ns | 3.42 | 24,154 B | - |
| ConcurrentTLru | .NET Framework 4.8 | 49.761 ns | 0.0502 ns | 0.0469 ns | 3.66 | 24,506 B | - |
| ConcurrentLfu | .NET Framework 4.8 | 57.914 ns | 1.1592 ns | 1.6625 ns | 4.22 | NA | - |
| ClassicLru | .NET Framework 4.8 | 52.861 ns | 0.5435 ns | 0.4818 ns | 3.88 | NA | - |
| RuntimeMemoryCacheGet | .NET Framework 4.8 | 284.650 ns | 1.8158 ns | 1.6985 ns | 20.91 | 33 B | 32 B |
| ExtensionsMemoryCacheGet | .NET Framework 4.8 | 90.598 ns | 0.1461 ns | 0.1367 ns | 6.66 | 82 B | 24 B |

@bitfaster
Owner Author

bitfaster commented Nov 8, 2023

| Method | Runtime | Mean | Error | StdDev | Ratio | Code Size | Allocated |
|---|---|---:|---:|---:|---:|---:|---:|
| ConcurrentDictionary | .NET 6.0 | 6.853 ns | 0.0829 ns | 0.0775 ns | 1.00 | 1,523 B | - |
| FastConcurrentLru | .NET 6.0 | 8.164 ns | 0.0279 ns | 0.0247 ns | 1.19 | 7,235 B | - |
| ConcurrentLru | .NET 6.0 | 14.654 ns | 0.0570 ns | 0.0476 ns | 2.14 | 7,482 B | - |
| AtomicFastLru | .NET 6.0 | 19.622 ns | 0.0529 ns | 0.0441 ns | 2.87 | NA | - |
| FastConcurrentTLru | .NET 6.0 | 11.401 ns | 0.0637 ns | 0.0596 ns | 1.66 | 7,648 B | - |
| ConcurrentTLru | .NET 6.0 | 16.166 ns | 0.0323 ns | 0.0270 ns | 2.36 | 7,912 B | - |
| ConcurrentLfu | .NET 6.0 | 26.350 ns | 0.9923 ns | 2.9257 ns | 4.45 | NA | - |
| ClassicLru | .NET 6.0 | 44.174 ns | 0.7162 ns | 0.6348 ns | 6.44 | NA | - |
| RuntimeMemoryCacheGet | .NET 6.0 | 118.433 ns | 2.4028 ns | 4.0801 ns | 17.73 | 49 B | 32 B |
| ExtensionsMemoryCacheGet | .NET 6.0 | 57.510 ns | 0.2869 ns | 0.2396 ns | 8.40 | 78 B | 24 B |
| ConcurrentDictionary | .NET Framework 4.8 | 13.544 ns | 0.0377 ns | 0.0315 ns | 1.00 | 4,127 B | - |
| FastConcurrentLru | .NET Framework 4.8 | 15.409 ns | 0.0900 ns | 0.0752 ns | 1.14 | 23,882 B | - |
| ConcurrentLru | .NET Framework 4.8 | 19.143 ns | 0.0659 ns | 0.0584 ns | 1.41 | 24,186 B | - |
| AtomicFastLru | .NET Framework 4.8 | 29.700 ns | 0.1322 ns | 0.1236 ns | 2.19 | 358 B | - |
| FastConcurrentTLru | .NET Framework 4.8 | 46.632 ns | 0.9553 ns | 1.2754 ns | 3.43 | 24,154 B | - |
| ConcurrentTLru | .NET Framework 4.8 | 50.217 ns | 0.0861 ns | 0.0763 ns | 3.71 | 24,506 B | - |
| ConcurrentLfu | .NET Framework 4.8 | 59.962 ns | 1.1922 ns | 1.2757 ns | 4.40 | NA | - |
| ClassicLru | .NET Framework 4.8 | 51.852 ns | 0.1799 ns | 0.1595 ns | 3.83 | NA | - |
| RuntimeMemoryCacheGet | .NET Framework 4.8 | 289.248 ns | 2.0524 ns | 1.8194 ns | 21.37 | 33 B | 32 B |
| ExtensionsMemoryCacheGet | .NET Framework 4.8 | 92.017 ns | 0.4315 ns | 0.3603 ns | 6.79 | 82 B | 24 B |

This is probably 23 ns/op, which is 2-3 ns/op faster than seen previously, but it does not run consistently enough for the overall benchmark to produce such a result.

image

...

image

@bitfaster bitfaster merged commit a99aa78 into main Nov 8, 2023
14 checks passed
@bitfaster bitfaster deleted the users/alexpeck/lfuperf branch November 8, 2023 04:14