-
Hi Caffeine Developer(s), A quick question regarding the TinyLFU implementation: I see from the TinyLFU paper that upon an increment, only the minimal counters are incremented and the rest are untouched (Section 3.2 of https://arxiv.org/pdf/1512.00727.pdf). In the Caffeine implementation, I cannot see where this happens. It seems like we are incrementing all 4 counters by calling Please correct me if I missed something. Thank you, |
Beta Was this translation helpful? Give feedback.
Replies: 1 comment
-
I believe @gilga1983 intended to use that scheme in his The simulator has an option to evaluate that approach, as well as a perfect histogram. In my analysis it did not make a noticeable improvement to the hit rate (noise) so the small cost was not worthwhile. You are welcome to perform a fresh analysis to see the impact. caffeine/simulator/src/main/resources/reference.conf Lines 179 to 185 in 3f4c159 My conclusion was that it is an excellent idea when the sketch's accuracy is of significant importance for application correctness. A cache is best effort to improve performance, where TinyLFU only needs to consider relative value rather than report statistics externally. If both candidates are poor than a slightly less accurate choice still evicts an item that has a low probability for reuse, and similarly if both are very popular than they were both likely to be worth keeping. At very similar frequencies it is a toss-up, so Gil's insight was actually to separate the wheat from the chaff by retaining the heavy hitters instead of cache pollutants. Therefore a small accuracy improvement won't dramatically change our decisions and hit rates over a long run since we care about large difference in frequencies, halve the counters periodically, use small saturating counters, and recency ordering to pick the best victims. Later on we introduced an hash flooding protection and an adaptive window, which would further cloud a straightforward analysis of whether improved accuracy could be beneficial. You are welcome to revisit this and suggest improvements, there is always something that could be done better! |
Beta Was this translation helpful? Give feedback.
I believe @gilga1983 intended to use that scheme in his
GuessingBloomFilter
, butNeedToIncrement
is alwaystrue
(sources). It was likely mentioned for background material and to help alleviate fears about a sketch's accuracy. As readers might take their new knowledge and apply it elsewhere, awareness of this boosting mechanism would avoid them suffering lower quality if their use-cases were less tolerant to error.The simulator has an option to evaluate that approach, as well as a perfect histogram. In my analysis it did not make a noticeable improvement to the hit rate (noise) so the small cost was not worthwhile. You are welcome to perform a fresh analysis to see the impact.
caffeine…