-
Notifications
You must be signed in to change notification settings - Fork 695
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[NEW] New data structure for TTL #990
Comments
I also think it's worth comparing this against trying to just trying to simplify the expire dictionary. If we embed the TTL and key directly into the robj, we can have the expire dictionary become a hashset of all the items with an expiration. The hash would still be computed by the key. It's the same number of pointer dereferences to get to the TTL. |
Exactly Expire is a kvstore just like keys, so we change kvstore to use the swede table (sweet table?) at once so we don't have to duplicate kvstore. |
I would like to raise an alternative option to embedding the TTL in robj. At high level, Since general items are usually sds we can benefit from keeping an optional TTL embedded inside sds instead of robj.
@madolson I agree that optimizing the expiry algorithm to iterate sorted items might not be that important for active expiry implementation. I was thinking of using some compressed bitmap implementation as a potential candidate for marking hash buckets/chunks with items holding TTL but still at a very early stage of investigations. |
Yes Ran, this is possible. I'll make sure the abstractions are made in a way that allows changing this part. The first big thing is to replace the dict with a hashset of "valkey" objects. (A valkey object is an robj with a key, either embedded or at least with a key pointer embedded.) I would prefer that we don't do this TTL-in-sds with the current dict, because it would give me a mega-conflict. :) |
@madolson / @zuiderkwast just to circle to the top suggestion and clarify your point here: You meant when items are deleted OR added we will also have to update all the nodes which share the node right? so that would be updating at least d keys in the main dictionary to point to a new segment (given d=127 (128 degree -1))?
I am not sure if you meant that the inner nodes in the B-tree are different than the leaf nodes? I think that maybe in order to minimize the main dictionary updates we can use B+ tree so that the inner nodes will only keep 8 bytes TTL bytes and split/merges will not require main dictionary updates when internal nodes are split or merged. |
Interesting. I'm not that familiar with B+ trees and where you got 42 bits from. I'd like to know more about it. |
Yeh At some point I thought we might use less bits (like we did in LRU) and keep some reference time when the server starts. But that can be discussed at another time (currently changes the comment before to mark 8 bytes) |
Originally posted by @madolson in #169 (comment)
I'm not convinced we want a RAX, I was thinking about something more just like a B tree that was more purpose built for TTLs. My thought was each node in the tree would have four pointers to different sub-elements. The sub-elements are either TTL segments (255 pointers to Valkey objects + metadata) or more nodes. We split nodes as they fill up. Once the nodes get to their max granularity (48 bits of TTL precision) we start chaining segments together.
Segments are limited to 255 items, so that we can keep a reference to them in the valkey object (48 bit pointer + 8 bits to lookup into the segment) so that we can find the element. We have 8 extra bits for future use. When we insert more items into the same segment, we may eventually need to split them, so this bounds the overhead spent rehashing these segments.
We are always keeping track of the node and segment with the nearest expiration, so we can semi-efficiently scan for items that are in the nearest segment. The more items in Valkey, the tighter the bound over the TTL as the segments will get more precise.
Volatile TTL will work well, as we will have the nearest buck to evict from. Volatile LRU and volatile random will work less well. We might consider just picking a random segment and evicting a random item or the item with the worst LRU.
My slightly more detailed idea:
The text was updated successfully, but these errors were encountered: