Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix redis expiries #93

Open
lajarre opened this issue Jul 28, 2022 · 7 comments · Fixed by #104
Open

Fix redis expiries #93

lajarre opened this issue Jul 28, 2022 · 7 comments · Fixed by #104
Labels
bug Something isn't working

Comments

@lajarre
Copy link
Member

lajarre commented Jul 28, 2022

We have the following problems:

  • treasuries is often evicted because of memory limits and we are just not making use of it.
  • whitelist is often evicted and thus whitelists are probably fetch too many times for nothing.
  • The covalent_transfer_items, covalent_treasury, bitquery_eth and covalent_prices hashmaps are getting expired often, because of memory limits. Whenever one of these hashes is picked by the eviction algorithm, the entirety of the hash is evicted (and not the oldest keys).

Let's break things down by kind of data:

Treasuries

Expectations:

  • Append new treasuries to the set whenever a new request is done.
  • Regularly evict old treasuries.

Solution:

  • Host the treasuries key in a Redis instance that doesn't have a broad eviction rule.
  • Whenever a new Treasury is added, create a key with the address of the treasury (like treasury_{addr}, with empty string value and an expiry of, eg, 14 days.
  • Whenever a Treasury is accessed (but already in Redis), update the expiry to now + 14 days.

Whitelist

Expectations:

  • Update the Redis whitelist object regularly, once a day. But not more often, as this would be useless use of bandwidth and create I/O delays.
  • We don't want this to be append-only, but to be replaced entirely on update.

Solution:

  • Host the whitelist key in a Redis instance that doesn't have a broad eviction rule (same as treasuries).
  • Define expiry to 1d.
  • Replace the whole key contents on each update (do not use sadd, we could also just use a simple raw text key).

Data

For all other pieces of data.

Expectations:

  • All pieces of data should have an expiry of 1 day at creation (because our data is stale after 1 day anyways).
  • Set a broad eviction rule so as to make sure the instance stays up whenever memory runs low.
  • When the eviction algorithm is run, it should evict only some pieces of data, not all of them at once.

Solution:

  • Use a Redis instance with allkeys-lru.
  • Split all the covalent_transfer_items, covalent_treasury, bitquery_eth and covalent_prices hashes in single entries, like covalent_transfer_items_{treasury_address}_{contract_address}_{chain_id}_{date}, etc.
  • Define an expiry of 1 day at creation.
@lajarre lajarre added the bug Something isn't working label Jul 28, 2022
@acemasterjb
Copy link
Contributor

acemasterjb commented Jul 29, 2022

Hey @lajarre, to your point about

Whenever a new Treasury is added, create a key with the address of the treasury (like treasury_{addr}, with empty string value and an expiry of, eg, 14 days.

What exactly do you mean by an empty string value?

Also, for covalent_transfer_items, covalent_treasury, bitquery_eth and covalent_prices hashes, wouldn't defining an expiry of one day on those hashes work just as well?

@lajarre
Copy link
Member Author

lajarre commented Jul 29, 2022

What exactly do you mean by an empty string value?

Our code just need to look up a list of treasury addresses, which it would do with db.keys(). No need for any specfic value, empty string is good enough.

db.keys() would look like:

whitelist
treasury_0x0001
treasury_0x1111
treasury_0x1222
...

Also, for covalent_transfer_items, covalent_treasury, bitquery_eth and covalent_prices hashes, wouldn't defining an expiry of one day on those hashes work just as well?

As mentioned, our code expects that all pieces of data need an expiry of 1 day and that they get evicted one by one whenever the instance is short on memory. What I mean by "piece of data", is for example, any of the Redis hash values (think dict value) for the covalent_prices hash (and not the whole hash). You don't want to evict the whole hash whenever memory runs low.

@acemasterjb
Copy link
Contributor

acemasterjb commented Aug 4, 2022

As mentioned, our code expects that all pieces of data need an expiry of 1 day and that they get evicted one by one whenever the instance is short on memory.

I get that.... but if our scheduler runs everyday on each treasury in our treasury list, meaning that each of the "data" keys are set each day at the same time, then wouldn't it make sense to do something like:

for treasury in treasuries
    # get `transfers_for_today` for `asset` which `treasury.address` holds
    db.hset("covalent_transfer_items", f"{treasury.address}_{asset}_1_{today}", transfers_for_today)
    # do this for all `treasuries`...
tomorrow_at_midnight = (datetime.now() + timedelta(day=1)).replace(hours=0, minutes=0, seconds=0)
db.expireat("covalent_transfer_items", tomorrow_at_midnight)

This would essentially flush the covalent_transfer_items key everyday at midnight way before the reload_treasuries_stats task runs (right now this task runs daily at 00:30 UTC).

Note, obviously the code block above is not how its being done in whip, but hopefully it illustrates the point well enough.

By the way, balance data has timestamps that are always set to midnight on a given date.

@lajarre
Copy link
Member Author

lajarre commented Aug 8, 2022

This would essentially flush the covalent_transfer_items key everyday at midnight way before the reload_treasuries_stats task runs (right now this task runs daily at 00:30 UTC).

This piece of code is a great approach, as it defines an unambiguous and optimized expiry time. That's better than defaulting to the Redis instance eviction rule.

On the other hand, it doesn't help with granularity of eviction (and that's apparent in the code: you are setting expiries after the for loop is done). Our code still expects that each (treasury, asset, day) tuple is be evicted alone when memory runs low. The expectation doesn't play well with the hash structure.

@acemasterjb
Copy link
Contributor

I think I understand what you mean now. I also see what you meant for the solution to the data keys 😅.

From what I understand, the keys for covalent_transfer_items_{treasury_address}_{contract_address}_{chain_id}_{date} will still have a value containing the "transfer items" and these keys will have an independent expiry time. That works.

There is also another set of data keys that are not addressed in the original post for this issue:

  • asset_hist_performance
  • asset_correlations
  • balances
    These keys I think can get away with having a blanket key expiry. Also, on that note, shouldn't we be using these keys???

@lajarre
Copy link
Member Author

lajarre commented Sep 19, 2022

From a high-level perspective I believe all the bugs and improvements listed in this issue are not fulfilled yet, so I'm reopening.

@lajarre lajarre reopened this Sep 19, 2022
@acemasterjb
Copy link
Contributor

Agreed, forgot about the auto-close 😅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants