Add disk-cached CGC coefficients #8
Codecov Report
Coverage diff, master vs. #8:
- Coverage: 97.92% to 96.57% (-1.35%)
- Files: 5 to 6 (+1)
- Lines: 529 to 613 (+84)
- Hits: 518 to 592 (+74)
- Misses: 11 to 21 (+10)
I think it looks good! How slow is it to re-open the file every time a new key is needed? Would it make sense to hold an open handle to the file?
I can maybe run some benchmarks next week. It seems to be quite fast and not that important, especially since the RAM cache keeps 1e5 elements by default, so every CGC is only loaded from disk once. Opening the file also seems to be faster than computing the coefficient anyway.
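To illustrate the point about the RAM cache, here is a minimal sketch of an in-memory cache sitting in front of the disk. It is not the PR's implementation: `RAM_CACHE`, `DISK_READS`, `load_from_disk`, and `fetch_cgc` are hypothetical names, the disk read is a placeholder, and a plain `Dict` is used, so the 1e5-element limit mentioned above is not modelled.

```julia
# Hypothetical in-memory cache; unbounded here, unlike the size-limited cache in the PR.
const RAM_CACHE = Dict{Any,Any}()

# Counter used only to show how often the "disk" is touched.
const DISK_READS = Ref(0)

# Placeholder for opening the cache file and reading one entry.
function load_from_disk(key)
    DISK_READS[] += 1
    return sum(key)  # stand-in for the stored coefficient data
end

# Look in RAM first; only fall back to the disk loader on a miss.
fetch_cgc(key) = get!(() -> load_from_disk(key), RAM_CACHE, key)
```

Calling `fetch_cgc` twice with the same key runs `load_from_disk` only once, which is why the cost of re-opening the file is paid at most once per key while the entry stays in RAM.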
* using JLD2 to make files relocatable
* always write new CGCs to disk
* reading/writing is now necessarily blocking
* `precompute_disk_data` now immediately writes results to disk
* saves intermediate progress
* reduces memory usage
* `cache_info()` now gives more information

TODO:
- [ ] Update documentation/README
- [ ] Profile SU(>4) for hanging issues
- [ ] Use CGC symmetries to reduce memory usage (?)
- [ ] Check for near-zero entries in CGCs
-> this would remove cached files every time tests are run
…en stored on disk twice
Now splits the cache file into a directory structure in which each combination of s1 x s2 gets its own file. Once a file is written, it can be read in parallel, and by multiple processes, without any locking. This should also make the cache easier to clean up, since the directory is now more human-readable.
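A rough sketch of the one-file-per-combination idea follows. It is only an illustration under assumptions: the path layout, `cgc_file`, `write_once`, and `read_cgc` are hypothetical, and the `Serialization` standard library stands in for JLD2. Each file is written to a temporary file in the same directory and then renamed into place, so readers normally see either no file or a complete one; whether the rename is truly atomic depends on the platform and filesystem.

```julia
using Serialization  # stdlib stand-in; the PR itself stores data with JLD2

# Hypothetical layout: <root>/<N>/<T>/<s1>/<s2>.bin
cgc_file(root, N, T, s1, s2) =
    joinpath(root, string(N), string(T), string(s1), string(s2) * ".bin")

# Write `data` to `path` unless the file already exists.
function write_once(path, data)
    isfile(path) && return path        # finished files are never rewritten
    mkpath(dirname(path))
    tmp, io = mktemp(dirname(path))    # temporary file next to the target
    try
        serialize(io, data)
    finally
        close(io)
    end
    mv(tmp, path; force=true)          # move the complete file into place
    return path
end

# Finished files are read-only from here on, so no lock is taken.
read_cgc(path) = deserialize(path)
```

Because a finished file is never modified again, several threads or processes can call `read_cgc` on it concurrently, and removing part of the cache amounts to deleting a subdirectory.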
I still want to add the functionality to only store CGCs on disk for
a0f89af to 9e10313
c65ee90 to 01e5a98
Looks great; some small final comments.
Remove the CGC cache for ``SU(N)`` with eltype `T` from disk.
"""
function clear_disk_cache!(N, T=Float64)
    fldrname = joinpath(CGC_CACHE_PATH, string(N), string(T))
I would maybe expect the behaviour that, if `T` is not specified, it removes all CGCs for `N`, irrespective of `T`, so essentially `rm(joinpath(CGC_CACHE_PATH, string(N)), recursive=true)`.
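Something along these lines could implement that suggestion. This is a sketch, not the merged code: `clear_disk_cache_sketch!` and `DEMO_CACHE_PATH` are hypothetical names, a temporary directory stands in for `CGC_CACHE_PATH`, and using `nothing` as the "not specified" default is an assumption.

```julia
# Hypothetical stand-in for the package's cache root (CGC_CACHE_PATH in the PR).
const DEMO_CACHE_PATH = mktempdir()

"""
Remove the cached CGCs for SU(N) from disk.
With `T === nothing`, the caches for all eltypes are removed.
"""
function clear_disk_cache_sketch!(N, T::Union{Nothing,Type}=nothing; root=DEMO_CACHE_PATH)
    folder = T === nothing ? joinpath(root, string(N)) :
                             joinpath(root, string(N), string(T))
    # force=true turns a missing folder into a no-op instead of an error
    rm(folder; force=true, recursive=true)
    return nothing
end
```

With this signature, `clear_disk_cache_sketch!(3, Float64)` removes only the `Float64` folder for SU(3), while `clear_disk_cache_sketch!(3)` removes everything stored for SU(3).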
This is my attempt at providing a thread-safe and performant way of having cached coefficients saved on disk.
Importantly, I avoid the use of JLD2, which should be slightly faster but less portable.
Additionally, this provides the ability to precompute a large number of CGCs in parallel, which can be useful on clusters.
Thoughts and comments welcome.
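As a rough illustration of the parallel precomputation mentioned above, the loop below distributes independent (s1, s2) pairs over threads. It is only a sketch under assumptions: `compute_pair` and `precompute_sketch` are hypothetical, the computation is a placeholder, and results are collected in an array, whereas the PR stores them on disk.

```julia
using Base.Threads: @threads

# Hypothetical placeholder for the expensive CGC computation of one (s1, s2) pair.
compute_pair(s1, s2) = s1 * s2

function precompute_sketch(pairs)
    results = Vector{Int}(undef, length(pairs))
    @threads for i in eachindex(pairs)
        s1, s2 = pairs[i]
        # Each iteration writes only to its own slot, so no lock is needed.
        results[i] = compute_pair(s1, s2)
    end
    return results
end
```

Starting Julia with several threads (for example `julia -t 4`) lets the iterations run concurrently; with a single thread the same code simply runs serially.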