Add disk-cached CGC coefficients #8

lkdvos · 2023-11-27T17:03:35Z

This is my attempt at providing a thread-safe and performant way of caching the coefficients on disk.
Importantly, I avoid the use of JLD2 in favour of plain serialization, which should be slightly faster but is less portable.
Additionally, this provides the ability to precompute a large number of CGCs in parallel, which can be useful on clusters.

Thoughts and comments welcome.

codecov · 2023-11-27T17:03:56Z

Codecov Report

Attention: 10 lines in your changes are missing coverage. Please review.

Comparison is base (baff207) 97.92% compared to head (ce49f95) 96.57%.

| Files | Patch % | Lines |
|---|---|---|
| src/caching.jl | 89.28% | 9 Missing ⚠️ |
| src/sector.jl | 50.00% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##           master       #8      +/-   ##
==========================================
- Coverage   97.92%   96.57%   -1.35%     
==========================================
  Files           5        6       +1     
  Lines         529      613      +84     
==========================================
+ Hits          518      592      +74     
- Misses         11       21      +10


maartenvd · 2023-12-02T09:38:25Z

I think it looks good! How slow is it to re-open the file every time a new key is needed? Would it make sense to hold an open handle to the file?

lkdvos · 2023-12-04T08:30:07Z

I can maybe run some benchmarks next week. It seems to be quite fast and not so important, especially since by default the RAM cache keeps 1e5 elements stored, so every CGC is only loaded from disk once. Opening the file also seems to be faster than computing the coefficient anyway.
In any case, I think this would really benefit from some additional use of the symmetries of the CGCs; for now it quite quickly fills up a big chunk of memory.
It would also be nice to use a format that is slightly more portable than plain serialization; I think this would help for running things on clusters etc.

* using JLD2 to make files relocatable
* always write new CGCs to disk
* reading/writing is now necessarily blocking
* `precompute_disk_data` now immediately writes results to disk
* saves intermediate progress
* reduces memory usage
* `cache_info()` now gives more information

TODO:
- [ ] Update documentation/README
- [ ] Profile SU(>4) for hanging issues
- [ ] Use CGC symmetries to reduce memory usage (?)
- [ ] Check for near-zero entries in CGCs

Now splits the cache file into a directory structure, where each combination of s1 x s2 gets its own file. This way, once a file is written, it can be read in parallel, and by multiple processes, without any locking. This should also make it easier to clean up the cache, as the directory is now more human-readable as well.
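To make the layout concrete, here is a minimal sketch of how a per-pair file path could be built. Only the `N/T` folder nesting is taken from the `clear_disk_cache!` snippet in this PR; the function name `cgc_cache_file`, the `cache_root` argument, and the file-name scheme are assumptions for illustration and may differ from what `src/caching.jl` actually does.

```julia
# Hypothetical sketch: one cache file per (s1, s2) pair.
# `cgc_cache_file`, `cache_root` and the file-name scheme are assumptions,
# not the package's actual API.

"""Return the (hypothetical) path of the cache file for the product s1 x s2."""
function cgc_cache_file(cache_root::AbstractString, N::Integer, T::Type, s1, s2)
    folder = joinpath(cache_root, string(N), string(T))
    # Encode an irrep label (here a tuple of integers) as "a_b_c".
    label(s) = join(s, "_")
    return joinpath(folder, string(label(s1), "x", label(s2), ".bin"))
end

cache_root = mktempdir()
path = cgc_cache_file(cache_root, 3, Float64, (2, 1, 0), (1, 0, 0))
mkpath(dirname(path))       # create the N/T folders on demand
write(path, "placeholder")  # stand-in for the stored coefficients
```

Because every pair maps to its own file, readers of different pairs never touch the same file, which is what makes lock-free reads plausible once a file has been fully written.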

lkdvos · 2024-01-10T12:13:28Z

I still want to add the functionality to only store CGCs on disk for s1 x s2 if s1 < s2, to reduce the amount of storage space that is required, but I think this is already an improvement for caching.
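A rough sketch of the bookkeeping this would need, assuming irrep labels are tuples that compare lexicographically. `canonical_pair` is a hypothetical helper, not part of the package; it keeps the diagonal case s1 == s2 as is, and it does not show how the coefficients for the swapped order would be recovered from the stored ones.

```julia
# Hypothetical helper: map (s1, s2) to the ordered pair that would be stored
# on disk, plus a flag telling the caller whether the inputs were swapped.
function canonical_pair(s1, s2)
    swapped = s2 < s1
    return swapped ? (s2, s1, true) : (s1, s2, false)
end

a, b = (1, 0, 0), (2, 1, 0)
canonical_pair(a, b)  # already ordered, flag is false
canonical_pair(b, a)  # same stored pair, flag is true
```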

Jutho

Looks great; some small final comments.

Jutho · 2024-01-18T10:53:01Z

src/caching.jl

+Remove the CGC cache for ``SU(N)`` with eltype `T` from disk.
+"""
+function clear_disk_cache!(N, T=Float64)
+    fldrname = joinpath(CGC_CACHE_PATH, string(N), string(T))


I would maybe expect that, if `T` is not specified, it removes all CGCs for `N`, irrespective of `T`, so essentially `rm(joinpath(CGC_CACHE_PATH, string(N)), recursive=true)`.
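A minimal sketch of the behaviour suggested here, under stated assumptions: `clear_disk_cache_sketch!` is a hypothetical name, and `cache_root` is passed explicitly as a stand-in for the package's `CGC_CACHE_PATH` so the example can run on a throwaway directory.

```julia
# Sketch only: `clear_disk_cache_sketch!` and `cache_root` are illustrative
# names, not the package's API.
function clear_disk_cache_sketch!(cache_root::AbstractString, N::Integer,
                                  T::Union{Type,Nothing}=nothing)
    # Without T, target the whole folder for N; with T, only that eltype.
    folder = isnothing(T) ? joinpath(cache_root, string(N)) :
                            joinpath(cache_root, string(N), string(T))
    # `force=true` makes this a no-op when nothing has been cached yet.
    rm(folder; force=true, recursive=true)
    return nothing
end

# Usage on a temporary directory that mimics the N/T layout:
root = mktempdir()
mkpath(joinpath(root, "3", "Float64"))
mkpath(joinpath(root, "3", "ComplexF64"))
mkpath(joinpath(root, "4", "Float64"))

clear_disk_cache_sketch!(root, 3, Float64)  # removes only 3/Float64
clear_disk_cache_sketch!(root, 3)           # removes everything under 3/
```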

lkdvos added 8 commits November 24, 2023 17:37

Print info for computing CGC

b78a013

Remove stray ;

2149b99

Add initial implementation of disk cache

3e65492

Rewrite cache to be thread-safe

8b1b724

small fixes and updates

576ec00

import SU to improve printing

8c5e3f2

Update README

b1a0b10

Fix type instabilities

437e25a

lkdvos requested a review from maartenvd November 27, 2023 17:03

lkdvos and others added 8 commits November 27, 2023 18:30

remove overly verbose output

f284457

add default values for arguments

a8dec09

Add tests for using cache functionality

289ef51

fix some tests

f5f52ef

Update caching.jl

1926c0e

Remove new syntax

8db19c9

Lower test requirements

5e6b5db

minor update to printing in precompute

6b6b26d

lkdvos added 5 commits December 20, 2023 18:19

Disable test of removing disk cache

575942e

-> this would remove cached files every time tests are run

Update README

c617cb4

Maybe fix bug where CGC coefficient might be computed twice and then stored on disk twice

1478cf2

Merge branch 'master' into disk-cache

b95835e

lkdvos force-pushed the disk-cache branch from 9f6a350 to b95835e Compare January 9, 2024 10:59

Jutho reviewed Jan 11, 2024

Implement comments for Project.toml

cb4a927

lkdvos added 2 commits January 17, 2024 12:08

Remove unused RationalRoots dependency

9d80c9a

Revert "Remove unused RationalRoots dependency"

c49c286

This reverts commit 9d80c9a.

lkdvos force-pushed the disk-cache branch 2 times, most recently from a0f89af to 9e10313 Compare January 17, 2024 15:03

Refactor/cleanup and incorporate comments

5df2dcd

lkdvos force-pushed the disk-cache branch from 9e10313 to 5df2dcd Compare January 17, 2024 15:12

lkdvos added 4 commits January 17, 2024 16:38

Update README.md

e4d97a4

Remove unused/duplicate function

456e480

Small formatting update on cache_info()

fb244b4

remove unused function

f794e28

lkdvos force-pushed the disk-cache branch 2 times, most recently from c65ee90 to 01e5a98 Compare January 17, 2024 16:38

Add stub test for cache_info() clear_disk_cache!()

01e5a98

Jutho approved these changes Jan 18, 2024

lkdvos added 4 commits January 18, 2024 12:16

Remove LocalPreferences file

9048317

Clarify folder structure in README

76f1447

Improve error message for SUNIrrep(dynkin_labels)

82b5b4d

Incorporate comments

ce49f95

lkdvos merged commit 8342007 into master Jan 18, 2024
18 of 20 checks passed

lkdvos deleted the disk-cache branch January 18, 2024 15:33
