
Cache CG coefficients #71

Closed

Conversation

olivier-peltre

Right now, Clebsch-Gordan coefficients are recomputed at every tensor product call.
This means they are also recomputed e.g. every time N scalars are used to gate N irreps with IrrepsArray.__mul__, as it calls elementwise_tensor_product, i.e. at every MACE message-passing step, symmetric contraction step, etc.
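
A rough sketch of that path (the irreps and shapes here are made up for illustration; the point is only that IrrepsArray.__mul__ goes through elementwise_tensor_product, which recomputes the CG coefficients on each call):

import jax.numpy as jnp
import e3nn_jax as e3nn

# Two irreps gated by two scalars (illustrative shapes).
feats = e3nn.IrrepsArray("0e + 1o", jnp.ones((16, 4)))
gates = e3nn.IrrepsArray("2x0e", jnp.ones((16, 2)))

# __mul__ dispatches to elementwise_tensor_product, which currently
# recomputes the relevant Clebsch-Gordan coefficients every time.
gated = feats * gates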

The torch version of e3nn wraps CG coefs in functools.cache:
https://github.com/e3nn/e3nn/blob/ac3528f7fb5fe1a8838f1df087d2eefe60d91ea8/e3nn/o3/_wigner.py#L149

NB: functools.lru_cache(maxsize=None) is equivalent to functools.cache

NB: it might also be more efficient to use a dense/sparse array format than relying on a generic hash table

@mariogeiger
Member

The danger of @functools.cache is the following:

import functools

import numpy as np


@functools.cache
def f(n: int) -> np.ndarray:
    return np.array(n)


x = f(1)
x += 41  # in-place edit mutates the array stored inside the cache
print(f(1))
# prints 42

This is why I did not cache the function.
Now that I think about it, we can protect it like this:

import functools

import numpy as np


@functools.cache
def _f(n: int) -> np.ndarray:
    return np.array(n)


def f(n: int) -> np.ndarray:
    # Return a copy so callers can never mutate the cached array.
    return _f(n).copy()


x = f(1)
x += 41
print(f(1))
# prints 1

Please do something like that
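
Applied to the CG coefficients, the same pattern could look roughly like this on the user side (a sketch only: e3nn_jax.clebsch_gordan is used here as the uncached entry point and the wrapper names are placeholders; inside the library the cache would wrap the internal computation instead):

import functools

import numpy as np
import e3nn_jax as e3nn


@functools.cache
def _cached_clebsch_gordan(l1: int, l2: int, l3: int) -> np.ndarray:
    # Computed once per (l1, l2, l3); the stored array must never be mutated.
    return np.asarray(e3nn.clebsch_gordan(l1, l2, l3))


def clebsch_gordan(l1: int, l2: int, l3: int) -> np.ndarray:
    # Hand out a fresh copy so accidental in-place edits cannot corrupt the cache.
    return _cached_clebsch_gordan(l1, l2, l3).copy()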

@olivier-peltre
Author

olivier-peltre commented Jun 4, 2024

Thanks for the quick reply!
I'm not sure I understand why you would ever want to perform in-place operations on tabulated constants like CG coefficients 🤔

@mariogeiger
Member

mariogeiger commented Jun 4, 2024

This can happen accidentally. It happened to me on the pytorch version of e3nn, and that's why I did not cache this code here.

(oops closed by misclick)

@mariogeiger mariogeiger closed this Jun 4, 2024
@mariogeiger mariogeiger reopened this Jun 4, 2024
@mitkotak
Member

mitkotak commented Jun 20, 2024

I am also getting stalled by the CG computing code (one of the asserts gets angry) when running lmax > 11.

My workaround is to explicitly load coefficients cached in an .npz file. I create and load the file using BasisLib (dependencies are NumPy and friends).

The interface looks something like this:

e3nn_jax.clebsch_gordan_basislib(l1, l2, l3)

I can put in a PR if that makes sense? @mariogeiger

I think for lower Ls the latency of loading from disk might be higher than just computing on the fly, as is done currently, so maybe we need some kind of if condition that only uses this path for higher L?
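
A rough sketch of that dispatch (the threshold, the .npz path and key format, and the clebsch_gordan_basislib name are placeholders, not the actual PR):

import functools

import numpy as np
import e3nn_jax as e3nn

_L_THRESHOLD = 11  # assumed cutoff below which on-the-fly computation stays cheap


@functools.cache
def _clebsch_gordan_from_disk(l1: int, l2: int, l3: int) -> np.ndarray:
    # Hypothetical .npz archive precomputed once (e.g. with BasisLib).
    with np.load("cg_coefficients.npz") as data:
        return data[f"cg_{l1}_{l2}_{l3}"]


def clebsch_gordan_basislib(l1: int, l2: int, l3: int) -> np.ndarray:
    if max(l1, l2, l3) <= _L_THRESHOLD:
        # Existing on-the-fly path for low L.
        return np.asarray(e3nn.clebsch_gordan(l1, l2, l3))
    # Copy so the cached array can never be modified in place.
    return _clebsch_gordan_from_disk(l1, l2, l3).copy()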

The same machinery can be used for spherical harmonics as well, so I can also file a PR there.

@mariogeiger
Member

Either we completely switch to BasisLib or we keep it local. Why have two different implementations of the same thing?

@mariogeiger
Member

Btw I'm waiting for the copy-safe cache, and then we can merge this.

@mitkotak
Member

@mariogeiger Is this what you meant? mitkotak#1

@mariogeiger
Member

I mean what I said in my first comment. We need to systematically copy the cached data to avoid it getting modified in place by accident. This causes bugs that are extremely hard to debug.

@mitkotak
Member

mitkotak commented Jun 23, 2024

@mariogeiger Sorry, it seems like my push did not update GitHub before. I have the safe-copy change that you were requesting. Can you have another look at the PR?

@mariogeiger
Member

I don't see any changes

@mitkotak
Member

mitkotak commented Jun 23, 2024

Sorry for the confusion. Can you see this?

@mariogeiger
Member

This is a PR in your own fork repo.

@mitkotak
Member

I did not have write permissions to this one.

@mariogeiger
Member

I guess it's because we are on @olivier-peltre's own branch.

Maybe you can create a new PR, but this time select this repo as the target, not your own fork.
