Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor!: back nanoevents.methods.vector with scikit-hep vector #991

Merged
merged 68 commits into from
Aug 8, 2024

Conversation

lgray
Copy link
Collaborator

@lgray lgray commented Jan 14, 2024

Fixes #992

Right now this is breaking in terms of some low-level definitions of physics variables, in particular:

  • rho is now always transverse, as opposed to it being previously r, this can certainly screw up math if you're not expecting it

Breaking interface differences (as you can see in tests):

  • we're now beholden to vector's much more strict typing of vectors you have to use to_VectorND() to correctly compare vectors.

But that doesn't seem to have too many side effects?

However, all the tests pass and the current implementation is backwards compatible in terms of delta_phi vs deltaphi and delta_r vs. deltaR so we can actually deprecate that in a reasonable way.

Another thing we need to address is where the functions nearest and metric_table should live, which are fairly often used but could be advertised better.

@nsmith- This will require a thorough review as well as contribution from you, and we should discuss how we want to roll this out.

@lgray lgray marked this pull request as draft January 14, 2024 20:12
@lgray lgray force-pushed the use_scikithep_vector branch 3 times, most recently from 2ddea28 to 43e1f19 Compare January 16, 2024 23:13
Copy link
Member

@nsmith- nsmith- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the only painful part is r vs. rho. Otherwise looks like it's drag and drop. @jrueb might want to review this as well.

src/coffea/nanoevents/methods/vector.py Outdated Show resolved Hide resolved
0.6078019961139605,
]

computed_dphi = a.delta_phi(b).to_numpy()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the explicit cast required now?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see the comment on approx.

tests/test_nanoevents_vector.py Show resolved Hide resolved
tests/test_nanoevents_vector.py Outdated Show resolved Hide resolved
src/coffea/nanoevents/methods/vector.py Outdated Show resolved Hide resolved
@lgray
Copy link
Collaborator Author

lgray commented Jan 18, 2024

unfortunately @jrueb has left academia for industry, unless he wants to review out of the kindness of his heart.

@lgray
Copy link
Collaborator Author

lgray commented Jan 18, 2024

I'm putting in a deprecation for ~6 months from now over in #997.

@lgray lgray force-pushed the use_scikithep_vector branch 4 times, most recently from 9ca78b9 to d7ad56c Compare January 20, 2024 03:56
@ikrommyd
Copy link
Collaborator

@lgray Is the plan to keep the public delta_phi and delta_r functions as is? I suggest you do because there are some cases where a user defined "eta" instead of the .eta attribute is needed

@lgray
Copy link
Collaborator Author

lgray commented Jan 21, 2024

In the first phase of this, yes we'll keep those functions for backwards compatibility.

In the long run, when we remove nanoevents.methods.vector, vector supplies deltaR and deltaphi and folks should migrate to use what is already supplied. You can see we now just use those functions within their underscore equivalents.

@lgray lgray force-pushed the use_scikithep_vector branch 7 times, most recently from 4a7ef1e to 2374127 Compare January 26, 2024 22:18
@lgray
Copy link
Collaborator Author

lgray commented Jan 26, 2024

@nsmith- tests fail after appeasing linter :-(

@nsmith-
Copy link
Member

nsmith- commented Jan 26, 2024

Yes, I added tests and am now working to improve them

@lgray
Copy link
Collaborator Author

lgray commented Jan 26, 2024

Ah, gotcha - yeah putting this together with scikit-hep/vector#413 I figured you may be messing around.

@lgray lgray force-pushed the use_scikithep_vector branch 3 times, most recently from 8946d59 to bbcf978 Compare January 30, 2024 15:23
@lgray
Copy link
Collaborator Author

lgray commented Jan 30, 2024

I think one place where we could benefit a lot from a thorough look-over by @Saransh-cpp is to make sure we're not unintentionally clobbering any of the vector interface with parts of nanoevents.vector, and likewise @nsmith- if there's anything we can do to make the delta between skh-vector-backed nanoevents.vector closer to the original implementation we should try to smooth all that out.

Saransh-cpp and others added 3 commits July 19, 2024 11:44
* More tests

* Don't modify the global awkward behavior
@lgray
Copy link
Collaborator Author

lgray commented Aug 4, 2024

@Saransh-cpp looks like a few things broke... there are missing behavior functions according to the tests.

@Saransh-cpp
Copy link
Member

Weirdly, I am not able to reproduce any of the failures locally. I guess that's why I gave this PR a green flag in the last meeting.

@Saransh-cpp
Copy link
Member

Saransh-cpp commented Aug 5, 2024

I am having no luck in reproducing these locally, even with new environments. @lgray could you or someone else check if this is reproducible locally?

The GH Actions error looks unrelated (something wrong with pytest's installation in main).

@lgray
Copy link
Collaborator Author

lgray commented Aug 5, 2024

I'll give it a spin soon here to see if I can reproduce.

@lgray
Copy link
Collaborator Author

lgray commented Aug 5, 2024

@Saransh-cpp so this appears to be some windows-specific issues with paths in new images github produced.

There's still another error, and we'll see if that shows up without the changes in this PR.

@lgray
Copy link
Collaborator Author

lgray commented Aug 5, 2024

@Saransh-cpp unfortunately it appears to be something with this PR. The tests with just awkward 2.6.7 over in #1149 are passing fine.

@lgray
Copy link
Collaborator Author

lgray commented Aug 6, 2024

@Saransh-cpp any luck?

@Saransh-cpp
Copy link
Member

Hi @lgray, I tried the tests in a fresh environment as well as on another system (windows), but I really don't know why I am not seeing the errors locally. I am a bit clueless here.

@lgray
Copy link
Collaborator Author

lgray commented Aug 8, 2024

@Saransh-cpp so I reproduced this on my laptop:

>>> events.GenPart.parent
Traceback (most recent call last):
  File "/Users/lgray/miniforge3/envs/coffea-dev/lib/python3.12/site-packages/dask_awkward/lib/core.py", line 1578, in __getattr__
    cls_method = getattr_static(self._meta, attr)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/lgray/miniforge3/envs/coffea-dev/lib/python3.12/inspect.py", line 1882, in getattr_static
    raise AttributeError(attr)
AttributeError: parent

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/lgray/miniforge3/envs/coffea-dev/lib/python3.12/site-packages/dask_awkward/lib/core.py", line 1580, in __getattr__
    raise AttributeError(f"{attr} not in fields.")
AttributeError: parent not in fields.
>>> events.GenPart._meta
<PtEtaPhiMLorentzVectorArray-typetracer [...] type='## * var * GenParticle[...'>

So somehow we are resolving to the wrong type! It should be a GenParticleArray-typetracer.

@jpivarski

@lgray
Copy link
Collaborator Author

lgray commented Aug 8, 2024

The layout seems to be correct though:

<ListOffsetArray len='##'>
    <offsets><Index dtype='int64' len='##'>
        [## ... ##]
    </Index></offsets>
    <content><RecordArray is_tuple='false' len='##'>
        <parameter name='__record__'>'GenParticle'</parameter>
        <parameter name='__doc__'>'interesting gen particles '</parameter>
        <parameter name='collection_name'>'GenPart'</parameter>

@Saransh-cpp
Copy link
Member

Hi @lgray. I was able to reproduce this on the third system 😬

I think I know why this is happening. Working on it right now.

@lgray
Copy link
Collaborator Author

lgray commented Aug 8, 2024

@nsmith- finally passing tests except for what looks like a spurious failure on arm linux. I've re-requested your review some time ago, but now it looks like you can properly do one. I think most of what you asked last time has been answered.

@Saransh-cpp
Copy link
Member

Two minor missing suggestions from the review, which are more of discussion topics (maybe in tomorrow's meeting) -

Copy link
Member

@nsmith- nsmith- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@@ -519,22 +534,36 @@ def test_inherited_method_transpose(lcoord, threecoord, twocoord):
)
elif twocoord == "PolarTwoVector":
c = ak.zip(
{"r": [-10.0, 13.0, 15.0], "phi": [1.22, -1.0, 1.0]},
{"rho": [-10.0, 13.0, 15.0], "phi": [1.22, -1.0, 1.0]},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This renaming of r to rho is something we'll need to remember to include in the release notes.
Perhaps a migration guide is in order? Are there any other user-facing changes we'll have to advertise?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This the only one that isn't resolved by reading vector's documentation.

I think this one will be a fairly low hit-rate, since most people would specify a polar two vector using pT and phi, which is the same in both cases.

@lgray lgray merged commit 37dfe81 into master Aug 8, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use scikit-hep vector to as the backend for vector math in nanoevents
5 participants