Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade fuse_rot_angles to preserve derivatives and global phases #6031

Merged
merged 34 commits into from
Aug 14, 2024

Conversation

dwierichs
Copy link
Contributor

@dwierichs dwierichs commented Jul 23, 2024

Context:
The current implementation of fuse_rot_angles, which is used by merge_rotations and single_qubit_fusion, has the following issues:

  1. It does not necessarily preserve global phases. As we move towards global-phase aware standards, this becomes an issue where it wasn't one before.
  2. Its derivative is wrong, forming part (but not all) of the bug [BUG] Some decompositions/transforms do not preserve derivatives #5715. In particular, the custom handling of special input values prevents the calculation of correct derivatives, and fuse_rot_angles at singular points leads to wrong derivatives, rather than NaN values, which are mathematically well-motivated.
  3. A minor technical issue is that the implementation requires nested conditionals, leading to a good bit of code and separate handling of traced JAX code, making it more complex.

Description of the Change:
The implementation of fuse_rot_angles is remade entirely.

Benefits:
The remade code uses a comparably simple mathematical expression to compute the fused rotation angles that

  1. preserves global phases
  2. has the correct derivative everywhere except for well-understandable, predictable singular points. These singular points make sense because rotation fusion is not a smooth map everywhere. The predictability allowed us to write a dedicated test that confirms our understanding of the singularities, at least within a large set of special test points.
  3. does not require conditionals beyond those that are implemented in qml.math.arctan2 anyways, and thus available in all ML interfaces, including JAX with JIT-compatibility.
  4. As a bonus, the new implementation supports broadcasting/batching with an arbitrary number of leading dimensions if all angles in each set are broadcasted in the same way (because this is nice, easy to support, and allows us to speed up tests a lot).

In summary, the global phases are fixed, Jacobians are only ever NaNs, rather than wrong, and Jacobians are only NaNs when they were NaN or wrong in the current implementation.

Possible Drawbacks:
N/A

Related GitHub Issues:
#5715 (not fixed entirely, just partially)

Related Shortcut Stories:
[sc-63642]

Copy link

codecov bot commented Jul 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.65%. Comparing base (1a1d658) to head (21712fd).
Report is 305 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6031      +/-   ##
==========================================
- Coverage   99.66%   99.65%   -0.01%     
==========================================
  Files         431      431              
  Lines       41889    41572     -317     
==========================================
- Hits        41748    41430     -318     
- Misses        141      142       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@vincentmr vincentmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @dwierichs . I have a couple questions, but this looks ready to merge to me :)

doc/releases/changelog-dev.md Outdated Show resolved Hide resolved
pennylane/transforms/optimization/merge_rotations.py Outdated Show resolved Hide resolved
pennylane/transforms/optimization/single_qubit_fusion.py Outdated Show resolved Hide resolved
pennylane/transforms/optimization/single_qubit_fusion.py Outdated Show resolved Hide resolved
pennylane/transforms/optimization/single_qubit_fusion.py Outdated Show resolved Hide resolved
@dwierichs dwierichs requested a review from vincentmr July 25, 2024 20:49
@dwierichs dwierichs added the review-ready 👌 PRs which are ready for review by someone from the core team. label Jul 26, 2024
@dwierichs dwierichs requested a review from Qottmann July 26, 2024 09:03
Copy link
Contributor

@Qottmann Qottmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Congrats @dwierichs , this looks like a lot of tedious work to get right. Really appreciate the detailed derivation and commentary for devs and users 👌 👌 👌

I dont have anything to complain, but will have to take a second look at tests before approving.

Co-authored-by: Korbinian Kottmann <[email protected]>
Copy link
Contributor

@vincentmr vincentmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice one @dwierichs !

Co-authored-by: Thomas R. Bromley <[email protected]>
Copy link
Contributor

@Qottmann Qottmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had a look at the tests now as well. Looks good so far, but aren't we missing tests for differentiating with autograd, torch and tf?

@dwierichs
Copy link
Contributor Author

aren't we missing tests for differentiating with autograd, torch and tf?

Yes, you're right. I was happy to include derivatives in the tests at all, and noticed they started taking quite some time already. But I can add the other interfaces as well 👍

@dwierichs
Copy link
Contributor Author

I'm working on extending the tests to other interfaces, and the tests start hitting O(10) seconds for this function.
Previously the coverage was much worse, but I'm not sure how many resources we want to throw at this?

@dwierichs
Copy link
Contributor Author

I think it is fine now for all interfaces, but I can't get TensorFlow to work with broadcasting and the non-broadcasted version takes ages. @Qottmann would it be fine to skip the TensorFlow interface 😬

@trbromley
Copy link
Contributor

I think it is fine now for all interfaces, but I can't get TensorFlow to work with broadcasting and the non-broadcasted version takes ages. @Qottmann would it be fine to skip the TensorFlow interface 😬

TensorFlow is always being a pain 😢

In general we're happy for some things to slip in terms of feature parity for TensorFlow. But we want to be avoiding giving wrong answers - would that be the case here?

@dwierichs
Copy link
Contributor Author

So I think the code should in principle work with full features also with Tensorflow, but I can't seem to make it work with TF's vectorized_map in the test. This is more of a feature disparity of my TensorFlow skills, rather than PennyLane's feature support :D

However, other interfaces like Autograd also need to iterate manually within the test, instead of using something like JAX's vmap, making them very slow.
In the end, the question is whether we want to pay more than a minute (!) of testing time to this single function, just to guarantee correctly valued derivatives at special points?

Copy link
Contributor

@Qottmann Qottmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome job @dwierichs !

That was an impressive upgrade in test coverage, which makes me almost wonder if this shooting over the top 😅 For now, happy with it as it covers all 👍

@dwierichs dwierichs enabled auto-merge (squash) August 14, 2024 17:56
@dwierichs dwierichs merged commit 98c8ed3 into master Aug 14, 2024
40 checks passed
@dwierichs dwierichs deleted the rot_math branch August 14, 2024 18:22
dwierichs added a commit that referenced this pull request Aug 19, 2024
**Context:**
As reported in #5715, `merge_rotations` and `single_qubit_fusion` have
problems with differentiability at specific points.
#6031 takes care of upgrading the Rot-angle fusion to only ever return
NaNs at mathematically non-differentiable points, rather than wrong
results.
However, the transforms add additional points, based on internal
optimizations, where the derivative is flawed.

**Description of the Change:**
This PR fixes the flawed derivatives caused by the code of the
transforms themselves.

**Benefits:**
Fixes derivatives of `merge_rotations` and `single_qubit_fusion` (where
mathematically defined)

**Possible Drawbacks:**
N/A

**Related GitHub Issues:**
Fixes another part of #5715, still not all of it.

---------

Co-authored-by: Vincent Michaud-Rioux <[email protected]>
Co-authored-by: Korbinian Kottmann <[email protected]>
Co-authored-by: Thomas R. Bromley <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
review-ready 👌 PRs which are ready for review by someone from the core team.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants