Add the GTPSA.jl backend #329

mattsignorelli · 2024-06-24T16:48:14Z

gdalle · 2024-06-25T06:20:25Z

@mattsignorelli thank you for the contribution!
I fixed a few things and defined AutoGTPSA in the main package for now. Types should not be defined in package extensions because it makes them hard to import.
The main part missing is the pushforward operator. If you don't have a dedicated mechanism for that, you can always deduce it from derivative, gradient or jacobian depending on the function type (array/scalar -> array/scalar). It will be slow but it will work.

codecov-commenter · 2024-06-25T06:21:31Z

Codecov Report

Attention: Patch coverage is 0% with 416 lines in your changes missing coverage. Please review.

Project coverage is 54.75%. Comparing base (d51fc0a) to head (77e49d6).

Files with missing lines	Patch %	Lines
...ace/ext/DifferentiationInterfaceGTPSAExt/onearg.jl	0.00%	287 Missing ⚠️
...ace/ext/DifferentiationInterfaceGTPSAExt/twoarg.jl	0.00%	119 Missing ⚠️
...face/ext/DifferentiationInterfaceGTPSAExt/utils.jl	0.00%	9 Missing ⚠️
...erfaceGTPSAExt/DifferentiationInterfaceGTPSAExt.jl	0.00%	1 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (d51fc0a) and HEAD (77e49d6). Click for more details.

HEAD has 38 uploads less than BASE

Flag BASE (d51fc0a) HEAD (77e49d6)

DI 41 11

DIT 10 2

Additional details and impacted files

@@             Coverage Diff             @@
##             main     #329       +/-   ##
===========================================
- Coverage   98.51%   54.75%   -43.76%     
===========================================
  Files         108      103        -5     
  Lines        4297     4597      +300     
===========================================
- Hits         4233     2517     -1716     
- Misses         64     2080     +2016

Flag	Coverage Δ
DI	`52.44% <0.00%> (-46.14%)`	⬇️
DIT	`60.01% <ø> (-38.37%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

…nterface.jl into pr/mattsignorelli/329

mattsignorelli · 2024-08-06T15:51:57Z

Sorry for the delay on this, I've had my hands tied with all different projects. Two weeks ago I pushed major updates/fixes and finishing touches to GTPSA.jl, which is now it's in first stable release. I'll get back to this shortly

gdalle · 2024-08-07T06:51:06Z

The structure of the tests has changed a little since the last time you checked, can you merge main into this PR?

latest main

gdalle · 2024-08-08T13:55:52Z

Let me just take the time to review here, I haven't yet looked at your code. If all is well on the DI front, we can add AutoGTPSA to ADTypes!
Thank you so much for your contribution

gdalle · 2024-08-09T13:48:54Z

@mattsignorelli before I review, I updated the tests so that we now have full code coverage information. Can you take a look and see if you are able to increase the coverage?

One thing you need to know is that in-place operators like pushforward!(f, dy, backend, x, dx) are not tested on scenarios for which the target dy is not mutable. So the branch if y isa Number in this operator is not needed.

Another thing you may want to exploit to reduce code duplication is the redundancy between operators: derivative is just pushforward with dx = 1, gradient is just pullback with dy = 1. They all fall back onto one another inside DI, so that it is theoretically sufficient to just implement value_and_pushforward. If custom implementations of other operators yield a speed boost, then let's keep them, but if all you do in derivative is essentially call pushforward, you can get rid of it (for example).

mattsignorelli · 2024-08-09T19:32:15Z

Thanks for looking this over!

For second order, because GTPSA does not allow nesting, separate operators must be written. For the first order operators gradient, jacobian, there is a large slowdown/significantly higher number of allocations when using only pushforward. This is because, in order to construct the jacobian or gradient, several pushforwards with different seeds have to be propagated through the function. The TPS type is mutable, and so it is much slower to repeatedly evaluate the function for different seeds, than it is to evaluate it once treating the TPSs as variables and then extract the first order derivatives. However derivative I have removed as using pushforward for that is equivalent.

I have also removed inaccessible branches (where y is not mutable) for the in-place operators, following the code coverage report, and simplified the HVP code (since the best GTPSA can do really is construct the enter hessian and multiply the vector).

The rest of the uncovered code is because the user can optionally specify a custom Descriptor, which lives in the AutoGTPSA type - this allows users to choose custom truncation orders for different variables (e.g. calculate hessian of f(x,y) but assume no $x^2$ terms in the resulting truncated power series so $\partial^2 f/\partial x^2$ is zero). This can significantly improve speed at the cost of lower accuracy. This feature is only applicable in highly specific use cases and I don't expect it to be used very often, but it should be available since GTPSA provides it. As for the code coverage in this case, the number of input variables and truncation order(s) have to be specified in the Descriptor for each tested function, and I'm not sure how to do this for every test case?

gdalle · 2024-09-18T08:42:33Z

Sorry for taking so long to get back to you @mattsignorelli! I have brought your PR up to speed with the rest of the package, and taken into account the new version of ADTypes which includes AutoGTPSA.
The last thing I'd like to do before merging is clean up the code a little bit, by removing duplication in favor of clear auxiliary functions. For example, my last commit cuts a lot of code from pushforward thanks to

function initialize!(xt::TPS, x::Number, dx::Number)
    xt[0] = x
    xt[1] = dx
    return xt
end

function initialize!(xt::AbstractArray{<:TPS}, x::AbstractArray, dx::AbstractArray)
    for i in eachindex(xt, x, dx)
        initialize!(xt[i], x[i], dx[i])
    end
    return xt
end

Can you maybe create similar functions to shorten the other operators? Just like TPS initialization, extraction from TPS into numbers and arrays could also be streamlined.

mattsignorelli · 2024-09-18T13:46:42Z

Thanks for your work on this!

function initialize!(xt::TPS, x::Number, dx::Number)
    xt[0] = x
    xt[1] = dx
    return xt
end

I think we need to be careful here. This would only apply for single-variable functions: xt[1] = dx basically sets the dual part of the first variable to dx. However, if there are two variables, then to set the dual part of the second variable, you would need to do xt[2] = dx. E.g.

julia> d = Descriptor(2,1); # two variables to first order

julia> f = TPS(use=d);

julia> f[0] = 1; f[1] = 2; f[2] = 3;

julia> print(f) # corresponds to f(dx1,dx2) = 1 + dx1 + dx2
TPS64:
 Coefficient                Order   Exponent
  1.0000000000000000e+00      0      0   0
  2.0000000000000000e+00      1      1   0
  3.0000000000000000e+00      1      0   1

Also, because the TPS type is mutable, we really only want to evaluate the function once. I'll make some changes to the updates and try to simplify/remove code duplication using utility functions, and push for review

mattsignorelli · 2024-09-18T14:06:26Z

I think I understand what Tangents are now. Just to check: to take the pushforward of a 3 variable function with seed [dx1, dx2, dx3], your Tangents object will be Tangents{1, Vector{Float64}}(([dx1, dx2, dx3])) ? And so Tangents is there to store multiple seeds?

gdalle · 2024-09-18T14:12:45Z

Tangents is there to store multiple seeds?

The expression "multiple seeds" is a bit unclear. For instance, for a function f(x::AbstractVector), DI considers dx::AbstractVector to be "a single seed". It used to be possible to call pushforward(f, backend, x, dx) but that is no longer the case.
From v0.6 you will need to use the wrapper pushforward(f, backend, x, Tangents(dx)), and recover a Tangents{1} as the result. On the other hand, if you have several seeds dxa, dxb, dxc (perhaps clearer than numbers?) inside Tangents(dxa, dxb, dxc), then your batch size is 3 and you will get a Tangents{3} as the result.
Sorry if this is not well documented, you're the first outside person to ever contribute an extension so I didn't plan for that case ^^ But if there are unclear aspects I should definitely improve the docs.

gdalle · 2024-09-18T14:14:09Z

I think we need to be careful here. This would only apply for single-variable functions

That's the setting considered in DI. Even with all the latest changes, there will only ever be one active variable x, either in f(x) or in f!(y, x). Of course that variable can itself be an array.

mattsignorelli · 2024-09-18T17:25:44Z

Ah ok, I think I understand better. I'm admittedly not super familiar with the usual conventions used by most AD packages in Julia, only that in GTPSA, so I'm trying to connect the dots here. Let me explain how it works in GTPSA, and then maybe you can help better see how to fit it into DI:

Basically with GTPSA, it calculates truncated power series in the variables, assuming the variables are small. So for example to calculate a Jacobian of a function $F: \mathbb{R}^n -> \mathbb{R}^m$, you would define a Descriptor(n, 1) for $n$ variables to order 1, and then use x = vars(d) which gives you a vector which in the usual AD language is actually $[\Delta x_1, \Delta x_2, ... \Delta x_n]$ ("variables" in GTPSA are small, so actually this might be better understood as dx = vars(d)). Then, to calculate the multivariable Taylor expansion of the function $F$ around some point $\vec{a}$, you would just do y = F(a.+x). y here is now the multivariable Taylor expansion. You can then extract the Jacobian from the monomial coefficients of the Taylor expansion.

Alternatively, one could do this one with variable, e.g. calculate $F([a_1+\Delta x_1, a_2, ..., a_n])$, then do $F([a_1, a_2+\Delta x_1, ..., a_n])$ etc and extract the first order part of the TPS result for each component for each calculation. However, this will be slower than instead propagating all of $F(\vec{a}+[\Delta x_1, \Delta x_2, ... \Delta x_n])$, because TPSs are mutable and we only want to calculate the function once.

gdalle · 2024-09-19T07:20:55Z

That makes sense, thank you. But then I think the initialize! function I wrote earlier is fine? At least it passes the tests ^^
Also, this makes me wonder if we could compute HVPs by considering both x and dx as variables somehow. With higher-level differentiation utilities like those in GTPSA, there must be something better to do than computing the full Hessian first.

mattsignorelli added 4 commits June 17, 2024 19:15

add extension directory

5a69e77

derivative,second derivative, gradient

aeedb5d

jacobian, hessian

9572d57

add GTPSA to Project.toml, move AutoGTPSA to ADTypes

6e69e6c

mattsignorelli mentioned this pull request Jun 24, 2024

Adding GTPSA.jl to the interface #316

Open

gdalle added 2 commits June 25, 2024 08:18

Fix tests

a39fd4e

Merge remote-tracking branch 'origin' into pr/mattsignorelli/329

a2df56f

gdalle linked an issue Jun 25, 2024 that may be closed by this pull request

Adding GTPSA.jl to the interface #316

Open

gdalle added the backend Related to one or more autodiff backends label Jun 25, 2024

gdalle marked this pull request as draft June 25, 2024 09:05

gdalle and others added 5 commits June 25, 2024 11:06

Lighter workflow

79d6ee1

pushforward

cee4b7e

trigger workflow

b362dd9

fix tests?

1473858

fix typo

20c026a

gdalle marked this pull request as ready for review July 8, 2024 04:41

gdalle added 2 commits July 15, 2024 10:24

Merge branch 'main' into pr/mattsignorelli/329

29029b3

Merge branch 'addgtpsa' of github.com:mattsignorelli/DifferentiationI…

23ce9fd

…nterface.jl into pr/mattsignorelli/329

gdalle marked this pull request as draft July 30, 2024 14:05

mattsignorelli added 2 commits August 6, 2024 21:00

hvp done and two args done

45012c7

fix typo

3afbbb9

mattsignorelli and others added 5 commits August 7, 2024 09:56

Merge branch 'addgtpsa' into main

9943008

Merge pull request #1 from mattsignorelli/main

1493573

latest main

Update Test.yml

3039528

no code coverage

ecee93e

add GTPSA test

f0a0d79

gdalle added 3 commits August 8, 2024 17:13

Reactivate coverage

5f4c34c

Reactivate all tests

037df40

Test group

c7e1382

simplified code and removed derivative

65d5a4e

gdalle mentioned this pull request Aug 10, 2024

Leverage function_annotation for AutoEnzyme #407

Merged

Merge remote-tracking branch 'origin' into pr/mattsignorelli/329

1819835

gdalle added this to the v0.6 milestone Sep 14, 2024

gdalle added 6 commits September 17, 2024 12:55

Merge remote-tracking branch 'origin/main' into pr/mattsignorelli/329

cbf0c49

Adapt to new Tangents

8a3a239

Tangents import

f663484

Fix da fixes

a388cfc

Merge remote-tracking branch 'origin/main' into pr/mattsignorelli/329

ef77111

Remove temporary backend

bd70675

gdalle marked this pull request as draft September 18, 2024 08:37

Fixes

77e49d6

Docs

dacf84d

gdalle closed this Sep 19, 2024

gdalle reopened this Sep 19, 2024

gdalle removed this from the v0.6 milestone Sep 23, 2024

gdalle mentioned this pull request Sep 25, 2024

Adding TaylorDiff.jl #504

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add the GTPSA.jl backend #329

Add the GTPSA.jl backend #329

mattsignorelli commented Jun 24, 2024

gdalle commented Jun 25, 2024

codecov-commenter commented Jun 25, 2024 •

edited

Loading

mattsignorelli commented Aug 6, 2024

gdalle commented Aug 7, 2024

gdalle commented Aug 8, 2024

gdalle commented Aug 9, 2024

mattsignorelli commented Aug 9, 2024 •

edited

Loading

gdalle commented Sep 18, 2024

mattsignorelli commented Sep 18, 2024 •

edited

Loading

mattsignorelli commented Sep 18, 2024

gdalle commented Sep 18, 2024 •

edited

Loading

gdalle commented Sep 18, 2024

mattsignorelli commented Sep 18, 2024

gdalle commented Sep 19, 2024 •

edited

Loading

Add the GTPSA.jl backend #329

Are you sure you want to change the base?

Add the GTPSA.jl backend #329

Conversation

mattsignorelli commented Jun 24, 2024

gdalle commented Jun 25, 2024

codecov-commenter commented Jun 25, 2024 • edited Loading

Codecov Report

mattsignorelli commented Aug 6, 2024

gdalle commented Aug 7, 2024

gdalle commented Aug 8, 2024

gdalle commented Aug 9, 2024

mattsignorelli commented Aug 9, 2024 • edited Loading

gdalle commented Sep 18, 2024

mattsignorelli commented Sep 18, 2024 • edited Loading

mattsignorelli commented Sep 18, 2024

gdalle commented Sep 18, 2024 • edited Loading

gdalle commented Sep 18, 2024

mattsignorelli commented Sep 18, 2024

gdalle commented Sep 19, 2024 • edited Loading

codecov-commenter commented Jun 25, 2024 •

edited

Loading

mattsignorelli commented Aug 9, 2024 •

edited

Loading

mattsignorelli commented Sep 18, 2024 •

edited

Loading

gdalle commented Sep 18, 2024 •

edited

Loading

gdalle commented Sep 19, 2024 •

edited

Loading