
Ability to deal with matrix input #47

Merged
merged 9 commits into from
Sep 5, 2023

Conversation

mBarreau
Contributor

This PR adds the ability to handle matrix input, since this can be helpful in the context of PINNs. The main modifications are:

  • Use mapcols in derivative with a matrix input
  • Allow different types for x and l in derivative
  • Add tests.
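As a rough illustration of the column-mapping pattern behind the first bullet, here is a minimal stand-in for what a `mapcols`-style helper does (the helper name is hypothetical; the PR's actual code lives in src/derivative.jl): apply a column-to-value function to each column of a matrix and reassemble the results into a row matrix.

```julia
# Hypothetical sketch, not the PR's code: map a function over the columns
# of a matrix and reassemble the results into a 1-by-N matrix.
mapcols_sketch(f, X::AbstractMatrix) = reduce(hcat, (f(col) for col in eachcol(X)))

X = [1 2 3;
     4 5 6]
mapcols_sketch(sum, X)                 # 1-by-3 matrix: [5 7 9]
```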

@tansongchen
Member

Could you explain mathematically what you are trying to do here? We might have a more general solution to this...

@mBarreau
Contributor Author

mBarreau commented Aug 28, 2023

@tansongchen
Sure, what I want to do is simple.
Let
[equation image]
Then
[equation image]

Is there a reason for the types of x and l in derivative to be the same? If both were independent subtypes of AbstractVector{T}, that would allow more freedom in the syntax :)

@codecov

codecov bot commented Aug 28, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.29% 🎉

Comparison is base (7979d76) 85.18% compared to head (6ad719f) 85.48%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #47      +/-   ##
==========================================
+ Coverage   85.18%   85.48%   +0.29%     
==========================================
  Files           6        6              
  Lines         243      248       +5     
==========================================
+ Hits          207      212       +5     
  Misses         36       36              
| Files Changed | Coverage | Δ |
| --- | --- | --- |
| src/derivative.jl | 100.00% <100.00%> | (ø) |


@mBarreau
Contributor Author

@tansongchen , tests pass now.
Is that fine with you?

@tansongchen
Member

Thanks for the contribution, but I'm still preparing for an exam these days 😂 I will get back to you and take a closer look on Friday or this weekend!

@tansongchen
Member

tansongchen commented Sep 1, 2023

Let me try to understand the point of these additional differentiation APIs.

Currently, there are two methods, derivative(f, x0, order) calculates the higher-order derivative

$$ \frac{\mathrm d^kf}{\mathrm dx^k}\big |_{x_0} $$

and derivative(f, x0, l, order) calculates the higher-order directional derivative in direction l

$$ \frac{\partial^kf}{\partial l^k}\big |_{x_0} $$

And now, you add two additional methods, which say that:

  1. For a 1-by-N matrix input x, the function should calculate the derivative at each of its components, and then assemble the output back into a 1-by-N matrix;
  2. For an M-by-N matrix input x and an M-sized vector l, the function should calculate the directional derivative at each of its columns, and then assemble the output back into a 1-by-N matrix.
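The two behaviors enumerated above can be sketched analytically, without TaylorDiff (all names here are illustrative, not the package's API): for a known function the derivative is written out by hand, and the sketch only demonstrates the map-over-components and map-over-columns shape conventions.

```julia
using LinearAlgebra

# 1. Scalar function mapped over a 1-by-N matrix of points.
#    For f(x) = x^3, the derivative is 3x^2.
fprime(x) = 3x^2
elementwise(X::AbstractMatrix) = fprime.(X)

X1 = [1.0 2.0 3.0]                     # 1-by-3 input
elementwise(X1)                        # 1-by-3 output: [3.0 12.0 27.0]

# 2. Directional derivative, one value per column.
#    For g(x) = dot(x, x), the derivative along l is 2 * dot(x, l).
g_dir(x, l) = 2 * dot(x, l)
percolumn(X, l) = reduce(hcat, [g_dir(X[:, j], l) for j in 1:size(X, 2)])

X2 = [1.0 2.0;
      3.0 4.0]                         # 2-by-2: two column points
percolumn(X2, [1.0, 1.0])              # 1-by-2 output: [8.0 12.0]
```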

Is this correct? If so, I'm happy with this kind of shorthand notation, as long as it proves handy in PINN applications. But I would prefer not to use Union types and to move the new APIs into a new block, as well as add some comments stating that they are just shorthands for multiple calculations; otherwise they might be confused with matrix derivatives (see https://en.wikipedia.org/wiki/Matrix_calculus ). If you agree on that, I will take care of moving the code and adding comments, and then merge.

@mBarreau
Contributor Author

mBarreau commented Sep 1, 2023

Hi,

First of all, you are totally correct about what I aim to do.

Let me justify it. You define the Flux/Lux model, apply it to the input, then associate it with the targets and build your loss function.
The idea is to do the same with the physics residual. Since the rand function outputs a 1xN matrix, it is very convenient to define a residual model that behaves like the original model (input nxN, output MxN), so that you can build the loss in the exact same way.
This also makes it simpler to resample or to build more complex losses.

If you agree, then I would definitely support such an idea :) (and even write a small tutorial to show how easy it becomes to write complex PINNs using TaylorDiff).
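The workflow described above can be sketched as follows. This is a hypothetical toy, not code from the PR: the "model" and "residual" are simple closed-form stand-ins for a Flux/Lux network and a PDE residual, and the point is only that both losses are assembled the same way because sampling with rand yields 1-by-N matrices and every stage preserves that shape.

```julia
# Toy stand-ins (illustrative only): a "network" and a "physics residual"
# that both map a 1-by-N matrix to a 1-by-N matrix.
model(x) = sin.(x)                      # stand-in for a Flux/Lux network
residual(x) = model(x) .+ 0.1 .* x      # stand-in physics residual, same shape

x_data = rand(1, 16)                    # 1-by-16 training inputs
y_data = sin.(x_data)                   # targets

x_phys = rand(1, 64)                    # 1-by-64 collocation points

# Data loss and physics loss are built in the exact same way.
mse(a, b) = sum(abs2, a .- b) / length(a)
loss = mse(model(x_data), y_data) + mse(residual(x_phys), zero(x_phys))
```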

@tansongchen
Member

Just did some cleaning up and added some comments! Once CI passes I will merge. Thanks again for the contribution!

@mBarreau
Contributor Author

mBarreau commented Sep 4, 2023

@tansongchen, can I ask why you write
`AbstractMatrix{T} where T <: Number`
and not
`AbstractMatrix{<:Number}`?
The second option looks simpler to read and shorter.

@tansongchen
Member

They are equivalent when there is only one type parameter and the parameter is not used in the function body. However, when there are two or more, an explicit variable name helps to tell whether two types can differ or not. Also, in the make_taylor function the type parameter is used for explicit conversion.

So, for consistency with the more complicated cases, I personally prefer to write all type parameters as variables :)

@tansongchen tansongchen merged commit f776b85 into JuliaDiff:main Sep 5, 2023
6 checks passed