Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement to_vec for matrix factorizations #128

Merged
merged 5 commits into from
Jul 29, 2021

Conversation

Kolaru
Copy link
Contributor

@Kolaru Kolaru commented Dec 25, 2020

Fixes #114 and should help cleaning the tests in JuliaDiff/ChainRules.jl#306.

@willtebbutt
Copy link
Member

LGTM. Could you please bump the patch version?

@willtebbutt
Copy link
Member

I'm happy with this, but could someone else quickly take a look to make sure I've not missed anything obvious. @oxinabox @nickrobinson251 @wesselb ?

@oxinabox
Copy link
Member

I am on leave.
If noone has looked at it before the 4th of Jan, I will look at it that week.

Copy link
Member

@wesselb wesselb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Thanks for adding this. :) I've left two small comments.

.gitignore Outdated Show resolved Hide resolved
test/to_vec.jl Outdated Show resolved Hide resolved
end
return x_vec, QRCompact_from_vec
end

# Non-array data structures
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

x.T consists of upper triangular blocks: JuliaLang/julia#41363 (comment)
The lower part is undefined on purpose, NaNs can hide there. This line won't work:
x_vec, back = to_vec([x.factors, x.T])

For application purposes it may be preferable to think of derivatives in Q and R, not in factors and T.
So to_vec would unroll Q and R into a matrix. To instantiate a new QR object from updated values, one would need to re-calculate the QR factorization:

function FiniteDifferences.to_vec(x::S) where {S <: Union{LinearAlgebra.QRCompactWYQ, LinearAlgebra.QRCompactWY}}
    x_vec, x_back = to_vec([x.Q x.R])
    function QRCompact_from_vec(v)
        Q_new, R_new = x_back(v)
        QR = S(Q_new * R_new)
        return S(QR.factors, QR.T)
    end
    return x_vec, QRCompact_from_vec
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that but unfortunately recomputing the QR factorization is enough to introduce numerical errors, and make the to_vec test fail (since it test for exact equality).

As far as I understand the internal representation used by to_vec should not change the result and it should be fine. If it is not the case, I will need to be smarter, somehow.

Copy link
Member

@simeonschaub simeonschaub Jul 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand the internal representation used by to_vec should not change the result and it should be fine. If it is not the case, I will need to be smarter, somehow.

If it contains NaNs, it definitely does. This should be reusing the logic added in JuliaLang/julia#41363, in particular _triuppers_qr, that way no numerical instabilities will be introduced.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure to understand. The numerical instability I am referring to are those that prevent are introduced by recomputing the QR decomposition internally:

julia> X = rand(10, 10) ;

julia> F = qr(X) ;

julia> Xnew = F.Q * F.R ;

julia> Xnew == X
false

julia> Fnew = qr(Xnew) ;

julia> Fnew == F
false

julia> Fnew.Q == F.Q
false

The tests require from_vec(to_vec(F)) == F which is not guaranteed there.

To deal with NaNs in T is it preferrable to change the supported version of Julia and use LinearAlgebra._triuppers_qr or to reproduce here part of the logic?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just meant that by filtering out the undef values, we get both the correct behavior and avoid any numerical issues related to having to refactorize the recalculated matrix.

To deal with NaNs in T is it preferrable to change the supported version of Julia and use LinearAlgebra._triuppers_qr or to reproduce here part of the logic?

It's probably easiest if we just copy that code from there for now. Would be good to mention in a comment that this is taken from Base though and that it can be removed once we drop support for older versions of Julia.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, a good test here would be to check that to_vec(qr(A)) == to_vec(qr(A))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I incorporated that, and added a test of NaN in test_to_vec along the way.

QR must unfortunaely be tested separately for now, because it compares the T matrices that have those arbitrary values.

@rkube
Copy link

rkube commented Jul 8, 2021

What is the status of this PR? I see that some checks are failing but the logs have expired. Do you need help with this?

@oxinabox oxinabox closed this Jul 8, 2021
@oxinabox oxinabox reopened this Jul 8, 2021
@oxinabox
Copy link
Member

oxinabox commented Jul 8, 2021

What is the status of this PR?

It is waiting for @Kolaru 's to replay to Wessel's comments.
It seems it's dropped off their radar.

It is reasonable to make another PR to branching off there branch, then rebase and fix the tests, and the comments, and we would be able to merge it.
(They would still show up in the contributors since would keep there commit, if anyone is keeping score.)

@Kolaru
Copy link
Contributor Author

Kolaru commented Jul 13, 2021

Indeed it kind of went off my radar. I think it may have been partially superseeded by #156. I will try to come back to it ASAP.

@Kolaru
Copy link
Contributor Author

Kolaru commented Jul 20, 2021

This should be ready now.

Copy link
Member

@simeonschaub simeonschaub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The QRCompactWY case is still broken, see my comment above.

@@ -162,11 +162,24 @@ function to_vec(x::Cholesky)
return x_vec, Cholesky_from_vec
end

function to_vec(x::S) where {S <: Union{LinearAlgebra.QRCompactWYQ, LinearAlgebra.QRCompactWY}}
x_vec, back = to_vec([x.factors, x.T])
function to_vec(x::S) where {U, S <: Union{LinearAlgebra.QRCompactWYQ{U}, LinearAlgebra.QRCompactWY{U}}}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure this is valid for QRCompactWYQ? If it is, it should at least be tested.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

QRCompactWYQ is represented with the same T and factors matrices than QRCompactWY, yes. I added the tests.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect, thanks!

factors, T = back(v)
return S(factors, T)
factors, Tback = back(v)
return S(factors, Tback)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that using S like this here will not be type stable. Since it's not super performance critical, it might not be worth all the trouble though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, why is that so?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test/to_vec.jl Outdated
Comment on lines 139 to 140
@test F.Q == F.Q
@test F.R == F.R
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this test? Would be good to also add the to_vec(qr(A)) == to_vec(qr(A)) test I proposed above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was an oversight from me, it should read F_back.Q == F.Q, to test that back(to_vec(qr(A))) correctly gives back the input.

Your suggested test was on the next line, I have reformulated it to make more explicit what is tested.

test/to_vec.jl Outdated Show resolved Hide resolved
Copy link
Member

@simeonschaub simeonschaub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more comment, otherwise looks good to me. Thanks for pulling through!

test/to_vec.jl Outdated Show resolved Hide resolved
Co-authored-by: Simeon Schaub <[email protected]>
@Kolaru
Copy link
Contributor Author

Kolaru commented Jul 29, 2021

I integrated the last suggestion, that should indeed be enough to cover all cases.

@simeonschaub simeonschaub merged commit 6ee273e into JuliaDiff:master Jul 29, 2021
@simeonschaub
Copy link
Member

Cool, thanks for the contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Stackoverflow with QR
6 participants