Update `trust_region_step_exact_` solvers #1363

YigitElma · 2024-11-13T07:06:39Z

Updates the initial guess for $\alpha$ to be 0 instead of 0.001* $\alpha$ _upper. I found out that the solution of $\alpha$ is usually pretty small. This reduces the total amount of QR's in the while loop.
Removes the last QR outside the while loop of trust_region_step_exact_qr. The loop condition already satisfies the step norm to be around trust radius.

Misc

Adds execute_on_cpu flag to desc.examples.get

github-actions · 2024-11-13T08:18:31Z

|             benchmark_name             |         dt(%)          |         dt(s)          |        t_new(s)        |        t_old(s)        | 
| -------------------------------------- | ---------------------- | ---------------------- | ---------------------- | ---------------------- |
 test_build_transform_fft_lowres         |     +0.40 +/- 4.70     | +2.13e-03 +/- 2.49e-02 |  5.33e-01 +/- 2.2e-02  |  5.31e-01 +/- 1.1e-02  |
 test_equilibrium_init_medres            |     -0.00 +/- 1.82     | -2.87e-05 +/- 7.53e-02 |  4.14e+00 +/- 6.4e-02  |  4.14e+00 +/- 4.0e-02  |
 test_equilibrium_init_highres           |     +0.69 +/- 1.06     | +3.73e-02 +/- 5.73e-02 |  5.46e+00 +/- 3.9e-02  |  5.42e+00 +/- 4.2e-02  |
 test_objective_compile_dshape_current   |     +0.11 +/- 1.31     | +4.34e-03 +/- 5.16e-02 |  3.95e+00 +/- 3.4e-02  |  3.94e+00 +/- 3.9e-02  |
 test_objective_compute_dshape_current   |     +0.11 +/- 1.31     | +5.55e-06 +/- 6.72e-05 |  5.13e-03 +/- 4.2e-05  |  5.13e-03 +/- 5.2e-05  |
 test_objective_jac_dshape_current       |     -2.76 +/- 5.23     | -1.21e-03 +/- 2.28e-03 |  4.24e-02 +/- 1.3e-03  |  4.36e-02 +/- 1.9e-03  |
 test_perturb_2                          |     +0.13 +/- 1.75     | +2.57e-02 +/- 3.44e-01 |  1.97e+01 +/- 2.7e-01  |  1.97e+01 +/- 2.2e-01  |
 test_proximal_freeb_jac                 |     -0.17 +/- 1.19     | -1.25e-02 +/- 8.80e-02 |  7.40e+00 +/- 7.6e-02  |  7.42e+00 +/- 4.5e-02  |
 test_solve_fixed_iter                   |     -4.16 +/- 1.77     | -1.38e+00 +/- 5.86e-01 |  3.18e+01 +/- 3.6e-01  |  3.31e+01 +/- 4.6e-01  |
 test_LinearConstraintProjection_build   |     +0.75 +/- 2.75     | +7.66e-02 +/- 2.82e-01 |  1.03e+01 +/- 2.0e-01  |  1.03e+01 +/- 2.0e-01  |
 test_build_transform_fft_midres         |     +0.62 +/- 2.91     | +3.80e-03 +/- 1.79e-02 |  6.21e-01 +/- 1.5e-02  |  6.17e-01 +/- 9.4e-03  |
 test_build_transform_fft_highres        |     +0.18 +/- 2.79     | +1.75e-03 +/- 2.74e-02 |  9.83e-01 +/- 1.4e-02  |  9.81e-01 +/- 2.4e-02  |
 test_equilibrium_init_lowres            |     -0.65 +/- 1.75     | -2.54e-02 +/- 6.88e-02 |  3.92e+00 +/- 4.0e-02  |  3.94e+00 +/- 5.6e-02  |
 test_objective_compile_atf              |     +0.72 +/- 3.91     | +5.86e-02 +/- 3.18e-01 |  8.19e+00 +/- 6.2e-02  |  8.13e+00 +/- 3.1e-01  |
 test_objective_compute_atf              |     +0.44 +/- 3.08     | +7.02e-05 +/- 4.90e-04 |  1.60e-02 +/- 3.7e-04  |  1.59e-02 +/- 3.2e-04  |
 test_objective_jac_atf                  |     +0.91 +/- 2.53     | +1.79e-02 +/- 4.97e-02 |  1.99e+00 +/- 4.0e-02  |  1.97e+00 +/- 2.9e-02  |
 test_perturb_1                          |     +0.05 +/- 1.07     | +8.17e-03 +/- 1.61e-01 |  1.50e+01 +/- 8.0e-02  |  1.49e+01 +/- 1.4e-01  |
 test_proximal_jac_atf                   |     -0.32 +/- 0.72     | -2.65e-02 +/- 5.97e-02 |  8.26e+00 +/- 4.4e-02  |  8.29e+00 +/- 4.1e-02  |
 test_proximal_freeb_compute             |     +0.18 +/- 1.14     | +3.71e-04 +/- 2.29e-03 |  2.02e-01 +/- 2.0e-03  |  2.01e-01 +/- 1.2e-03  |
+test_solve_fixed_iter_compiled          |     -7.13 +/- 1.19     | -1.58e+00 +/- 2.63e-01 |  2.05e+01 +/- 1.1e-01  |  2.21e+01 +/- 2.4e-01  |

f0uriest · 2024-11-13T17:37:31Z

isn't this the same as the cho option but using LU instead of cholesky?

YigitElma · 2024-11-13T17:57:11Z

isn't this the same as the cho option but using LU instead of cholesky?

Yes, pretty similar. jax.scipy.linlag.solve with symmetric A should call LAPACK ?sysv (docs), and it is something like LU. For finding alpha, I used Secant method.

Cholesky is also really fast (maybe faster than this one), but maybe this is more accurate, that's why I wanted to test it. If I remember correctly the problem with Cholesky was the accuracy.

f0uriest · 2024-11-13T18:02:26Z

Generally the innaccuracy comes from explitly forming J^T J when J is poorly conditioned, so I don't think you'll see any major improvements with this. Also cholesky should be faster.

…e-10 factor

…e/tr_direct

dpanici · 2024-11-20T18:36:21Z

run on multiple threads

…e/tr_direct

codecov · 2024-11-25T03:55:09Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 95.61%. Comparing base (93d2564) to head (fad3062).
Report is 43 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1363      +/-   ##
==========================================
- Coverage   95.61%   95.61%   -0.01%     
==========================================
  Files          98       98              
  Lines       25420    25417       -3     
==========================================
- Hits        24306    24303       -3     
  Misses       1114     1114

Files with missing lines	Coverage Δ
desc/examples/__init__.py	`100.00% <100.00%> (ø)`
desc/optimize/tr_subproblems.py	`99.43% <100.00%> (-0.02%)`	⬇️

... and 3 files with indirect coverage changes

YigitElma · 2024-12-12T23:51:21Z

tests/test_optimizer.py

@@ -1414,7 +1414,8 @@ def test_optimize_coil_currents(DummyCoilSet):
        verbose=2,
        copy=True,
    )
-    # check that optimized coil currents changed by more than 15% from initial values
+    # check that on average optimized coil currents changed by more than


#1406 made the same change I used here, the change of comment is from my version.

YigitElma · 2024-12-17T05:05:26Z

It is a bit hard to give proper speed improvement from this PR. It depends on the case. For example, currently our longest regression test, test_ATF_results used to take 4.5 mins on my laptop with GPU, now it takes 3mins. The final equilibrium is slightly different around the axis but the force error is lower.

eq0 = get("ATF")
eq = Equilibrium(
    Psi=eq0.Psi,
    NFP=eq0.NFP,
    L=eq0.L,
    M=eq0.M,
    N=eq0.N,
    L_grid=eq0.L_grid,
    M_grid=eq0.M_grid,
    N_grid=eq0.N_grid,
    pressure=eq0.pressure,
    iota=eq0.iota,
    surface=eq0.get_surface_at(rho=1),
    sym=eq0.sym,
    spectral_indexing=eq0.spectral_indexing,
)
eqf = EquilibriaFamily.solve_continuation_automatic(
    eq, verbose=2, jac_chunk_size=500
)

Force error after pr: 1.1110e-04
Force error after master: 2.7513e-04

However, the HELIOTRON test is not affected as much as ATF. 2min 42 sec to 2min 30sec.

dpanici · 2024-12-17T15:50:15Z

It is a bit hard to give proper speed improvement from this PR. It depends on the case. For example, currently our longest regression test, test_ATF_results used to take 4.5 mins on my laptop with GPU, now it takes 3mins. The final equilibrium is a bit different but the force error is lower.
Force error after pr: 1.1110e-04
Force error after master: 2.7513e-04
However, the HELIOTRON test is not affected as much as ATF. 2min 42 sec to 2min 30sec.

was the difference in ATF due to some bug?

YigitElma · 2024-12-17T17:10:51Z

was the difference in ATF due to some bug?

I don't think it is a bug. Just slight numerical differences cause small discrepancy around axis region of ATF.

add new trust region subproblem solver

ee24eeb

YigitElma added performance New feature or request to make the code faster enhancement General label for enhancement. Please also tag with "Speed", "Interface", "Functionality", etc labels Nov 13, 2024

YigitElma self-assigned this Nov 13, 2024

YigitElma added 3 commits November 13, 2024 02:33

update initial alpha

ae15c72

apply limits to alpha

ad5b29b

add execute_on_cpu decorator to des.example.get

141523d

YigitElma added 2 commits November 13, 2024 12:48

fix alpha updating

24f7b42

update docs

cce520e

YigitElma and others added 5 commits November 13, 2024 15:18

reduce rtol, not a fix but improves some tests

2854dac

use qr for main decomposition, use direct for tr subproblem, remove 1…

6b02b61

…e-10 factor

Merge branch 'master' into yge/tr_direct

4986cce

reduce rtol again

cac7612

Merge branch 'yge/tr_direct' of github.com:PlasmaControl/DESC into yg…

433768e

…e/tr_direct

dpanici and others added 9 commits November 20, 2024 13:36

Merge branch 'master' into yge/tr_direct

1a1b977

full direct

4954c73

Merge branch 'yge/tr_direct' of github.com:PlasmaControl/DESC into yg…

0c63bf7

…e/tr_direct

probably a stupid idea but anyway

fa6a733

at least try regularized

2b29060

Merge branch 'master' into yge/tr_direct

1e27266

update

22421ac

change initial alpha for qr

bd9fab3

fix typo

aa62a0a

remove test method

2a2e57c

YigitElma force-pushed the yge/tr_direct branch from 2a6c99a to e64f764 Compare December 12, 2024 16:23

YigitElma and others added 3 commits December 12, 2024 11:23

Merge branch 'master' into yge/tr_direct

1512f87

test for unit test fix

2b23a5a

Merge branch 'master' into yge/tr_direct

8b83abd

YigitElma commented Dec 12, 2024

View reviewed changes

Merge branch 'master' into yge/tr_direct

11e8f57

dpanici previously approved these changes Dec 13, 2024

View reviewed changes

Merge branch 'master' into yge/tr_direct

75aa18a

YigitElma dismissed dpanici’s stale review via 75aa18a December 14, 2024 02:56

YigitElma marked this pull request as draft December 14, 2024 19:03

YigitElma added 2 commits December 16, 2024 22:28

Merge remote-tracking branch 'origin/master' into yge/tr_direct

043c4c4

apply same changes to svd and cho, add docs for QR method

8bdb064

YigitElma marked this pull request as ready for review December 17, 2024 03:51

YigitElma requested review from f0uriest and dpanici December 17, 2024 03:51

fix typo

cccc9d0

YigitElma added 4 commits December 17, 2024 01:30

revert changes to svd subproblem solver

0b590f5

fix missing parts

56b01ff

apply to SVD again, using wrong alpha cause the problem

1d31b09

make the same fix to cholesky solver

d774d77

YigitElma changed the title ~~Update trust_region_step_exact_qr~~ Update trust_region_step_exact_ solvers Dec 17, 2024

f0uriest approved these changes Dec 17, 2024

View reviewed changes

Merge branch 'master' into yge/tr_direct

fad3062

dpanici approved these changes Dec 18, 2024

View reviewed changes

YigitElma merged commit 34dbac0 into master Dec 18, 2024
25 checks passed

YigitElma deleted the yge/tr_direct branch December 18, 2024 23:45

YigitElma mentioned this pull request Dec 20, 2024

Update notebooks with new stopping criteria #1425

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update `trust_region_step_exact_` solvers #1363

Update `trust_region_step_exact_` solvers #1363

YigitElma commented Nov 13, 2024 •

edited

Loading

github-actions bot commented Nov 13, 2024 •

edited

Loading

f0uriest commented Nov 13, 2024

YigitElma commented Nov 13, 2024 •

edited

Loading

f0uriest commented Nov 13, 2024

dpanici commented Nov 20, 2024

codecov bot commented Nov 25, 2024 •

edited

Loading

YigitElma Dec 12, 2024

YigitElma commented Dec 17, 2024 •

edited

Loading

dpanici commented Dec 17, 2024

YigitElma commented Dec 17, 2024

Update trust_region_step_exact_ solvers #1363

Update trust_region_step_exact_ solvers #1363

Conversation

YigitElma commented Nov 13, 2024 • edited Loading

Misc

github-actions bot commented Nov 13, 2024 • edited Loading

f0uriest commented Nov 13, 2024

YigitElma commented Nov 13, 2024 • edited Loading

f0uriest commented Nov 13, 2024

dpanici commented Nov 20, 2024

codecov bot commented Nov 25, 2024 • edited Loading

Codecov Report

YigitElma Dec 12, 2024

Choose a reason for hiding this comment

YigitElma commented Dec 17, 2024 • edited Loading

dpanici commented Dec 17, 2024

YigitElma commented Dec 17, 2024

Update `trust_region_step_exact_` solvers #1363

Update `trust_region_step_exact_` solvers #1363

YigitElma commented Nov 13, 2024 •

edited

Loading

github-actions bot commented Nov 13, 2024 •

edited

Loading

YigitElma commented Nov 13, 2024 •

edited

Loading

codecov bot commented Nov 25, 2024 •

edited

Loading

YigitElma commented Dec 17, 2024 •

edited

Loading