LBFGS #23
Comments
Hi Albert, Thanks for the bug report. You are right, this normalization/denormalization makes no sense. The l-bfgs approach originally used the Pseudo-Hessian as Hessian approximation, where I normalized the Hessian approximation and denormalized the search direction from l-bfgs similar to Brossier (2011): Best regards, Daniel |
But if we use Best regards, |
Hi Albert, Technically, you have to use the gradients, not the preconditioned gradients, in the l-bfgs recursion loops (see p. 95+96 in my georadar fwi presentation). You can then either compute an initial Hessian approximation based on model and gradient differences (p. 95) or use the Pseudo-Hessian (p. 96). However, the l-bfgs optimization also seems to converge when using preconditioned gradients instead of pure gradients. Best regards, Daniel |
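For readers following the thread, here is a minimal sketch of the l-bfgs two-loop recursion described above, fed with plain (unpreconditioned) gradients. It is not the DENISE implementation: the function names `dot` and `lbfgs_direction` are hypothetical, and the initial inverse Hessian is assumed to be the common scalar scaling gamma = (s·y)/(y·y) built from the latest model and gradient differences. A Pseudo-Hessian would replace that scaling step.

```python
def dot(a, b):
    """Inner product of two equally long sequences of floats."""
    return sum(x * y for x, y in zip(a, b))


def lbfgs_direction(grad, s_hist, y_hist):
    """Return the l-bfgs search direction -H * grad (two-loop recursion).

    grad   : current gradient (not preconditioned)
    s_hist : model differences s_i = m_{i+1} - m_i, oldest first
    y_hist : gradient differences y_i = g_{i+1} - g_i, oldest first
    """
    q = list(grad)
    rhos = [1.0 / dot(y, s) for s, y in zip(s_hist, y_hist)]

    # First loop: run from the newest stored pair back to the oldest.
    alphas = []
    for s, y, rho in zip(reversed(s_hist), reversed(y_hist), reversed(rhos)):
        alpha = rho * dot(s, q)
        alphas.append(alpha)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]

    # Initial inverse Hessian approximation (assumption: scaled identity
    # from the most recent model and gradient differences).
    if s_hist:
        gamma = dot(s_hist[-1], y_hist[-1]) / dot(y_hist[-1], y_hist[-1])
    else:
        gamma = 1.0  # no history yet: fall back to steepest descent
    r = [gamma * qi for qi in q]

    # Second loop: run from the oldest stored pair to the newest.
    for s, y, rho, alpha in zip(s_hist, y_hist, rhos, reversed(alphas)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]

    return [-ri for ri in r]
```

With an empty history the function returns the negative gradient. With stored pairs that satisfy y·s > 0, the result is a descent direction that a line search can then scale.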
I got it. Thank you Daniel. Best, Albert |
Hi Albert, Your bug reports are always welcome. I have some questions:
The problem with the sign change between p- and y-component data is interesting. I will take a closer look into it. Best regards, Daniel |
Hi Daniel,
Best regards, Albert |
Hi Albert, Thanks for the info. I just calculated the gradients for the 2D acoustic problem in pressure-velocity formulation according to The resulting gradients without data integration should look like this: The code modifications are already uploaded to the Github repository. The acoustic FWI results for the Marmousi-2 model using PCG, GRAD_FORM=2 and the workflow from the Marmousi-Quickstart tutorial, starting from the 1D initial model ... ... look reasonable for the Vp-model (right) and quite crappy for the density model (left). One reason might be that the data integration in the GRAD_FORM=1 gradients introduces low frequency content, which is missing in GRAD_FORM=2. Another issue is the simple inverse Hessian approximation. Next, I will check how the l-bfgs optimization converges. If you suppress the density model updates in the workflow file, the density gradients are simply set to zero, but the number of parameter classes does not change. I will also check whether removing the density updates from the l-bfgs optimization actually improves its convergence. Best regards, Daniel |
Hi Albert, I have run the same Marmousi-2 test problem as above using l-BFGS instead of PCG optimization for the GRAD_FORM=2 gradients, inverting pressure component data. The resolution of the resulting vp model is slightly improved, while the density model still looks crappy, even though in a different way. Inverting only for the vp model by setting INV_RHO_ITER = 1000 at each inversion stage in the FWI workflow file improves the resolution significantly. Of course, I used unrealistic ideal conditions for this FWI run, including a low pass filtered spike wavelet with low frequency content and a 1D initial model that does not introduce cycle-skipping. Nevertheless, the FWI seems to converge, so the l-BFGS implementation in DENISE for the acoustic problem seems to be correct, but might fail depending on the problem. It is still quite surprising that the density inversion converges much better when using vx-vy-component data and GRAD_FORM=1, so I would not rule out an implementation issue for the GRAD_FORM=2 density gradient. Best regards, Daniel |
Hi Daniel, Thank you for your great effort in checking those issues. I guess the unstable simulation occurs in the cycle-skipping case. I realized that the heavily Gaussian-smoothed initial model and the single, relatively broad frequency band (e.g., 3-10 Hz) in my experiments may lead to cycle-skipping. Best regards,
Hi Daniel,
I feel puzzled about the normalization and denormalization of gradients in LBFGS. Why are the gradients multiplied by C_vp and C_rho in both normalization and denormalization? It will make the backtracking line search fail.