I tried using DependencyCRF in a learning setting that required differentiating through the marginals, and this turned out to be really difficult. The gradients computed for the marginals tended to have high variance and to be larger than I would expect (though I haven't dug into the Eisner algorithm yet).
I wonder whether this is a feature of the Eisner algorithm or might hint at a bug?
Below is a minimal example showing that the maximum gradient returned for the arc scores can be quite large, even when the scores themselves are on a reasonable scale.
Hi, sorry for the long delay here. I'm going to try to add some tests to make sure it is returning the right values. I don't have a good sense yet of whether this is a bug, underflow, or correct behavior in this case.
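One possible shape for such a test, as a rough sketch rather than anything taken from the library: for a tiny sentence, enumerate every tree by brute force, compute the exact marginals, and compare the derivative of one marginal against a finite difference. Everything below is an assumption made for illustration: the helper names are hypothetical, the tree set is taken to be projective trees with multiple root attachments allowed, and the scores are treated as log-potentials indexed as `scores[head][modifier]`. It does not call torch-struct. It relies on one standard fact: in a CRF, the derivative of the marginal of arc a with respect to the score of arc b equals the covariance of the two arc indicators, so each individual entry is at most 0.25 in magnitude.

```python
import itertools
import math
import random


def is_tree(heads):
    """heads[i-1] is the head of word i (1-based); 0 is the root symbol."""
    for i in range(1, len(heads) + 1):
        seen, node = set(), i
        while node != 0:          # walk up to the root, failing on a cycle
            if node in seen:
                return False
            seen.add(node)
            node = heads[node - 1]
    return True


def is_projective(heads):
    arcs = [(min(h, m), max(h, m)) for m, h in enumerate(heads, start=1)]
    return not any(a < c < b < d for a, b in arcs for c, d in arcs)


def enumerate_trees(n):
    """All projective trees over n words (assumption: multi-root allowed)."""
    return [t for t in itertools.product(range(n + 1), repeat=n)
            if is_tree(t) and is_projective(t)]


def tree_probs(scores, trees):
    # Log-weight of a tree is the sum of its arc scores; normalize stably.
    logw = [sum(scores[h][m] for m, h in enumerate(t, start=1)) for t in trees]
    mx = max(logw)
    z = sum(math.exp(w - mx) for w in logw)
    return [math.exp(w - mx) / z for w in logw]


def marginals(scores, trees):
    n = len(trees[0])
    mu = [[0.0] * (n + 1) for _ in range(n + 1)]
    for p, t in zip(tree_probs(scores, trees), trees):
        for m, h in enumerate(t, start=1):
            mu[h][m] += p
    return mu


def marginal_grad(scores, trees, arc_a, arc_b):
    """d mu[arc_a] / d scores[arc_b] = Cov(1[arc_a], 1[arc_b])."""
    probs = tree_probs(scores, trees)
    in_a = [t[arc_a[1] - 1] == arc_a[0] for t in trees]
    in_b = [t[arc_b[1] - 1] == arc_b[0] for t in trees]
    p_a = sum(p for p, x in zip(probs, in_a) if x)
    p_b = sum(p for p, y in zip(probs, in_b) if y)
    p_ab = sum(p for p, x, y in zip(probs, in_a, in_b) if x and y)
    return p_ab - p_a * p_b


random.seed(0)
n = 3
trees = enumerate_trees(n)
scores = [[random.gauss(0.0, 1.0) for _ in range(n + 1)] for _ in range(n + 1)]
arc_a, arc_b = (0, 1), (1, 2)   # arcs are (head, modifier) pairs

analytic = marginal_grad(scores, trees, arc_a, arc_b)

# Central finite difference on the score of arc_b.
eps = 1e-5
plus = [row[:] for row in scores]
minus = [row[:] for row in scores]
plus[arc_b[0]][arc_b[1]] += eps
minus[arc_b[0]][arc_b[1]] -= eps
numeric = (marginals(plus, trees)[0][1] - marginals(minus, trees)[0][1]) / (2 * eps)

print("analytic:", analytic, "finite difference:", numeric)
```

If the library's marginals and their autograd gradients agree with a reference like this on short sentences, large values seen in practice would more likely come from the loss summing many such entries (or from the scale of the upstream gradient) than from a single incorrect derivative. The bound of 0.25 applies per entry only, not to the gradient of an arbitrary loss built on the marginals.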