This repository has been archived by the owner on Jan 15, 2024. It is now read-only.
Description
The implementation of incremental decoding in gluon-nlp differs from the one in fairseq. In fairseq, the keys/values are cached both before and after the linear projection, but in gluon-nlp only the keys/values before the linear projection are cached. As a result, the two implementations execute a different number of FC operators: in fairseq, the projected keys/values are pulled directly from prev_keys/prev_values, whereas in gluon-nlp two extra linear projections are needed to recompute the projected keys/values. We may need to correct gluon-nlp's implementation of incremental decoding.
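To make the difference concrete, here is a minimal pure-Python sketch of the two caching strategies. It is not code from either library: `CountingFC`, `decode_cache_unprojected`, and `decode_cache_projected` are hypothetical names, and the sketch assumes for simplicity that the key/value inputs stay fixed across decoding steps (as encoder outputs would), which the description above does not state explicitly.

```python
import random


class CountingFC:
    """Bias-free fully connected layer that counts how often it is executed."""

    def __init__(self, d_in, d_out, rng):
        # Weight matrix stored as d_in rows of d_out values.
        self.w = [[rng.uniform(-1, 1) for _ in range(d_out)] for _ in range(d_in)]
        self.calls = 0

    def __call__(self, x):
        # x is a list of rows, each of length d_in.
        self.calls += 1
        return [[sum(a * b for a, b in zip(row, col)) for col in zip(*self.w)]
                for row in x]


def decode_cache_unprojected(mem, k_proj, v_proj, num_steps):
    """Cache only the raw keys/values and re-project them at every step."""
    cache = {"keys": mem, "values": mem}
    outputs = []
    for _ in range(num_steps):
        keys = k_proj(cache["keys"])      # extra FC execution on every step
        values = v_proj(cache["values"])  # extra FC execution on every step
        outputs.append((keys, values))
    return outputs


def decode_cache_projected(mem, k_proj, v_proj, num_steps):
    """Cache the projected keys/values and reuse them on later steps."""
    cache = {}
    outputs = []
    for _ in range(num_steps):
        if "prev_keys" not in cache:
            # Project once, on the first step only.
            cache["prev_keys"] = k_proj(mem)
            cache["prev_values"] = v_proj(mem)
        outputs.append((cache["prev_keys"], cache["prev_values"]))
    return outputs


# Hypothetical sizes: 3 source positions, hidden size 4, 5 decoding steps.
mem = [[random.Random(0).uniform(-1, 1) for _ in range(4)] for _ in range(3)]

# Same seeds give both variants identical projection weights.
k_a, v_a = CountingFC(4, 4, random.Random(1)), CountingFC(4, 4, random.Random(2))
k_b, v_b = CountingFC(4, 4, random.Random(1)), CountingFC(4, 4, random.Random(2))

out_a = decode_cache_unprojected(mem, k_a, v_a, num_steps=5)
out_b = decode_cache_projected(mem, k_b, v_b, num_steps=5)

fc_calls_unprojected = k_a.calls + v_a.calls  # 10: two FC executions per step
fc_calls_projected = k_b.calls + v_b.calls    # 2: two FC executions in total
```

Under these assumptions both variants produce the same projected keys/values at every step, so the only difference is the number of FC executions, which grows with the number of decoding steps when only the unprojected tensors are cached.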