
Why use ReLU to compute additive attention #28

Open
yuboona opened this issue May 2, 2020 · 0 comments


yuboona commented May 2, 2020

1. Attention's formula

  • In the standard additive version, the attention score is:
score = v * tanh(W * [hidden; encoder_outputs])
  • In your code it is (see the sketch below):
score = v * relu(W * [hidden; encoder_outputs])
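
For concreteness, here is a minimal PyTorch sketch of the additive attention above. The `activation` argument is my own addition (not in the repository) so the tanh and relu variants can be compared directly; the shapes and layer names are assumptions for illustration.

```python
import torch
import torch.nn as nn

class AdditiveAttention(nn.Module):
    """Additive attention: score = v * activation(W * [hidden; encoder_outputs])."""

    def __init__(self, hidden_dim, activation=torch.tanh):
        super().__init__()
        # W projects the concatenated decoder state and encoder output
        self.W = nn.Linear(hidden_dim * 2, hidden_dim)
        # v maps the activated energy down to one scalar score per position
        self.v = nn.Linear(hidden_dim, 1, bias=False)
        # pass activation=torch.relu to get the variant used in this repository
        self.activation = activation

    def forward(self, hidden, encoder_outputs):
        # hidden:          (batch, hidden_dim)          decoder state
        # encoder_outputs: (batch, src_len, hidden_dim) encoder states
        src_len = encoder_outputs.size(1)
        hidden = hidden.unsqueeze(1).expand(-1, src_len, -1)
        energy = self.activation(self.W(torch.cat((hidden, encoder_outputs), dim=2)))
        scores = self.v(energy).squeeze(2)      # (batch, src_len)
        return torch.softmax(scores, dim=1)     # attention weights
```

One practical difference: tanh keeps the energies bounded in [-1, 1] and preserves negative values, while relu zeroes everything negative, so some positions' energies collapse to 0 before the `v` projection.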

2. Question

Is there some trick here, or is this the result of an experimental comparison?
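
As a quick (hypothetical) numerical illustration of where the two activations diverge before the `v` projection:

```python
import torch

e = torch.tensor([-2.0, -0.5, 0.0, 0.5, 2.0])   # example pre-activation energies
print(torch.tanh(e))  # tensor([-0.9640, -0.4621,  0.0000,  0.4621,  0.9640])
print(torch.relu(e))  # tensor([0.0000, 0.0000, 0.0000, 0.5000, 2.0000])
```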
