
change the mask to negative infinite according to paper, and shift th… #3

Open
wants to merge 1 commit into base: main

Conversation

Blockhead-yj

…e response when training model to avoid label leakage

Hi, Shivanandmn! Thank you for sharing this.
I trained a model on the Riiid dataset using your original code, but there appears to be label leakage: accuracy on both the training set and the validation set is close to 100%.
image

After checking your code, I made some changes. I shifted the input response matrix by one position and added a start token "2" in the first column, so that the model can only access previous responses rather than the current one. After this change I retrained the model, and it worked: it achieved 92.9% accuracy on the training set and 72.2% accuracy on the validation set, which is consistent with the SAINT+ paper.
image
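The shift described above can be illustrated with a minimal sketch. This is not the code from the pull request: plain Python lists stand in for tensors, the helper name `shift_responses` is hypothetical, and responses are assumed to be encoded as 0/1 with "2" as the start token mentioned in the comment.

```python
START_TOKEN = 2  # start token from the comment above; responses assumed to be 0/1


def shift_responses(responses, start_token=START_TOKEN):
    """Shift each response sequence one step right and prepend a start token.

    Position t of the result holds the response to question t-1, so the
    model input at position t never contains the label for question t.
    """
    return [[start_token] + list(row[:-1]) for row in responses]


# Two users with four interactions each (1 = correct, 0 = incorrect).
responses = [
    [1, 0, 1, 1],
    [0, 0, 1, 0],
]

decoder_input = shift_responses(responses)  # what the model is allowed to see
labels = responses                          # targets stay unshifted

for inp, lab in zip(decoder_input, labels):
    print("input:", inp, "labels:", lab)
```

With unshifted inputs, position t would carry the very label the model is asked to predict at position t, which would explain the near-100% accuracy reported above.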

I'm not sure whether I have made myself clear, as my English is poor. If you have any questions, please let me know.

@mbenami

mbenami commented Dec 9, 2021

Hi @Blockhead-yj, thanks for this fix.
I had the same issue.

By the way, do you know how I should use the model for prediction?
Say I have a new user with N interactions, and I would like to predict that user's result
on question N+1 and category N+1.
Should I feed those N+1 questions and categories (and the N answers) to the model and just look at the last value of the output?

Something like this?

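import torch

# x and y are already preprocessed; `model` is the trained model
def predict(x, y):
    with torch.no_grad():  # inference only, no gradients needed
        out = torch.sigmoid(model(x, y))
    return out[-1][-1]  # last value of the output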

Thanks again

@Blockhead-yj
Author

Yes, as far as I understand this model. @mbenami
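A rough sketch of how the inputs for such a prediction might be assembled, assuming the shifted-response scheme from this pull request. Everything here is illustrative: `build_inputs`, `predict_next`, and `dummy_model` are hypothetical names, plain lists stand in for tensors, the category input is omitted for brevity, and the model is assumed to return one logit per position.

```python
import math

START_TOKEN = 2  # start token used for the shifted responses


def build_inputs(past_questions, past_answers, next_question):
    """Assemble one sequence of length N+1 for predicting the next answer."""
    assert len(past_questions) == len(past_answers)
    questions = list(past_questions) + [next_question]  # N+1 questions
    responses = [START_TOKEN] + list(past_answers)      # start token + N answers
    return questions, responses


def predict_next(model, past_questions, past_answers, next_question):
    """Return the predicted probability of a correct answer on question N+1."""
    questions, responses = build_inputs(past_questions, past_answers, next_question)
    logits = model(questions, responses)         # assumed: one logit per position
    return 1.0 / (1.0 + math.exp(-logits[-1]))   # sigmoid of the last position


def dummy_model(questions, responses):
    # Hypothetical stand-in, NOT the repository's model: a zero logit per position.
    return [0.0 for _ in questions]


p = predict_next(dummy_model, [10, 11, 12], [1, 0, 1], 13)
print(p)  # 0.5, since the dummy model always returns a zero logit
```

Because the responses are shifted, the last position sees all N known answers together with question N+1, so only the final output value is relevant for the new prediction.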

@ZhuoxueQAQ

@Blockhead-yj I had the same problem. Respect!

@xjtu-ygq

xjtu-ygq commented Dec 6, 2022

My answer is yes according to my understanding of this model. @mbenami

Hello, this code only includes the train and validation parts. Is there any test (prediction) code you could share? Thank you very much!

@Blockhead-yj
Author

My answer is yes according to my understanding of this model. @mbenami

Hello, this code only includes the train and validation parts. Is there any test (prediction) code you could share? Thank you very much!

4 participants