
Evaluation is incorrect because it can see the label. [BUG] #738

Closed
korchi opened this issue Aug 8, 2023 · 1 comment
Labels
bug Something isn't working status/needs-triage

Comments

korchi commented Aug 8, 2023

Bug description

When `trainer.evaluate()` is called, the model can see all the inputs, including the targets, whose embeddings influence all of the latent embeddings. I believe the targets should be truncated to simulate the production environment.

Steps/Code to reproduce bug

  1. Take any model and any sequence from a dataset.
  2. To evaluate the model in a production-like environment, split the sequence into `input, target = sequence[:-1], sequence[-1]`, run `pred = trainer.evaluate(input_dataset).predictions[0]` (on `input_dataset` created from the `input` sequence), and compute `recall_simulated = recall(target, pred)`.
  3. Evaluate the model directly: `recall_eval = trainer.evaluate(sequence)`.
  4. The result `recall_eval.recall` differs from `recall_simulated`, which it shouldn't.
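For clarity, the `recall` metric used in the comparison above can be sketched as a plain next-item recall@k: 1 if the held-out target appears in the model's top-k predictions, 0 otherwise. This is an illustrative helper, not the library's metric implementation; the item ids are made up.

```python
def recall_at_k(target, ranked_predictions, k=10):
    """Recall@k for a single next-item prediction: 1.0 if the held-out
    target appears among the top-k predicted item ids, else 0.0."""
    return 1.0 if target in ranked_predictions[:k] else 0.0

# Toy example: the held-out target item id is 42.
preds = [7, 42, 13, 99, 5]          # model's ranked predictions
print(recall_at_k(42, preds, k=3))  # 1.0 -- target is in the top 3
print(recall_at_k(42, preds, k=1))  # 0.0 -- target is not the top-1 item
```

The discrepancy reported here is that this score changes depending on whether the target item was visible to the model when the predictions were produced.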

Expected behavior

`recall_eval.recall` should return the same recall as `recall_simulated`.

Environment details

  • Transformers4Rec version: 23.6.0
  • Platform: Linux + Docker image (nvcr.io/nvidia/merlin/merlin-pytorch:23.06)
  • Python version: 3.8.10
  • Huggingface Transformers version: 4.12.0
  • PyTorch version (GPU?): torch==2.0.1, pytorch-lightning==2.0.4
  • Tensorflow version (GPU?): --

Additional context

Attached is a `masking.patch` file, which fixed the result discrepancy for me.

@korchi korchi added bug Something isn't working status/needs-triage labels Aug 8, 2023
@korchi korchi changed the title Evaluation are incorrect because it can see the label. [BUG] Evaluation is incorrect because it can see the label. [BUG] Aug 15, 2023
rnyak (Contributor) commented Aug 29, 2023

@korchi if you are truncating the target, you should use `trainer.predict()`, which uses the first n-1 inputs to predict the nth item. We do not mask anything when `.predict()` is used.

However, if you use `trainer.evaluate()`, we automatically mask the last item under the hood, so that we generate a prediction for the last item in the given input. So you don't need to truncate the input sequence if you use `.evaluate()`.
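The distinction described above can be sketched schematically. This is a toy mock, not the Transformers4Rec API: `toy_model`, `predict_next`, and `evaluate_last` are invented names standing in for the two code paths, to show that manual truncation plus `predict()` should see the same inputs that `evaluate()` uses after masking the last item internally.

```python
def predict_next(model_score, sequence):
    """Mock of predict(): score the next item from the full given input."""
    return model_score(sequence)

def evaluate_last(model_score, sequence):
    """Mock of evaluate(): mask the last item internally, predict it from
    the first n-1 items, and return (prediction, held-out target)."""
    inputs, target = sequence[:-1], sequence[-1]
    return model_score(inputs), target

# Trivial stand-in model: "predicts" the last seen item id plus one.
toy_model = lambda seq: seq[-1] + 1
seq = [3, 5, 8]

# Manual truncation + predict() ...
pred_manual = predict_next(toy_model, seq[:-1])
# ... should match what evaluate() does under the hood:
pred_eval, target = evaluate_last(toy_model, seq)
assert pred_manual == pred_eval  # both scored from [3, 5], never seeing 8
```

The bug report amounts to the claim that, before the patch, the `evaluate()` path still let the scoring step see the masked target.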
