This repository contains the TensorFlow code for our RecSys '23 paper "gSASRec: Reducing Overconfidence in Sequential Recommendation Trained with Negative Sampling".

Link to the paper: https://arxiv.org/pdf/2308.07192.pdf
If you use the code from this repository, please cite our work:

```
@inproceedings{petrov2023gsasrec,
  title={gSASRec: Reducing Overconfidence in Sequential Recommendation Trained with Negative Sampling},
  author={Petrov, Aleksandr Vladimirovich and Macdonald, Craig},
  booktitle={Proceedings of the 17th ACM Conference on Recommender Systems},
  pages={116--128},
  year={2023}
}
```
If you are looking for a PyTorch version of gSASRec, please use the official port: https://github.com/asash/gSASRec-pytorch/. The PyTorch version is independent of the aprec framework and may be easier to use outside of it.
To set up the environment, you can use the Dockerfile from the docker folder and build the image with the docker build command:

```
docker build . -t gsasrec
```

Alternatively, the Dockerfile can be read as step-by-step instructions for setting up the environment on your own machine.
Our code is based on the aprec framework from our recent reproducibility work, so you can use the original documentation to learn how to use the framework.
gSASRec is a SASRec-based sequential recommendation model that utilises more negatives per positive and the gBCE loss. gBCE applies a generalised sigmoid $\sigma_\beta(s) = \sigma(s)^\beta$ to the positive scores:

$$\mathcal{L}_{gBCE} = -\log\left(\sigma(s^{+})^{\beta}\right) - \sum_{i=1}^{k} \log\left(1 - \sigma(s^{-}_{i})\right)$$

where

$$\beta = \alpha \left( t \left(1 - \frac{1}{\alpha}\right) + \frac{1}{\alpha} \right), \qquad \alpha = \frac{k}{|I| - 1}$$

Here, $k$ is the number of negatives sampled per positive, $|I|$ is the number of items in the catalogue, and $t \in [0, 1]$ controls how strongly the loss corrects for negative sampling.

The additional hyperparameters in gSASRec (in addition to standard SASRec's hyperparameters) are the number of negatives per positive ($k$, vanilla_num_negatives in the configs) and the gBCE calibration parameter ($t$, vanilla_bce_t in the configs).
However, if you want fully calibrated probabilities (e.g., not just to sort items but to use these probabilities as an approximation, e.g., of CTR), you should set t = 1.
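To see how t interpolates between plain BCE and the fully calibrated setting, here is a small sketch (plain Python; the function name and example numbers are ours) that computes β the same way the loss code below does. With t = 0 we get β = 1 (standard BCE), and with t = 1 we get β = α:

```python
def gbce_beta(num_negatives: int, num_items: int, t: float) -> float:
    """Compute the gBCE power beta from the negative sampling rate alpha
    and the calibration parameter t."""
    alpha = num_negatives / (num_items - 1)
    return alpha * ((1 - 1 / alpha) * t + 1 / alpha)

# t = 0: beta = 1, so gBCE reduces to standard BCE.
print(gbce_beta(num_negatives=4, num_items=129, t=0.0))  # -> 1.0
# t = 1: beta = alpha (here 4/128), the fully calibrated setting.
print(gbce_beta(num_negatives=4, num_items=129, t=1.0))  # -> 0.03125
```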
We do not implement gBCE explicitly. Instead, we convert the positive scores and then use the standard BCE loss:

$$s^{+\prime} = \log\left(\frac{1}{\sigma(s^{+})^{-\beta} - 1}\right)$$

where $\sigma$ is the sigmoid function and $\beta$ is the gBCE power computed from $\alpha$ and $t$ in the code below. Standard BCE on the converted positive logit is equivalent to gBCE on the original one, because $\sigma(s^{+\prime}) = \sigma(s^{+})^{\beta}$.
Our SASRec code is based on the original SASRec code.
The most important part is the code that implements the gBCE loss function; you can re-use it in other projects:
```python
alpha = self.model_parameters.vanilla_num_negatives / (self.data_parameters.num_items - 1)
t = self.model_parameters.vanilla_bce_t
beta = alpha * ((1 - 1/alpha)*t + 1/alpha)
positive_logits = tf.cast(logits[:, :, 0:1], 'float64')  # use float64 to increase numerical stability
negative_logits = logits[:, :, 1:]
eps = 1e-10
positive_probs = tf.clip_by_value(tf.sigmoid(positive_logits), eps, 1 - eps)
positive_probs_adjusted = tf.clip_by_value(tf.math.pow(positive_probs, -beta), 1 + eps, tf.float64.max)
to_log = tf.clip_by_value(tf.math.divide(1.0, (positive_probs_adjusted - 1)), eps, tf.float64.max)
positive_logits_transformed = tf.math.log(to_log)
negative_logits = tf.cast(negative_logits, 'float64')
logits = tf.concat([positive_logits_transformed, negative_logits], -1)
```
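For reuse outside the framework, the same transformation can be packaged as a standalone function. The sketch below is a NumPy re-implementation of the snippet above (the function name and argument list are our own; as in the snippet, the positive score is assumed to sit at index 0 of the last axis):

```python
import numpy as np

def gbce_transform_logits(logits: np.ndarray, num_negatives: int,
                          num_items: int, t: float) -> np.ndarray:
    """Convert positive logits so that standard BCE on the result equals gBCE.

    logits: array of shape (..., 1 + num_negatives); index 0 of the last axis
    holds the positive score, the remaining entries hold sampled negatives.
    """
    alpha = num_negatives / (num_items - 1)
    beta = alpha * ((1 - 1 / alpha) * t + 1 / alpha)

    eps = 1e-10
    positive_logits = logits[..., 0:1].astype(np.float64)  # float64 for stability
    negative_logits = logits[..., 1:].astype(np.float64)

    positive_probs = np.clip(1 / (1 + np.exp(-positive_logits)), eps, 1 - eps)
    positive_probs_adjusted = np.clip(positive_probs ** -beta,
                                      1 + eps, np.finfo(np.float64).max)
    to_log = np.clip(1 / (positive_probs_adjusted - 1),
                     eps, np.finfo(np.float64).max)
    positive_logits_transformed = np.log(to_log)

    return np.concatenate([positive_logits_transformed, negative_logits], axis=-1)
```

With t = 0 we have β = 1, so the positive logits pass through (almost) unchanged and the loss reduces to standard sampled BCE.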
The code of our gSASRec model is located in the file recommenders/sequential/models/sasrec/sasrec.py
Note that when you use gBCE, the model may require some time to "kick off" training and improve on simple baselines such as popularity. If you observe this pattern, consider increasing the early-stopping patience; the model will eventually start learning. Alternatively, consider decreasing t in gBCE to make the task easier for the model.
(instructions copied from the original repo)

```
cd <your working directory>
cd aprec/evaluation
```

You need to run run_n_experiments.sh with an experiment configuration file. Here is how to do it with an example configuration:

```
sh run_n_experiments.sh configs/ML1M-bpr-example.py
```

To analyse the results of the latest experiment, run:

```
python3 analyze_experiment_in_progress.py
```
The config files for the experiments described in the paper are in the configs/gsasrec/ folder. To run the experiments, use:
MovieLens-1M:

```
sh run_n_experiments.sh configs/gsasrec/ml1m_benchmark.py
```

Steam:

```
sh run_n_experiments.sh configs/gsasrec/steam_benchmark.py
```

Gowalla:

```
sh run_n_experiments.sh configs/gsasrec/gowalla_benchmark.py
```