PyTorch 0.4 Implementation of Neural Chat: Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems. You can interact with the models here:
This repository is accompanied by Neural Chat Web to deploy a web server and host the models online.
This code is inspired by and built off of "A Hierarchical Latent Structure for Variational Conversation Modeling" (code, paper, presentation).
This section covers installing the required libraries and downloading the pre-trained models.
Install Python packages
pip install -r requirements.txt
Set up the Python path to include the repo
python develop
Follow the instructions here to install PyTorch (version 0.4.0), or run:
pip3 install torch===0.4.0 -f
For more information about InferSent module, see here.
Download GloVe [2.18GB] (V1) and the pre-trained InferSent models trained with GloVe.
mkdir inferSent/dataset/GloVe
curl -Lo inferSent/dataset/GloVe/
unzip inferSent/dataset/GloVe/ -d inferSent/dataset/GloVe/
curl -Lo inferSent/encoder/infersent1.pickle
You can instead download fastText [5.83GB] (V2) vectors and the corresponding InferSent model. We suggest using GloVe:
mkdir inferSent/dataset/fastText
curl -Lo inferSent/dataset/fastText/
unzip inferSent/dataset/fastText/ -d inferSent/dataset/fastText/
curl -Lo inferSent/encoder/infersent2.pickle
Note that infersent1 is trained with GloVe (which has been trained on text preprocessed with the PTB tokenizer) and infersent2 is trained with fastText (which has been trained on text preprocessed with the MOSES tokenizer). The latter also removes the zero padding from max-pooling, which was inconvenient when embedding sentences outside of their batches.
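To see why zero padding matters for max-pooling, here is a minimal pure-Python sketch. It is not the InferSent code; `max_pool`, `pad`, and the toy hidden states are hypothetical names and values chosen for illustration. It assumes padded time steps contribute all-zero vectors, so any feature whose real values are all negative gets overwritten by a padding zero.

```python
def max_pool(steps):
    """Element-wise maximum over a list of equal-length hidden-state vectors."""
    return [max(column) for column in zip(*steps)]


def pad(steps, length):
    """Append all-zero vectors until the sequence has `length` time steps."""
    dim = len(steps[0])
    return steps + [[0.0] * dim for _ in range(length - len(steps))]


# Toy hidden states for a two-word sentence (made-up values).
hidden = [[-0.5, 0.2], [-0.1, -0.3]]

alone = max_pool(hidden)               # sentence embedded on its own
in_batch = max_pool(pad(hidden, 4))    # same sentence padded to a longer batch

print(alone)     # [-0.1, 0.2]
print(in_batch)  # [0.0, 0.2]
```

The first feature changes from -0.1 to 0.0 only because of padding, so under this assumption the same sentence would get a different embedding depending on the batch it is placed in.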
For more information about TorchMoji module, see here.
Run the download script to download the pre-trained torchMoji weights [~85MB] from here and put them in ./torchMoji/model/
python torchMoji/scripts/
The following scripts will:
- Create a directory for each dataset.
- Download and preprocess conversation data inside each directory.
To download the pre-processed dataset [10.31GB], use:
python --dataset=reddit_casual --shortcut
Alternatively, if you'd like to download a smaller version [24.2MB], and do pre-processing steps on your end, use:
python --dataset=reddit_casual
--max_sentence_length (maximum number of words in sentence; default: 30)
--max_conversation_length (maximum turns of utterances in single conversation; default: 10)
--max_vocab_size (maximum size of word vocabulary; default: 20000)
--max_vocab_frequency (minimum frequency of word to be included in vocabulary; default: 5)
--n_workers (number of workers for multiprocessing; default: os.cpu_count())
To download the pre-processed dataset [3.61GB], use:
python --dataset=cornell --shortcut
Alternatively, if you'd like to download a smaller version [9.9MB], and do pre-processing steps on your end, use:
python --dataset=cornell
--max_sentence_length (maximum number of words in sentence; default: 30)
--max_conversation_length (maximum turns of utterances in single conversation; default: 10)
--max_vocab_size (maximum size of word vocabulary; default: 20000)
--max_vocab_frequency (minimum frequency of word to be included in vocabulary; default: 5)
--n_workers (number of workers for multiprocessing; default: os.cpu_count())
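As a rough illustration of what the length and vocabulary options above control, the following sketch truncates conversations and builds a frequency-filtered vocabulary. It is an assumption about the general technique, not the repository's preprocessing code: the function names and the `min_frequency` parameter are hypothetical, special tokens are ignored, and the real script may filter rather than truncate.

```python
from collections import Counter


def truncate_conversation(conversation, max_sentence_length=30,
                          max_conversation_length=10):
    """Keep at most max_conversation_length turns and max_sentence_length words per turn."""
    turns = conversation[:max_conversation_length]
    return [turn[:max_sentence_length] for turn in turns]


def build_vocab(conversations, max_vocab_size=20000, min_frequency=5):
    """Return up to max_vocab_size words that occur at least min_frequency times."""
    counts = Counter(word
                     for conversation in conversations
                     for turn in conversation
                     for word in turn)
    # most_common() orders by descending count; ties keep first-seen order.
    return [word for word, freq in counts.most_common(max_vocab_size)
            if freq >= min_frequency]
```

For example, with a toy conversation `[["a", "b", "c"], ["a", "b"], ["a"]]`, calling `build_vocab` with `max_vocab_size=2` and `min_frequency=2` keeps only `"a"` and `"b"`.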
Go to the model directory and set the save_dir in (this is where the model checkpoints will be saved).
By default, it will save a model checkpoint every epoch to <save_dir> and a tensorboard summary. For more arguments and options, see
Note that after training, you should only keep the single optimal checkpoint in the checkpoint directory for evaluation and interaction steps and remove the remaining checkpoints.
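One cautious way to prune a checkpoint directory is to list the files that would be removed before deleting anything. The helper below is a hypothetical sketch: the `.pkl` suffix and the `checkpoints_to_remove` name are assumptions for illustration, so adjust them to whatever your checkpoint files are actually called.

```python
from pathlib import Path


def checkpoints_to_remove(checkpoint_dir, keep_name, suffix=".pkl"):
    """List checkpoint files in checkpoint_dir other than keep_name.

    Nothing is deleted here; the caller decides what to do with the list.
    The ".pkl" suffix is an assumed naming convention.
    """
    directory = Path(checkpoint_dir)
    if not (directory / keep_name).is_file():
        raise FileNotFoundError(f"{keep_name} not found in {directory}")
    return sorted(p for p in directory.glob(f"*{suffix}") if p.name != keep_name)
```

After reviewing the returned list, each path can be deleted with `path.unlink()`.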
We provide our implementation of EI (Emotion+Infersent) models built upon implementations for VHCR, VHRED, and HRED.
To run training:
python --data=<data> --model=<model> --batch_size=<batch_size> [--emotion --infersent]
For example:
- Train HRED-Infersent-only on Cornell Movie:
python model/ --data=cornell --model=HRED --infersent --infersent_weight=25000 --infersent_embedding_size=128
- Train VHRED-Emotion-only on Reddit Casual Conversations:
python model/ --data=reddit_casual --model=VHRED --emotion --emo_weight=25 --emo_embedding_size=128
- Train VHCR-EI on Reddit Casual Conversations:
python model/ --data=reddit_casual --model=VHCR --emotion --infersent --emo_weight=25 --emo_embedding_size=128 --infersent_weight=100000 --infersent_embedding_size=4000
To evaluate the word perplexity:
python model/ --mode=<mode> --checkpoint=<path_to_your_checkpoint_directory>
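For reference, word perplexity is conventionally the exponential of the average negative log-likelihood per token. The sketch below shows that standard definition only; `word_perplexity` is a hypothetical helper, and it assumes you already have natural-log probabilities for each target token, which the evaluation script computes in its own way.

```python
import math


def word_perplexity(token_log_probs):
    """Perplexity = exp(-(sum of natural-log token probabilities) / number of tokens).

    token_log_probs is a list of sentences, each a list of per-token log-probabilities.
    """
    n_tokens = sum(len(sentence) for sentence in token_log_probs)
    if n_tokens == 0:
        raise ValueError("need at least one token")
    total_nll = -sum(lp for sentence in token_log_probs for lp in sentence)
    return math.exp(total_nll / n_tokens)
```

As a sanity check, a model that assigns probability 0.25 to every token has a perplexity of 4 (up to floating-point error).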
For embedding-based metrics, you need to download the Google News word vectors, unzip them, and put them under the datasets folder. Then run:
python model/ --mode=<mode> --checkpoint=<path_to_your_checkpoint_directory>
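One common embedding-based metric is the embedding average: each sentence is represented by the mean of its word vectors, and the response is scored by cosine similarity to the reference. The sketch below is a minimal pure-Python version under stated assumptions; the function names are hypothetical, `vectors` stands in for the loaded word vectors as a plain dict, and the repository's script may compute additional or different variants.

```python
import math


def sentence_embedding(words, vectors):
    """Average the vectors of the words that have an embedding; None if none do."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def embedding_average_score(response, reference, vectors):
    """Cosine similarity between the mean word vectors of two tokenized sentences."""
    a = sentence_embedding(response, vectors)
    b = sentence_embedding(reference, vectors)
    if a is None or b is None:
        return 0.0  # assumed convention when a sentence has no known words
    return cosine_similarity(a, b)
```

Returning 0.0 for sentences with no known words is a design choice made here for simplicity; other conventions, such as skipping the pair, are equally reasonable.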
To evaluate sentiment and semantics using the distance between torchMoji and InferSent inferred embeddings:
python model/ --mode=<mode> --checkpoint=<path_to_your_checkpoint_directory>
Use the following command to interact with / talk to a saved model checkpoint:
python model/ --debug --checkpoint=<path_to_your_checkpoint_directory>
This code is accompanied by another repository that implements the server portion of the Neural Chat project. Please refer to Neural Chat Web for details on how to deploy your chatbots live on the web.
If you use this code or the released Reddit dataset, please reference one of the following papers:
For batch reinforcement learning in dialog systems, refer to:
title={Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog},
author={Jaques, Natasha and Ghandeharioun, Asma and Shen, Judy and Ferguson, Craig and Jones, Noah and Lapedriza, Agata and Gu, Shixiang and Picard, Rosalind},
journal={arXiv preprint arXiv:},
For hierarchical reinforcement learning for open-domain dialog, refer to:
title={Hierarchical Reinforcement Learning for Open-Domain Dialog},
author={Saleh, Abdelrhman and Jaques, Natasha and Ghandeharioun, Asma and Shen, Judy and Picard, Rosalind},
journal={arXiv preprint arXiv:1909.07547},
For interactive evaluation, use of the Reddit dataset, or miscellaneous use cases, refer to the following paper:
title={Approximating Interactive Human Evaluation with Self-Play for Open-Domain Dialog Systems},
author={Ghandeharioun, Asma and Shen, Judy and Jaques, Natasha and Ferguson, Craig and Jones, Noah and Lapedriza, Agata and Picard, Rosalind},
journal={arXiv preprint arXiv:1906.09308},
- Park, Y., Cho, J., & Kim, G. (2018, June). A Hierarchical Latent Structure for Variational Conversation Modeling. In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers) (pp. 1792-1801).
- Serban, I. V., Sordoni, A., Lowe, R., Charlin, L., Pineau, J., Courville, A., & Bengio, Y. (2017, February). A hierarchical latent variable encoder-decoder model for generating dialogues. In Thirty-First AAAI Conference on Artificial Intelligence.
- Sordoni, A., Bengio, Y., Vahabi, H., Lioma, C., Grue Simonsen, J., & Nie, J. Y. (2015, October). A hierarchical recurrent encoder-decoder for generative context-aware query suggestion. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (pp. 553-562). ACM.