An AI-based tweet generator that automatically generates collections of tweets. It can replace the usernames and texts in real Twitter data provided by the user with generated fake usernames and texts. The tweets are generated using models learned from real tweets with the textgenrnn library.
- textgenrnn, which can be installed using pip install textgenrnn
- ujson, which can be installed using pip install ujson
- tensorflow, which can be installed using pip install tensorflow
- If other libraries are missing, install them using pip install <library>
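
Before running the scripts, it can help to confirm that the listed libraries are importable. The following is a minimal sketch and is not part of the repository; the function name missing_packages is hypothetical, and the check only looks at the three packages named above.

```python
import importlib.util

# Package names taken from the list above; for all three the import name
# is the same as the pip name.
REQUIRED = ["textgenrnn", "ujson", "tensorflow"]


def missing_packages(names):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Print an install hint for every package that was not found.
for name in missing_packages(REQUIRED):
    print(f"missing: {name} (try: pip install {name})")
```

An empty output means all three packages were found; it does not verify that their versions are compatible with each other.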
Users must provide the raw data downloaded from Twitter and unzip it to a JSON file. This JSON file is fed to both the trainer and the generator.
The downloaded Twitter data is in .gz format (e.g. Tweet_2019-07-15_08-17-17.gz). To unzip the data to a JSON file, use the command gunzip -c <filename>.gz > <filename>.json (e.g. gunzip -c Tweet_2019-07-15_08-17-17.gz > tweets.json).
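
On systems without gunzip, the same decompression step can be done from Python with the standard library. This is a sketch only; the function name gunzip_to_json is hypothetical and not part of the repository.

```python
import gzip
import shutil


def gunzip_to_json(gz_path, json_path):
    """Decompress gz_path into json_path, keeping the .gz file (like gunzip -c)."""
    with gzip.open(gz_path, "rb") as src, open(json_path, "wb") as dst:
        # Stream in chunks so large archives are never held fully in memory.
        shutil.copyfileobj(src, dst)
```

For the example file above, the call would be gunzip_to_json("Tweet_2019-07-15_08-17-17.gz", "tweets.json").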
Python version must be at least Python 3.6.
- IMPORTANT: create a subdirectory called weights in the same directory as the scripts for the pipeline to work.
- Run tweet_trainer.py to train a model using the textgenrnn library. It trains the model on a sample of size k taken from the JSON file containing the raw Twitter data provided by the user. It takes three parameters: -i -- the JSON file storing the original tweets (e.g. tweets.json), -k -- the training sample size, and -e -- the number of training epochs. An example is python3 tweet_trainer.py -i tweets.json -k 20000 -e 5. Please use python3 tweet_trainer.py --help for a detailed description of the script and its parameters.
- Run tweet_generator.py to replace tweets with fake usernames and texts. The fake usernames are generated by the helper module name_generator.py, and the fake texts are generated using the trained model. The script generates the tweets and writes them to another JSON file in batches of size k, which is specified by the user. It takes three parameters: -i -- the JSON file storing the original tweets (e.g. tweets.json), -o -- the JSON file to store the fake tweets (e.g. fake_tweets.json), and -k -- the batch size to generate. An example is python3 tweet_generator.py -i tweets.json -o fake_tweets.json -k 12000. Please use python3 tweet_generator.py --help for a detailed description of the script and its parameters.
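
The training step described above (parse -i, -k and -e, sample k tweets, train for e epochs) can be sketched as follows. This is not the code of tweet_trainer.py. It assumes the JSON file holds one tweet object per line with a "text" field, which this README does not confirm, and all function names are hypothetical. The textgenrnn calls used (textgenrnn() and train_on_texts with num_epochs) are from that library's public interface; where the real script stores its weights is not shown here.

```python
import argparse
import json
import random


def parse_args(argv=None):
    """Parse the three options listed above for the trainer."""
    parser = argparse.ArgumentParser(description="Train a tweet text model (sketch).")
    parser.add_argument("-i", required=True, help="JSON file storing the original tweets")
    parser.add_argument("-k", type=int, required=True, help="training sample size")
    parser.add_argument("-e", type=int, required=True, help="number of training epochs")
    return parser.parse_args(argv)


def load_texts(path):
    """Read tweet texts. ASSUMPTION: one JSON object per line with a "text" key."""
    texts = []
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            tweet = json.loads(line)
            if "text" in tweet:
                texts.append(tweet["text"])
    return texts


def sample_texts(texts, k, seed=None):
    """Draw up to k texts without replacement."""
    rng = random.Random(seed)
    return rng.sample(texts, min(k, len(texts)))


def train(texts, epochs):
    """Train textgenrnn on the sample (imported lazily; needs textgenrnn installed)."""
    from textgenrnn import textgenrnn
    model = textgenrnn()
    model.train_on_texts(texts, num_epochs=epochs)
    return model


def main(argv=None):
    """Wire the steps together: parse options, sample, train."""
    args = parse_args(argv)
    return train(sample_texts(load_texts(args.i), args.k), args.e)
```

Calling main(["-i", "tweets.json", "-k", "20000", "-e", "5"]) mirrors the example command above. The sample size is capped at the number of available tweets so that a small input file does not raise an error.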
- name_generator.py
- tweet_trainer.py
- tweet_generator.py
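
To illustrate how name_generator.py and tweet_generator.py fit together, here is a minimal sketch of batch-wise replacement. It is not the repository's code. The username scheme (random lowercase letters and digits), the "user" / "screen_name" field names, and all function names are assumptions made for illustration. The text source is passed in as a callable so the sketch runs without a trained model.

```python
import random
import string


def fake_username(rng, length=10):
    """Hypothetical stand-in for name_generator.py: a random handle."""
    alphabet = string.ascii_lowercase + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def batches(items, k):
    """Yield consecutive slices of at most k items."""
    for start in range(0, len(items), k):
        yield items[start:start + k]


def anonymize(tweets, k, make_texts, rng=None):
    """Return copies of tweets with the username and text replaced, batch by batch.

    make_texts(n) must return n strings. With a trained textgenrnn model it
    could be: lambda n: model.generate(n, return_as_list=True)
    ASSUMPTION: the username lives under tweet["user"]["screen_name"].
    """
    rng = rng or random.Random()
    result = []
    for batch in batches(tweets, k):
        texts = make_texts(len(batch))  # one generation call per batch
        for tweet, text in zip(batch, texts):
            fake = dict(tweet, text=text)  # shallow copy with new text
            fake["user"] = dict(tweet.get("user", {}), screen_name=fake_username(rng))
            result.append(fake)
    return result
```

All other fields of each tweet are copied through unchanged, and the input list is not modified. Generating texts once per batch rather than once per tweet keeps the number of model calls small.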