This repository holds a project for the Udacity Data Science Nanodegree. It is part of a course submission and accompanies the Medium article, which can be found here: https://medium.com/@lukasz.aszyk/this-is-what-5-years-of-elons-musk-tweets-looks-like-part-1-176a8279cefb
With this project I want to answer a few questions:
- What's Elon Musk's tweeting style?
- What tweeting pattern yields the highest engagement?
- What are the most popular words used in his tweets?
I hope you will join me in exploring Elon's tweets, using the Kaggle dataset I've chosen for the exercise.
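As a rough illustration of the third question, the sketch below counts the most frequent words in a list of strings using only the standard library. It is not the notebook's method: the `top_words` helper, the tiny `STOP_WORDS` set, and the sample strings are all assumptions made up for this example, and the notebook may clean and tokenize the text differently.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "and", "is", "to", "of", "in", "for", "on", "it"}


def top_words(texts, n=10):
    """Return the n most frequent non-stop-words across an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Remove URLs and @mentions before splitting into words.
        cleaned = re.sub(r"https?://\S+|@\w+", " ", text.lower())
        tokens = re.findall(r"[a-z']+", cleaned)
        counts.update(t for t in tokens if t not in STOP_WORDS and len(t) > 1)
    return counts.most_common(n)


if __name__ == "__main__":
    # Placeholder strings, not real tweets.
    sample = ["Rockets are cool", "Cool rockets land on the pad https://example.com"]
    print(top_words(sample, n=2))
```

To run this on the real data, pass the tweet text column of the dataset as `texts`; which column that is depends on the CSV and is not assumed here.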
- Elon Musk's tweets.ipynb - The Jupyter notebook
- elon_musk_origin.csv - The original source file with Elon Musk's tweets
|--Data-Science-Project-1
| |-- Elon Musk's tweets.ipynb
| |-- elon_musk_origin.csv
| |-- README.md
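Before opening the notebook, it can help to check which columns the source file contains. The sketch below is one way to do that with the standard library. The `peek_csv` helper is a hypothetical name, and UTF-8 encoding is an assumption that may need adjusting for the actual file.

```python
import csv
from itertools import islice
from pathlib import Path


def peek_csv(path, n_rows=3):
    """Return (column_names, first n_rows rows as dicts) for a CSV file."""
    # Assumption: the file is UTF-8 encoded with a header row.
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(islice(reader, n_rows))
        return reader.fieldnames, rows


if __name__ == "__main__":
    source = Path("elon_musk_origin.csv")  # file name taken from the listing above
    if source.exists():
        columns, rows = peek_csv(source)
        print(columns)
        for row in rows:
            print(row)
```

Run it from the repository folder so the relative path resolves; if the file is missing, the script prints nothing.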
Create the environment and clone the GitHub repository into the source folder.
Creating a dedicated Python environment is a must for a good Data Scientist. As a Software Developer, you should follow best practices, not only to write readable and maintainable code but also to build your credibility. Setting up an environment is beyond the scope of this README, so I will just point to the documentation and the Medium tutorial, which explain the process in detail.
Next, install Jupyter Notebook and all the necessary libraries, then open the Elon Musk's tweets.ipynb file for the analysis.
You can find the instructions here (https://jupyter.org/install.html).
Install necessary Python libraries:
alt==0.0.4
altair==4.1.0
altair-catplot==0.0.5
anyio==2.0.2
appdirs==1.4.4
appnope==0.1.2
argon2-cffi==20.1.0
async-generator==1.10
attrs==20.3.0
backcall==0.2.0
bleach==3.2.1
CacheControl==0.12.6
cachy==0.3.0
certifi==2020.6.20
cffi==1.14.4
chardet==4.0.0
cleo==0.8.1
click==7.1.2
clikit==0.6.2
crashtest==0.3.1
ddt==1.4.1
decorator==4.4.2
defusedxml==0.6.0
distlib==0.3.1
entrypoints==0.3
filelock==3.0.12
html5lib==1.1
idna==2.10
ipykernel==5.4.2
ipython==7.19.0
ipython-genutils==0.2.0
jedi==0.17.2
Jinja2==2.11.2
joblib==1.0.0
json5==0.9.5
jsonschema==3.2.0
jupyter-client==6.1.7
jupyter-core==4.7.0
jupyter-server==1.1.1
jupyterlab==2.2.9
jupyterlab-pygments==0.1.2
jupyterlab-server==1.2.0
keyring==21.8.0
lockfile==0.12.2
MarkupSafe==1.1.1
mistune==0.8.4
msgpack==1.0.2
nbclient==0.5.1
nbconvert==6.0.7
nbformat==5.0.8
nest-asyncio==1.4.3
nltk==3.5
notebook==6.1.5
numpy==1.19.4
packaging==20.8
pandas==1.2.0
pandocfilters==1.4.3
parso==0.7.1
pastel==0.2.1
patsy==0.5.1
pexpect==4.8.0
pickleshare==0.7.5
pipenv==2020.8.13
pkginfo==1.6.1
plotly==4.14.1
poetry==1.1.4
poetry-core==1.0.0
prometheus-client==0.9.0
prompt-toolkit==3.0.8
ptyprocess==0.6.0
pycparser==2.20
Pygments==2.7.3
pylev==1.3.0
pyparsing==2.4.7
pyrsistent==0.17.3
python-dateutil==2.8.1
pytz==2020.5
PyYAML==5.3.1
pyzmq==20.0.0
regex==2020.11.13
requests==2.25.1
requests-toolbelt==0.9.1
retrying==1.3.3
scikit-learn==0.24.0
scipy==1.5.4
Send2Trash==1.5.0
shellingham==1.3.2
six==1.15.0
sklearn==0.0
sniffio==1.2.0
statsmodels==0.12.1
terminado==0.9.1
testfixtures==6.17.0
testpath==0.4.4
threadpoolctl==2.1.0
tomlkit==0.7.0
toolz==0.11.1
tornado==6.1
tqdm==4.55.0
traitlets==5.0.5
urllib3==1.26.2
virtualenv==20.0.35
virtualenv-clone==0.5.4
voila==0.2.4
wcwidth==0.2.5
webencodings==0.5.1
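After installing, the pinned versions above can be compared with what is actually present in the active environment. The sketch below uses `importlib.metadata` (Python 3.8+) for that. The helper names are hypothetical, and it assumes the list above has been saved to a file called `requirements.txt`, which is not part of the repository listing.

```python
from importlib.metadata import PackageNotFoundError, version
from pathlib import Path


def parse_pins(lines):
    """Turn 'name==version' lines into a {name: version} dict, skipping other lines."""
    pins = {}
    for line in lines:
        line = line.strip()
        if "==" not in line or line.startswith("#"):
            continue
        name, _, pinned = line.partition("==")
        pins[name.strip()] = pinned.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (pinned, installed_or_None)} for packages that differ from the pins."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches


if __name__ == "__main__":
    pins_file = Path("requirements.txt")  # assumed file name, see note above
    if pins_file.exists():
        pins = parse_pins(pins_file.read_text(encoding="utf-8").splitlines())
        for name, (pinned, installed) in find_mismatches(pins).items():
            print(f"{name}: pinned {pinned}, installed {installed}")
```

An empty output means every pinned package is installed at the listed version.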
The dataset is sufficient to answer the questions posed above. The answers can be found in the attached notebook and in the Medium article.
Open license
Lukasz Aszyk
Thank you, Kaggle, for the data (https://www.kaggle.com/vidyapb/elon-musk-tweets-2015-to-2020)
Thank you, Elon Musk, for the tweets