fix: Change how data is downloaded for Bitcoin tutorial #391

Merged
merged 4 commits into from
Jun 11, 2024
1 change: 0 additions & 1 deletion environment.yml
@@ -22,4 +22,3 @@ dependencies:
- fire
- tabulate
- tenacity
- cryptocmd
236 changes: 22 additions & 214 deletions nbs/docs/use-cases/2_bitcoin_price_prediction.ipynb
@@ -118,46 +118,7 @@
"source": [
"Bitcoin (₿) is the first decentralized digital currency and is one of the most popular cryptocurrencies. Transactions are managed and recorded on a public ledger known as the blockchain. Bitcoins are created as a reward for mining, a process that involves solving complex cryptographic tasks to verify transactions. This digital currency can be used as payment for goods and services, traded for other currencies, or held as a store of value.\n",
"\n",
"In this tutorial, we will first download the historical Bitcoin price data with `cryptocmd`, a Python package for downloading data from [CoinMarketCap](https://coinmarketcap.com/). To start, we need to define a `scraper`, selecting our cryptocurrency of interest and the start and end dates in format dd-mm-yyyy."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"::: {.callout-note}\n",
"You can install `cryptocmd` with `pip`:\n",
" \n",
"```python\n",
"pip install cryptocmd\n",
"```\n",
":::"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd \n",
"from cryptocmd import CmcScraper"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"scraper = CmcScraper('BTC', '01-01-2020', '31-12-2023')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next we create a `pandas` DataFrame with the data. Note that it is important to sort the data by date in ascending order. "
"In this tutorial, we will first download the historical Bitcoin price data in USD as a `pandas` DataFrame. "
]
},
{
@@ -187,121 +148,46 @@
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Date</th>\n",
" <th>Open</th>\n",
" <th>High</th>\n",
" <th>Low</th>\n",
" <th>Close</th>\n",
" <th>Volume</th>\n",
" <th>Market Cap</th>\n",
" <th>Time Open</th>\n",
" <th>Time High</th>\n",
" <th>Time Low</th>\n",
" <th>Time Close</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1460</th>\n",
" <th>0</th>\n",
" <td>2020-01-01</td>\n",
" <td>7194.891971</td>\n",
" <td>7254.330611</td>\n",
" <td>7174.944153</td>\n",
" <td>7200.174393</td>\n",
" <td>1.856566e+10</td>\n",
" <td>1.305808e+11</td>\n",
" <td>2020-01-01T00:00:00.000Z</td>\n",
" <td>2020-01-01T15:42:01.000Z</td>\n",
" <td>2020-01-01T01:06:01.000Z</td>\n",
" <td>2020-01-01T23:59:59.999Z</td>\n",
" <td>7200.174316</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1459</th>\n",
" <th>1</th>\n",
" <td>2020-01-02</td>\n",
" <td>7202.551122</td>\n",
" <td>7212.155253</td>\n",
" <td>6935.269972</td>\n",
" <td>6985.470001</td>\n",
" <td>2.080208e+10</td>\n",
" <td>1.266994e+11</td>\n",
" <td>2020-01-02T00:00:00.000Z</td>\n",
" <td>2020-01-02T01:30:00.000Z</td>\n",
" <td>2020-01-02T23:02:01.000Z</td>\n",
" <td>2020-01-02T23:59:59.999Z</td>\n",
" <td>6985.470215</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1458</th>\n",
" <th>2</th>\n",
" <td>2020-01-03</td>\n",
" <td>6984.428612</td>\n",
" <td>7413.715099</td>\n",
" <td>6914.995908</td>\n",
" <td>7344.884183</td>\n",
" <td>2.811148e+10</td>\n",
" <td>1.332334e+11</td>\n",
" <td>2020-01-03T00:00:00.000Z</td>\n",
" <td>2020-01-03T17:04:00.000Z</td>\n",
" <td>2020-01-03T02:10:01.000Z</td>\n",
" <td>2020-01-03T23:59:59.999Z</td>\n",
" <td>7344.884277</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1457</th>\n",
" <th>3</th>\n",
" <td>2020-01-04</td>\n",
" <td>7345.375275</td>\n",
" <td>7427.385794</td>\n",
" <td>7309.514012</td>\n",
" <td>7410.656566</td>\n",
" <td>1.844427e+10</td>\n",
" <td>1.344425e+11</td>\n",
" <td>2020-01-04T00:00:00.000Z</td>\n",
" <td>2020-01-04T18:44:02.000Z</td>\n",
" <td>2020-01-04T00:39:02.000Z</td>\n",
" <td>2020-01-04T23:59:59.999Z</td>\n",
" <td>7410.656738</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1456</th>\n",
" <th>4</th>\n",
" <td>2020-01-05</td>\n",
" <td>7410.451694</td>\n",
" <td>7544.496872</td>\n",
" <td>7400.535561</td>\n",
" <td>7411.317327</td>\n",
" <td>1.972507e+10</td>\n",
" <td>1.344695e+11</td>\n",
" <td>2020-01-05T00:00:00.000Z</td>\n",
" <td>2020-01-05T18:57:00.000Z</td>\n",
" <td>2020-01-05T23:18:00.000Z</td>\n",
" <td>2020-01-05T23:59:59.999Z</td>\n",
" <td>7411.317383</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Date Open High Low Close \\\n",
"1460 2020-01-01 7194.891971 7254.330611 7174.944153 7200.174393 \n",
"1459 2020-01-02 7202.551122 7212.155253 6935.269972 6985.470001 \n",
"1458 2020-01-03 6984.428612 7413.715099 6914.995908 7344.884183 \n",
"1457 2020-01-04 7345.375275 7427.385794 7309.514012 7410.656566 \n",
"1456 2020-01-05 7410.451694 7544.496872 7400.535561 7411.317327 \n",
"\n",
" Volume Market Cap Time Open \\\n",
"1460 1.856566e+10 1.305808e+11 2020-01-01T00:00:00.000Z \n",
"1459 2.080208e+10 1.266994e+11 2020-01-02T00:00:00.000Z \n",
"1458 2.811148e+10 1.332334e+11 2020-01-03T00:00:00.000Z \n",
"1457 1.844427e+10 1.344425e+11 2020-01-04T00:00:00.000Z \n",
"1456 1.972507e+10 1.344695e+11 2020-01-05T00:00:00.000Z \n",
"\n",
" Time High Time Low \\\n",
"1460 2020-01-01T15:42:01.000Z 2020-01-01T01:06:01.000Z \n",
"1459 2020-01-02T01:30:00.000Z 2020-01-02T23:02:01.000Z \n",
"1458 2020-01-03T17:04:00.000Z 2020-01-03T02:10:01.000Z \n",
"1457 2020-01-04T18:44:02.000Z 2020-01-04T00:39:02.000Z \n",
"1456 2020-01-05T18:57:00.000Z 2020-01-05T23:18:00.000Z \n",
"\n",
" Time Close \n",
"1460 2020-01-01T23:59:59.999Z \n",
"1459 2020-01-02T23:59:59.999Z \n",
"1458 2020-01-03T23:59:59.999Z \n",
"1457 2020-01-04T23:59:59.999Z \n",
"1456 2020-01-05T23:59:59.999Z "
" Date Close\n",
"0 2020-01-01 7200.174316\n",
"1 2020-01-02 6985.470215\n",
"2 2020-01-03 7344.884277\n",
"3 2020-01-04 7410.656738\n",
"4 2020-01-05 7411.317383"
]
},
"execution_count": null,
@@ -310,95 +196,17 @@
}
],
"source": [
"df = scraper.get_dataframe()\n",
"df = df.sort_values('Date', ascending=True)\n",
"import pandas as pd \n",
"\n",
"df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/bitcoin_price_usd.csv', sep=',') \n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `scraper` provides different details regarding the price of Bitcoin. Here, we will use the `Close` column as our target variable, although any other column could also be used. It's important to note that unlike traditional financial assets, Bitcoin trades 24/7. Therefore, the closing price represents the price of Bitcoin at a specific time each day, rather than at the end of a trading day."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Date</th>\n",
" <th>Close</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>1460</th>\n",
" <td>2020-01-01</td>\n",
" <td>7200.174393</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1459</th>\n",
" <td>2020-01-02</td>\n",
" <td>6985.470001</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1458</th>\n",
" <td>2020-01-03</td>\n",
" <td>7344.884183</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1457</th>\n",
" <td>2020-01-04</td>\n",
" <td>7410.656566</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1456</th>\n",
" <td>2020-01-05</td>\n",
" <td>7411.317327</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Date Close\n",
"1460 2020-01-01 7200.174393\n",
"1459 2020-01-02 6985.470001\n",
"1458 2020-01-03 7344.884183\n",
"1457 2020-01-04 7410.656566\n",
"1456 2020-01-05 7411.317327"
]
},
"execution_count": null,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = df[['Date', 'Close']]\n",
"df.head()"
"This dataset contains the closing price of Bitcoin in USD from 2020-01-01 to 2023-12-31. It's important to note that unlike traditional financial assets, Bitcoin trades 24/7. Therefore, the closing price represents the price of Bitcoin at a specific time each day, rather than at the end of a trading day."
]
},
{
@@ -1048,7 +856,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"As stated in the introduction, predicting the future prices of financial assets is a challenging task, especially for assets like Bitcoin. However, for those who need or want to forecast these assets, `TimeGPT` can be a powerful tool that simplifies the forecasting process. With just a couple of lines of code, `TimeGPT` can help you: \n",
"As stated in the introduction, predicting the future prices of financial assets is a challenging task, especially for assets like Bitcoin. The predictions in this tutorial seem very accurate, because we are doing historical forecasting. The real challenge is forecasting the price of Bitcoin for the upcoming days, not its historical price. For those who need or want to try to forecast these assets, `TimeGPT` can be an option that simplifies the forecasting process. With just a couple of lines of code, `TimeGPT` can help you: \n",
"\n",
"- Produce point forecasts \n",
"- Quantify the uncertainty of your predictions \n",
1 change: 1 addition & 0 deletions setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@
"neuralforecast",
"hierarchicalforecast",
"jupyterlab",
"setuptools<70",
]
distributed = ["dask[dataframe]", "fugue[ray]>=0.8.7", "pyspark", "ray[serve-grpc]"]
plotting = ["utilsforecast[plotting]>=0.1.7"]
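
Note on the notebook change: the old `cryptocmd` cells sorted the data by date explicitly, while the new cell relies on the hosted CSV already being in ascending order. A minimal sketch of a defensive load is below, assuming the file keeps the `Date` and `Close` columns shown in the notebook output. `load_prices` is a hypothetical helper, not part of this PR, and the inline sample reuses the five rows from the notebook output so the sketch runs without network access.

```python
import io

import pandas as pd

# First five rows, copied from the notebook output in this PR.
SAMPLE_CSV = """Date,Close
2020-01-01,7200.174316
2020-01-02,6985.470215
2020-01-03,7344.884277
2020-01-04,7410.656738
2020-01-05,7411.317383
"""


def load_prices(source):
    """Read a Date/Close CSV and return it sorted by date (hypothetical helper)."""
    # Parse the Date column as datetimes instead of leaving it as strings.
    df = pd.read_csv(source, parse_dates=["Date"])
    # Sort ascending and rebuild the index so rows are numbered 0..n-1.
    df = df.sort_values("Date").reset_index(drop=True)
    # Fail early on duplicate days.
    if df["Date"].duplicated().any():
        raise ValueError("duplicate dates found")
    # Fail early on missing days: a complete daily range has one row per day.
    expected = pd.date_range(df["Date"].iloc[0], df["Date"].iloc[-1], freq="D")
    if len(expected) != len(df):
        raise ValueError("missing days found")
    return df


df = load_prices(io.StringIO(SAMPLE_CSV))
print(df.head())
```

In the notebook itself, the same helper could be given the raw GitHub URL in place of the `io.StringIO` sample, since `pd.read_csv` accepts both.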