Cryptocurrencies

Overview

Accountability Accounting, a prominent investment bank, is interested in offering a new cryptocurrency investment portfolio to its customers. This analysis produces a report on which cryptocurrencies are on the trading market and how they could be grouped to create a classification system for the new investment. The data must first be processed to fit the machine learning models. Because there is no known output to predict, unsupervised learning is used: a clustering algorithm groups the cryptocurrencies, and data visualizations share the findings.

Results

Steps for analysis:

Preprocessing the Data for PCA
Reducing Data Dimensions Using PCA
Clustering Cryptocurrencies Using K-means
Visualizing Cryptocurrencies Results

Preprocessing the data

Read data into a DataFrame
Drop the "IsTrading" column
Remove the rows that have at least one null value
Create a new DataFrame that only holds the names of the cryptocurrencies
Use the get_dummies() method to create variables for the two text features, "Algorithm" and "ProofType", and store the results in a new DataFrame

# Use get_dummies() to create variables for text features.
X = pd.get_dummies(crypto_df, columns=['Algorithm', 'ProofType'])
X.head()

Then, use the StandardScaler fit_transform() function to standardize the features from the new DataFrame

# Standardize the data with StandardScaler().
crypto_scaled = StandardScaler().fit_transform(X)
print(crypto_scaled[0:5])
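The preprocessing steps above can be sketched end to end on a tiny table. This is a minimal illustration, not the notebook's actual code: the rows in `raw_df` are made up, and it assumes the "CoinName" column is set aside before encoding (the text does not say so explicitly, but the scaler cannot handle raw strings). The `dtype=float` argument is also an addition here, used so the dummy columns are plainly numeric.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for crypto_data.csv; the real file has many more rows.
raw_df = pd.DataFrame(
    {
        "CoinName": ["CoinA", "CoinB", "CoinC", "CoinD"],
        "Algorithm": ["SHA-256", "Scrypt", "SHA-256", "X11"],
        "IsTrading": [True, True, True, True],
        "ProofType": ["PoW", "PoW/PoS", "PoW", None],
        "TotalCoinsMined": [100.0, 250.0, 50.0, 10.0],
        "TotalCoinSupply": [210.0, 500.0, 80.0, 40.0],
    },
    index=["A", "B", "C", "D"],
)

# Drop the "IsTrading" column, then any row with at least one null value.
crypto_df = raw_df.drop(columns=["IsTrading"]).dropna()

# Keep the coin names in their own DataFrame (assumption: they are then
# removed from the feature table before encoding).
coin_names_df = crypto_df[["CoinName"]]
features_df = crypto_df.drop(columns=["CoinName"])

# Encode the two text features as numeric columns and standardize everything.
X = pd.get_dummies(features_df, columns=["Algorithm", "ProofType"], dtype=float)
crypto_scaled = StandardScaler().fit_transform(X)
print(crypto_scaled.shape)
```

Keeping the original index on every intermediate DataFrame matters later, because the principal components are attached back to the coins by index.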

Reduce the data dimensions using PCA (Principal Component Analysis) algorithm

Apply PCA to reduce the dimensions to three principal components

# Using PCA to reduce dimension to three principal components.
pca = PCA(n_components=3)
crypto_pca = pca.fit_transform(crypto_scaled)
crypto_pca

Create a new DataFrame that uses the same index as the previous DataFrame, with columns named "pc1", "pc2", and "pc3"

# Create a DataFrame with the three principal components.
pcs_df = pd.DataFrame(data=crypto_pca, columns=['pc1', 'pc2', 'pc3'], index=crypto_df.index)
pcs_df.head(10)
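One check that can accompany this step is how much of the original variance the three components keep. The sketch below is an illustration only: `scaled_stand_in` is random data standing in for `crypto_scaled`, so the printed percentage says nothing about the real dataset.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in for the standardized feature matrix (50 rows, 8 features).
rng = np.random.default_rng(0)
scaled_stand_in = rng.normal(size=(50, 8))

# Reduce to three principal components, as in the step above.
pca = PCA(n_components=3)
components = pca.fit_transform(scaled_stand_in)

# explained_variance_ratio_ holds the share of variance captured per component.
retained = pca.explained_variance_ratio_.sum()
print(components.shape)
print(f"Variance retained by three components: {retained:.1%}")
```

A low retained share would suggest that the three-dimensional view discards much of the structure in the encoded features, which is worth knowing before interpreting the clusters.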

Clustering Cryptocurrencies Using K-means

Using the previous DataFrame, create an elbow curve using hvPlot and a for loop to find the best value for K
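The elbow-curve loop is not shown in this write-up, so here is a minimal sketch of one way it might look. The data in `pcs_stand_in` is random and hypothetical, the range of K values (1 to 10) is an assumption, and the hvPlot call is left as a comment so the block runs without a plotting backend.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical stand-in for pcs_df: random points in the three-component space.
rng = np.random.default_rng(1)
pcs_stand_in = pd.DataFrame(rng.normal(size=(60, 3)), columns=["pc1", "pc2", "pc3"])

# Fit K-means for each candidate K and record the inertia
# (sum of squared distances from each point to its cluster center).
k_values = list(range(1, 11))
inertia = []
for k in k_values:
    km = KMeans(n_clusters=k, n_init=10, random_state=0)
    km.fit(pcs_stand_in)
    inertia.append(km.inertia_)

elbow_df = pd.DataFrame({"k": k_values, "inertia": inertia})
print(elbow_df)

# With hvPlot installed, the curve could be drawn with:
# import hvplot.pandas
# elbow_df.hvplot.line(x="k", y="inertia", xticks=k_values, title="Elbow Curve")
```

The "elbow" is the K after which inertia stops dropping sharply; adding more clusters beyond that point buys little.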

Run the K-means algorithm to make predictions of the K clusters for the cryptocurrencies’ data

# Initialize the K-Means model.
model = KMeans(n_clusters=4, random_state=0)
# Fit the model
model.fit(pcs_df)
# Predict clusters
predictions = model.predict(pcs_df)
predictions

Create a new DataFrame by concatenating the crypto_df and pcs_df DataFrames along the column axis, so that rows are matched on their shared index

# Create a new DataFrame including predicted clusters and cryptocurrencies features.
# Concatenate the crypto_df and pcs_df DataFrames column-wise (axis=1), aligned on the index.
clustered_df = pd.concat([crypto_df, pcs_df], axis=1)

Add another column named "Class" that will hold the predictions
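The write-up does not show the code for the "Class" column, so the following is a sketch under stated assumptions. The two small tables are hypothetical stand-ins for `crypto_df` and `pcs_df`, and two clusters are used only because the toy data has two obvious groups.

```python
import pandas as pd
from sklearn.cluster import KMeans

index = ["A", "B", "C", "D", "E", "F"]

# Hypothetical stand-in for crypto_df (feature columns only).
crypto_stand_in = pd.DataFrame(
    {
        "Algorithm": ["SHA-256", "SHA-256", "Scrypt", "X11", "X11", "Scrypt"],
        "TotalCoinsMined": [100.0, 120.0, 90.0, 5000.0, 5200.0, 4800.0],
    },
    index=index,
)

# Hypothetical stand-in for pcs_df: two well-separated groups of points.
pcs_stand_in = pd.DataFrame(
    [
        [0.0, 0.0, 0.0],
        [0.1, 0.0, 0.0],
        [0.0, 0.1, 0.0],
        [5.0, 5.0, 5.0],
        [5.1, 5.0, 5.0],
        [5.0, 5.1, 5.0],
    ],
    columns=["pc1", "pc2", "pc3"],
    index=index,
)

model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pcs_stand_in)

# Join the two tables side by side, then attach the cluster label per row.
clustered_df = pd.concat([crypto_stand_in, pcs_stand_in], axis=1)
clustered_df["Class"] = model.labels_
print(clustered_df)
```

Assigning `model.labels_` positionally is safe here because the concatenated table keeps the same row order as the data the model was fitted on.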

Visualizing Cryptocurrencies Results

Create a 3D scatter plot using the Plotly Express scatter_3d() function to plot the clusters from the clustered_df DataFrame along the three principal components. Add the CoinName and Algorithm columns to the hover_name and hover_data parameters, respectively, so each data point shows the CoinName and Algorithm on hover.

Create an hvPlot scatter plot with x="TotalCoinsMined", y="TotalCoinSupply", and by="Class", and have it show the CoinName when you hover over a data point.

Summary

Cryptocurrencies are increasing in popularity and complexity, and the ability to understand them and market them to clients will be key to any financial institution's growth. As more people look to invest in crypto, knowing which currencies are on the market and which ones would benefit a specific client will put an institution in a strong position to become an industry leader.
