-
Retail
-
Electronics
-
Sales
-
Python
-
Power BI
-
Tableau
- Google Colab
-
Pandas
-
Matplotlib
In this repository, I will run a Sales Analysis and Visualization in Python on 200,000 sales data points to come up answers to these Revenue questions.
This project was used for my training when working as a Data Analyst Intern at Center of Talent in AI using Python, Power BI, Tableau in 2020.
12 CSV files for 12 months of Sales data in 2019 with the same columns as per my Github's folder
Order ID | Product | Quantity Ordered | Price Each | Order Date | Purchase Address |
---|
-
Python
-
Tableau
-
Power BI
- Visualized & compared results between charts in Tableau & Power BI to determine that the variables which caused the highest Sales Value as per hypotheses:
-
December at US$ 4.6 million due to holiday season with more promotions & newly launched products by electronics corporations in Quarter 4,
-
San Francisco at US$ 8.2 million as Silicon Valley is based there having a density of engineers in San Francisco with high income & spending on hi-tech products,
-
Peak hours placing orders at 11 AM & 12 PM (during lunch break) & at 7 PM after work,
-
Top sold products: iPhone, Lightning Charging Cable by 882 items ordered at the same time within the same order (suggested strategy: selling a combo of best sellers & less common items),
-
The correlation between Prices & Volumes as AA & AAA Batteries had the highest volumes ordered while other pricey products had both low.
- Drop NaN values from DataFrame
- Removing rows based on a condition
- Change the type of columns (to_numeric, to_datetime, astype)
1/ What was the best month for Sales? How much was earned on that month?
2/ Which city had the best Sales of the year?
3/ When was the most efficient time to display ads to maximize/optimize the possibility of customers' decisions to buy products?
4/ Which products were most sold together?
5/ Which product was sold the most? Why do you think it was the best-selling product?
- Concatenating multiple csvs together to create a new DataFrame (pd.concat)
- Adding columns
- Parsing cells as strings to make new columns (.str)
- Using the .transform() method
- Using groupby to perform aggregate analysis
- Plotting bar charts and lines graphs to visualize results
- Labeling graphs
December had the highest Sales Value at US$ 4,613,443.5. To dive deeper into why December had the highest Sales Value. We can analyze further by the following hypotheses:
- Holiday season (for example Christmas, New Year, etc.) might affect and have a positive correlation with the highest sales of the year, namely in December.
- Electronics corporations launch new products in Quarter 4 so people tend to spend more in December.
San Francisco had the highest Sales at US$ 8,262,204. This can be explained by some possible reasons:
- Silicon Valley is based in San Francisco.
- There is a density of engineers in San Franciso with high income. Therefore, they tend to spend more in this city, especially on hi-tech products.
The peak of placing Orders was at 11 AM and 12 PM (during lunch break), and at 7 PM, when people got home after work. Therefore, running an ad within 30-60 minutes before these two periods can be recommended to get the most viewers.
The top-selling products were iPhone, Lightning Charging Cable by 882 items sold at the same time within the same order. Companies can sell their products in a combo to push sales for both best sellers and the less common items. Or they can place promotion on a combo while keeping the regular price of their single purchased product to raise demand for buying a combo rather than a single item.
There are seasonal fluctuations which show constant drops from Apr to Aug for almost all products in general. Another lowest decline happened in December after a peak in November. The reasons might be because customers tend to spend for products on or before Thanksgiving in October/November and Chirstmans Sales in December. Then we could spend less on products in December but other priorities such as clothing, food and berage.
A suggestion might be that we can offer some seasonal rapid promotions for some products in summer so customers can spend for some products like batteries, dryers and headphones to solve the low quantities sold during those months.
Seasonality refers to a pattern of fluctuations above or below some average level that repeats at regular intervals. These “seasons” are viewed as short term patterns.
There is a correlation between the Products' Prices and their Volumes. The AA and AAA Batteries had the highest Volumes Ordered while other pricey products had both low quantities and revenues. Easily-replaceable and affordableproducts like batteries, cables, headphones had more Quantities Sold than other long-used and expensive products like Smartphones, TVs, Laptops, Monitors and Dryers.
As I do not have other data of other factors or criteria such as brand quality to research and analyze further, so far that is the sum-up of our conclusion.