-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathCommuterChallenge
126 lines (107 loc) · 4.82 KB
/
CommuterChallenge
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
1. Trend Analysis Over Time
Track the ridership and traffic data to see how various transportation modes have recovered over time relative to pre-pandemic levels.
SELECT
Date,
AVG(`Subways: % of Comparable Pre-Pandemic Day`) AS avg_subways_percentage,
AVG(`Buses: % of Comparable Pre-Pandemic Day`) AS avg_buses_percentage,
AVG(`Bridges and Tunnels: % of Comparable Pre-Pandemic Day`) AS avg_bridges_tunnels_percentage
FROM
omgitsbeesdata.DailyRidership.Ridership
GROUP BY
Date
ORDER BY
Date;
2. Identify Peak Usage Days
Find the days with the highest estimated ridership for each transportation mode. This can be useful to understand peak days and periods.
SELECT
Date,
MAX(`Subways: Total Estimated Ridership`) AS max_subways_ridership,
MAX(`Buses: Total Estimated Ridership`) AS max_buses_ridership,
MAX(`Bridges and Tunnels: Total Traffic`) AS max_bridges_traffic
FROM
omgitsbeesdata.DailyRidership.Ridership
GROUP BY
Date
ORDER BY
max_subways_ridership DESC
LIMIT 10;
3. Pre-Pandemic Recovery Comparison
Compare the recovery percentages across different transportation modes to see which modes are closest to pre-pandemic usage levels.
SELECT
Date,
`Subways: % of Comparable Pre-Pandemic Day` AS subway_recovery,
`Buses: % of Comparable Pre-Pandemic Day` AS buses_recovery,
`LIRR: % of Comparable Pre-Pandemic Day` AS lirr_recovery,
`Metro-North: % of Comparable Pre-Pandemic Day` AS metro_north_recovery,
`Staten Island Railway: % of Comparable Pre-Pandemic Day` AS staten_island_recovery
FROM
omgitsbeesdata.DailyRidership.Ridership
WHERE
Date BETWEEN '2023-01-01' AND '2023-12-31'
ORDER BY
Date;
4. Calculate Monthly Averages for Ridership
Aggregate ridership totals by month for each mode to observe monthly trends.
SELECT
EXTRACT(YEAR FROM Date) AS year,
EXTRACT(MONTH FROM Date) AS month,
AVG(`Subways: Total Estimated Ridership`) AS avg_subway_ridership,
AVG(`Buses: Total Estimated Ridership`) AS avg_buses_ridership,
AVG(`LIRR: Total Estimated Ridership`) AS avg_lirr_ridership,
AVG(`Metro-North: Total Estimated Ridership`) AS avg_metro_north_ridership,
AVG(`Staten Island Railway: Total Estimated Ridership`) AS avg_staten_island_ridership
FROM
omgitsbeesdata.DailyRidership.Ridership
GROUP BY
year, month
ORDER BY
year, month;
5. Analyze Correlations Between Transportation Modes
Check if there’s any correlation between ridership trends on different transportation types to identify possible relationships (e.g., if subway ridership increases, does bus ridership increase as well?).
SELECT
`Subways: Total Estimated Ridership` AS subway_ridership,
`Buses: Total Estimated Ridership` AS bus_ridership,
`Bridges and Tunnels: Total Traffic` AS bridges_tunnels_traffic,
`LIRR: Total Estimated Ridership` AS lirr_ridership,
`Metro-North: Total Estimated Ridership` AS metro_north_ridership,
`Staten Island Railway: Total Estimated Ridership` AS staten_island_ridership
FROM
omgitsbeesdata.DailyRidership.Ridership;
6. Identify Days of the Week with Highest Ridership
This could help in understanding commuter behavior and weekly patterns for transportation modes.
SELECT
EXTRACT(DAYOFWEEK FROM Date) AS day_of_week,
AVG(`Subways: Total Estimated Ridership`) AS avg_subway_ridership,
AVG(`Buses: Total Estimated Ridership`) AS avg_bus_ridership,
AVG(`LIRR: Total Estimated Ridership`) AS avg_lirr_ridership,
AVG(`Metro-North: Total Estimated Ridership`) AS avg_metro_north_ridership
FROM
omgitsbeesdata.DailyRidership.Ridership
GROUP BY
day_of_week
ORDER BY
day_of_week;
7. Calculate Yearly Ridership Change
Calculate the percentage change in ridership per year to observe long-term trends.
WITH yearly_totals AS (
SELECT
EXTRACT(YEAR FROM Date) AS year,
SUM(`Subways: Total Estimated Ridership`) AS total_subway_ridership,
SUM(`Buses: Total Estimated Ridership`) AS total_bus_ridership
FROM
omgitsbeesdata.DailyRidership.Ridership
GROUP BY
year
)
SELECT
year,
total_subway_ridership,
total_bus_ridership,
LAG(total_subway_ridership) OVER (ORDER BY year) AS prev_year_subway_ridership,
LAG(total_bus_ridership) OVER (ORDER BY year) AS prev_year_bus_ridership,
ROUND((total_subway_ridership - LAG(total_subway_ridership) OVER (ORDER BY year)) / LAG(total_subway_ridership) OVER (ORDER BY year) * 100, 2) AS subway_ridership_change,
ROUND((total_bus_ridership - LAG(total_bus_ridership) OVER (ORDER BY year)) / LAG(total_bus_ridership) OVER (ORDER BY year) * 100, 2) AS bus_ridership_change
FROM
yearly_totals
ORDER BY
year;