-
Notifications
You must be signed in to change notification settings - Fork 4
/
README.Rmd
336 lines (245 loc) · 17.7 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
---
output:
github_document:
toc: TRUE
editor_options:
markdown:
wrap: sentence
---
```{=html}
<!--
Copyright 2022 Province of British Columbia
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and limitations under the License.
-->
```
<!-- Edit the README.Rmd only!!! The README.md is generated automatically from README.Rmd. -->
# bcgov/envair: BC air quality data retrieval and analysis tool
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```
# bcgovr
[![img](https://img.shields.io/badge/Lifecycle-Maturing-007EC6)](https://github.com/bcgov/repomountie/blob/master/doc/lifecycle-badges.md) [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
## Overview
bcgov/envair is an R package developed by the BC Ministry of Environment and Climate Change Strategy Knowledge Management Branch/Environmental and Climate Monitoring Section (ENV/KMB/ECMS) air quality monitoring unit.
Package enables R-based retrieval and processing of [air quality monitoring data](https://envistaweb.env.gov.bc.ca/).
Output is now compatible with the popular [openair package](https://cran.r-project.org/web/packages/openair/openair.pdf).
## Installation
You can install `envair` directly from this GitHub repository.
To proceed, you will need the [remotes](https://cran.r-project.org/package=remotes) package:
```{r, eval=FALSE}
install.packages("remotes")
```
Next, install and load the `envair` package using `remotes::install_github()`:
```{r, eval=FALSE}
remotes::install_github("bcgov/envair")
library(envair)
```
## Features
- Retrieve data from the Air Quality Data Archive by specifying parameter (pollutant) or station.
Functions have the option to add Transboundary Flow Exceptional Event (TFEE) flags.
Data archive is located in the ENV's FTP server: <ftp://ftp.env.gov.bc.ca/pub/outgoing/AIR/>
- Generate annual metrics, data captures and statistical summaries following the Guidance Document on Achievement Determination and the Canadian Ambient Air Quality Standards
- Most of data processing functions can automatically process a parameter as input, or a dataframe of air quality data.
- Retrieve archived and current ventilation index data with options to generate a kml map.
## Functions
- `importBC_data()` Retrieves station or parameter data from specified year/s between 1980 and yesterday.
- for you can specify one or several parameters, or name of one or several stations
- if station is specified, output returns a wide table following the format of the openair package. It also renames all columns into lowercase letters, changes scalar wind speed to ws, and vector wind direction to wd. It also shifts the datetime to the time-beginning format
- if parameter is specified, output displays data from all air quality monitoring stations that reported this data.
- use *flag_TFEE = TRUE* to add a new boolean (TRUE/FALSE) column called *flag_tfee*. This option only works when you enter parameter (not station) in the parameter_or_station.
- use merge_Stations = TRUE to merge data from monitoring stations and corresponding alternative stations, especially in locations where the monitoring station was relocated. This function may also change the name of the air quality monitoring station.
- set *pad = TRUE* to pad missing dates, set *use_ws_vector = TRUE* to use vector wind speed instead of scalar, and set *use_openairformat = FALSE* to produce the original *non-openair* output.
- `importBC_data_avg()` Retrieves pollutant (parameter) data and performs statistical averaging based on the specified averaging_type
- function can retrieve 24-hour averages (24-hr), daily 1-hour maximum (d1hm), daily 8-hour maximum (d8hm), rolling 8-hour values (8-hr)
- it can also make annual summaries such as 98th percentile of daily 1-hour maximum, annual mean of 24-hour values. To perform annual summaries, the averaging_type should include "annual \<averaging/percentile\> \<1-hour or 24-hour or dxhm\>"
- function can calculate the number of times a certain value has been exceeded
+-------------------------------------------+------------------------------+--------------------------------------------------------------------------+
| Type of averaging | *averaging_type=* Syntax | Output description |
+===========================================+==============================+==========================================================================+
| 1-hr | "1-hr" | Outputs hourly data. No averaging done. |
+-------------------------------------------+------------------------------+--------------------------------------------------------------------------+
| Daily Average | "24-hr" | Outputs the daily (24-hour) values. |
+-------------------------------------------+------------------------------+--------------------------------------------------------------------------+
| Rolling 8-hour | "8-hr" | Outputs hourly values that were calculated from rolling 8-hour average. |
+-------------------------------------------+------------------------------+--------------------------------------------------------------------------+
| Daily 1-hour maximum | "d1hm" | Outputs daily values of the highest 1-hour concentration |
+-------------------------------------------+------------------------------+--------------------------------------------------------------------------+
| Daily 8-hour maximum | "d8hm" | Outputs the daily 8-hour maximum for each day |
+-------------------------------------------+------------------------------+--------------------------------------------------------------------------+
| Annual mean of 1-hour values | "annual mean 1-hr" | Outputs the average of all hourly values |
+-------------------------------------------+------------------------------+--------------------------------------------------------------------------+
| Annual mean of daily values | "annual mean 24-hr" | Outputs average of all daily values. |
| | | |
| | "annual mean \<avging\>" | |
+-------------------------------------------+------------------------------+--------------------------------------------------------------------------+
| Annual 98th percentile of 1-hour values | "annual 98p 1-hr" | Outputs the 98th percentile of the 1-hour values. |
| | | |
| | "annual \<xxp\> \<avging\> | |
+-------------------------------------------+------------------------------+--------------------------------------------------------------------------+
| 4th Highest daily 8-hour maximum | "annual 4th d8hm" | Outputs the 4th highest daily 8-hour maximum. |
| | | |
| | "annual \<rank\> \<avging\> | |
+-------------------------------------------+------------------------------+--------------------------------------------------------------------------+
| Number of daily values exceeding 28 µg/m3 | "exceed 28 24-hr" | Outputs the number of days where the 28 µg/m3 is exceeded |
| | | |
| | "exceed \<value\> \<avging\> | |
+-------------------------------------------+------------------------------+--------------------------------------------------------------------------+
| Number of exceedance to d8hm of 62ppb | "exceed 62 d8hm" | Outputs the number of days where the daily 8-hour maximum exceeds 62 ppb |
| | | |
| | "exceed \<value\> \<avging\> | |
+-------------------------------------------+------------------------------+--------------------------------------------------------------------------+
: List of possible values for the *averaging_type*.
- `get_stats()` Retrieves a statistical summary based on the default for the pollutant.
Currently applies to PM2.5, O3, NO2, and SO2.
Output includes data captures, annual metrics, and exceedances.
- `get_captures()` Calculates the data captures for a specified pollutant or dataframe.
Output is a long list of capture statistics such as hourly, daily, quarterly, and annual sumamaries.
- `listBC_stations()` Lists details of all air quality monitoring stations (active or inactive)
- `list_parameters()` Lists the parameters that can be imported by `importBC_data()`
- `importECCC_forecast()` Retrieves AQHI, PM2.5, PM10, O3, and NO2 forecasts from the ECCC datamart
- `get_venting_summary()` Summarizes the ventilation index, counting the number of GOOD, FAIR, or POOR days for the month
- `GET_VENTING_ECCC()` Retrieve the venting index FLCN39 from Environment and Climate Change Canada datamart or from the B.C.'s Open Data Portal
- `ventingBC_kml()` Creates a kml or shape file based on the 2019 OBSCR rules.
This incorporates venting index and sensitivity zones.
## Usage and Examples
#### `importBC_data()`
------------------------------------------------------------------------
##### Retrieving air quality data, include TFEE adjustment and combine related stations
> Use flag_TFEE = TRUE an merge_Stations = TRUE to produce a result that defines the TFEE and merges station and instruments, as performed during the CAAQS-reporting process.
```{r, eval=FALSE}
library(envair)
df_data <- importBC_data('pm25',years = 2015:2017, flag_TFEE = TRUE,merge_Stations = TRUE)
knitr::kable(df_data[1:4,])
```
##### Using *openair* package function on BC ENV data.
> By default, this function produces openair-compatible dataframe output.
> This renames *WSPD_VECT*,*WDIR_VECT* into *ws* and *wd*, changes pollutant names to lower case characters (e.g., *pm25*,*no2*,*so2*), and shifts the date from time-ending to time-beginning format.
> To use, specify station name and year/s.
> For a list of stations, use *listBC_stations()* function.
> If no year is specified, function retrieves latest data, typically the unverfied data from start of year to current date.
```{r, eval=FALSE}
library(openair)
PG_data <- importBC_data('Prince George Plaza 400',2010:2012)
pollutionRose(PG_data,pollutant='pm25')
```
```{r,echo = FALSE}
knitr::include_graphics('importBC_data.png')
```
##### Other features for station data retrieval
- To import without renaming column names, specify *use_openairformat = FALSE*. This also keeps date in time-ending format
- By default, *vector wind direction* and *scalar wind speeds* are used
- To use vector wind speed, use *use_ws_vector = TRUE*
- Station name is not case sensitive, and works on partial text match
- Multiple stations can be specified *c('Prince George','Kamloops')*
- For non-continuous multiple years, use *c(2010,2011:2014)*
```{r, eval = FALSE}
importBC_data('Prince George Plaza 400',2010:2012,use_openairformat = FALSE)
importBC_data('Kamloops',2015)
importBC_data(c('Prince George','Kamloops'),c(2010,2011:2014))
importBC_data('Trail',2015,pad = TRUE)
```
##### Retrieve parameter data
> Specify parameter name to retrieve data.
> Note that these are very large files and may use up your computer's resources.
> List of parameters can be found using the *list_parameters()* function.
```{r,eval=FALSE}
pm25_3year <- importBC_data('PM25',2010:2012)
```
#### `importBC_data_avg()`
------------------------------------------------------------------------
##### Retrieving the annual average of daily values for multiple parameters
> The function is capable of processing multiple parameters, and multiple years, but it can only do one averaging type.
> The averaging type can be a simple averaging (e.g., 24-hour, 8-hour, or combined averaging (e.g., annual 98p d1hm, annual mean 24-hour).
> Check the table above for a comprehensive list of averaging_type.
>
> ```{r, eval = FALSE}
> #user can specify the parameter, enter parameter name as input
> annual_mean <- importBC_data_avg(c('pm25','o3'), years = 2015:2018, averaging_type = 'annual mean 24-hr')
> >
> #or if you already have a dataframe, you can use the dataframe as input to for the statistical summary
> df_input <- importBC_data(param = c('pm25','o3'), years = 2015:2018)
> annual_mean <- importBC_data_avg(df_input,averaging_type = 'annual mean 24-hr')
> ```
#### `get_stats()`
------------------------------------------------------------------------
##### Calculate the annual metrics of PM2.5
> The function will calculate statistical summaries , data captures, and number of exceedances based on the CAAQS metrics and values.
> The function will only perform a year-by-year calculation so the results are not the actual CAAQS metrics, but can be used to derive it.
> For ozone, it creates Q2 + Q3
>
> ```{r, eval = FALSE}
> #example retrieves stat summaries
> stats_result <- get_stats(param = 'o3', years = 2016,add_TFEE = TRUE, merge_Stations = TRUE)
> >
> ```
#### `get_captures()`
------------------------------------------------------------------------
##### Calculate the data captures of PM2.5
> The function will create a summary of data captures for the parameter or for dataframe.
> You can specify the parameter or, if available, use an air quality dataframe as input.
>
> ```{r, eval = FALSE}
> #you can use the parameter as input
> data_captures <- get_captures(param = c('pm25','o3'), years = 2015:2018,merge_Stations = TRUE)
> >
> #or you can use a dataframe
> air_data <- importBC_data(c('pm25','o3'), years = 2015:201,merge_Stations = TRUE)
> data_captures <- get_captures(param = air_data, years = 2015:2018)
> ```
#### `listBC_stations()`
------------------------------------------------------------------------
> produces a dataframe that lists all air quality monitoring station details.
> if year is specified, it retrieves the station details from that year.
> Note that this entry may not be accurate since system has not been in place to generate these station details.
```{r, eval = FALSE}
listBC_stations()
listBC_stations(2016)
```
```{r, echo = FALSE,results='hide',message=FALSE,warning=FALSE}
table_ <- envair::listBC_stations()
```
```{r, echo= FALSE}
knitr::kable(table_[1:4,])
```
#### `list_parameters()`
------------------------------------------------------------------------
> produce a vector string of available parameters that can be retrieved with *importBC_data()*
#### `GET_VENTING_ECCC()`
------------------------------------------------------------------------
> produces a dataframe containing the recent venting index.
```{r, eval = FALSE}
GET_VENTING_ECCC()
GET_VENTING_ECCC('2019-11-08')
GET_VENTING_ECCC((dates = seq(from = lubridate::ymd('2021-01-01'),
to = lubridate::ymd('2021-05-01'), by = 'day')))
```
```{r, echo = FALSE,results='hide',message=FALSE,warning=FALSE}
table_ <- envair::GET_VENTING_ECCC()
```
```{r, echo = FALSE}
knitr::kable(table_[1:4,])
```
#### `importECCC_forecast()`
------------------------------------------------------------------------
- Retrieves forecasts and model data from ECCC
- parameters include AQHI, PM25, NO2, O3, PM10
```{r, eval = FALSE}
importECCC_forecast('no2')
```
#### `ventingBC_kml()`
------------------------------------------------------------------------
- creates a kml object based on the 2019 OBSCR rules
- directory to save kml file can be specified. File will be saved in that directory as *Venting_Index_HD.kml*.
```{r, eval = FALSE}
ventingBC_kml()
ventingBC_kml('C:/temp/')
```