-
Notifications
You must be signed in to change notification settings - Fork 1
/
Lesson5.Rmd
223 lines (180 loc) · 7.83 KB
/
Lesson5.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
---
params:
lesson: "Lesson 5"
title: "Turning code into reproducible documents with `Rmarkdown`"
functions: "`here`, `summarise`, `set_here`, `dr_here`"
packages: "`here`, `dplyr`, `readr`,`rmarkdown`"
# end inputs ---------------------------------------------------------------
header-includes: \usepackage{float}
always_allow_html: yes
output:
html_document:
code_folding: show
---
```{r, setup, echo = FALSE, cache = FALSE, include = FALSE}
options(width=100)
knitr::opts_chunk$set(
eval = FALSE, # run all code
echo = TRUE, # show code chunks in output
tidy = TRUE, # make output as tidy
message = FALSE, # mask all messages
warning = FALSE, # mask all warnings
comment = "",
tidy.opts=list(width.cutoff=100), # set width of code chunks in output
size="small" # set code chunk size
)
```
\
<!-- install packages -->
```{r, load packages, eval=T, include=T, cache=F, message=F, warning=F, results='hide',echo=F}
packages <- c("ggplot2","ggthemes","dplyr","tidyverse","zoo","RColorBrewer","viridis","plyr")
if (require(packages)) {
install.packages(packages,dependencies = T)
require(packages)
# load tvthemes
devtools::install_github("Ryo-N7/tvthemes")
}
lapply(packages,library,character.only=T)
```
<!-- ____________________________________________________________________________ -->
<!-- ____________________________________________________________________________ -->
<!-- ____________________________________________________________________________ -->
<!-- start body -->
# `r paste0(params$lesson,": ",params$title)`
\
Functions for `r params$lesson`
`r params$functions`
\
Packages for `r params$lesson`
`r params$packages`
\
# Agenda
Create reproducible HTML, PDF, and Word documents from R using `RMarkdown`.
* **[Download the `RMarkdown` template (right click > Save As)](https://github.com/darwinanddavis/EmoRyCodingClub/raw/master/Lesson5_rmd.Rmd).** Save this in the same folder as your coding club files. The file should have a *.Rmd* extension, e.g. **Lesson5_rmd.Rmd**.
* Open `RStudio` and run the following code to install the necessary packages:
```{r, eval=F}
packages <- c("pacman","rmarkdown", "here")
install.packages(packages,dependencies = T)
lapply(packages,require,character.only=T)
```
\
<!-- end yaml template------------------------------------------------------- -->
# Do First
Recreate the below plot using the larger NYC Airbnb dataset. Here's the code to add the title and labels.
```{r,eval=F}
url <- "http://data.insideairbnb.com/united-states/ny/new-york-city/2021-04-07/data/listings.csv.gz"
nyc_full <- readr::read_csv(url) # reads in data
ttl <- "NYC Airbnb availability"
subttl <- "for room types across neighbourhoods"
xlab <- "Availability of property in next 30 days"
ylab <- "Count"
# append this to your ggplot function with a `+`
labs(title = ttl,
subtitle = subttl,
x = xlab,
y = ylab)
```
```{r,echo=F,eval=T,cache=T}
url <- "http://data.insideairbnb.com/united-states/ny/new-york-city/2021-04-07/data/listings.csv.gz"
nyc_full <- readr::read_csv(url) # reads in data
```
```{r, echo=F, eval=T, out.width="100%"}
ttl <- "NYC Airbnb availability"
subttl <- "for room types across neighbourhoods"
xlab <- "Availability of property in next 30 days"
ylab <- "Count"
my_theme <- theme_tufte()
names(nyc_full)
# glimpse(nyc_full)
nyc_full_ <- nyc_full %>% dplyr::filter(availability_30 > 0)
ggplot() +
geom_bar(data = nyc_full_,
mapping = aes(x = availability_30,
color=neighbourhood_group_cleansed,
fill=neighbourhood_group_cleansed), show.legend = F) +
facet_grid(room_type ~ neighbourhood_group_cleansed) +
labs(title = ttl,
subtitle = subttl,
x = xlab,
y = ylab) +
my_theme
```
# Setting your working directory
Using `here` to set your working directory
```{r}
require(here) # load the here package if not already
set_here() # set current working directory
dr_here() # print current working directory where .here is located
```
**Example 1**
1. Create a folder in your local directory called **'lookhere'**.
2. Open a new file in TextEdit (Mac) or Notepad (Windows), type in something like " _We found the here file!_ ", then save the file as **"heretest.txt"**.
```{r}
require(readr)
# example 1
list.files() # print files in your current working dir. the 'heretest.txt' file is not here, but one folder below in the new 'lookhere' folder
here("lookhere","heretest.txt") # using here to forage for the file
here("lookhere","heretest.txt") %>% read_lines # print the contents in R
```
\
**Example 2**
1. Create another folder within the **lookhere** folder called **lookheretoo**
2. Create another .txt file called **heretesttoo.txt** and save it.
```{r}
# example 2
# create two folders
folder1 <- "lookhere"
folder2 <- "lookheretoo"
file <- "heretesttoo.txt"
# navigate to the /lookheretoo folder and open the heretesttoo file using 'here'
here(folder1,folder2,file) %>% read_lines
read_lines(here(folder1,folder2,file)) # non-tidy version
```
\
Why is `here` useful?
* You can step through sub folders by defining them individually as function inputs
* You can user define these subfolders as variables at the beginning of your `R` script and refer to them throughout your script without pasting, e.g. `paste(folder1,folder2,sep="/")`.
# Creating an `R` project (`Rproj`) file
**Option 1:** Create an RProject new directory:
* _File > New Project_
* _Create New Project_
* Choose a name for your RProj folder. Fill out _Directory name:_ (make it machine-friendly, i.e. no spaces)
* Choose a place for the RProj to live. _Browse_
* Select _Open in new session_
\
**Option 2:** If you already have a folder *just* for Emory Coding Club:
* _File > New Project_
* _Existing directory > Browse_
* Select _Open in new session_
\
* Open the 'Lesson5_rmd.Rmd' file within this project and save it as 'Lesson5_rmd.Rmd'.
* Open another _.R_ file within this project and save it.
* Now repeat the above steps with a second _Rproj_ file. Save this in another directory somewhere on your computer.
* Now check the working directory within both _Rproj_ files using the `here` package. What do you see?
# Navigating the `Rmarkdown` interface
Have a look through each section of the raw `.Rmd` file to see what the code does. You'll be using this template to create your own files, such as creating personal notes for future sessions in this coding club.
<!-- ----------------------- image --------------------------- -->
<div align="center"; text-align:center>
<img src="img/rmd.jpg" style=width:100%>
</div>
<!-- ----------------------- image --------------------------- -->
# Exercises
* Knit an HTML document. Use the hotkey `Cmd/Ctrl + Shift + K`
* Create a 'EmoRy Coding Club notes' document
* Create your own parameter in the yaml
* Create a code chunk that shows the code and output, but masks any warnings
* Create a code chunk with a plot of your own data or from the Airbnb data. Here's the NYC Airbnb data:
```{r}
# large Airbnb dataset (106 cols)
require(readr,dplyr)
url <- "http://data.insideairbnb.com/united-states/ny/new-york-city/2021-04-07/data/listings.csv.gz"
nyc_full <- read_csv(url) # reads in data
nyc_full %>% glimpse
```
* Change the above figure dimensions in the knitted document
* Create a table with the title in bold, one of the columns set to right justified text, and one of the rows italicised
* Write an inline equation and an indented equation
<!-- end body -->
<!-- ____________________________________________________________________________ -->
<!-- ____________________________________________________________________________ -->
<!-- ____________________________________________________________________________ -->