Commit

figure sizes
dicook committed May 3, 2024
1 parent 9420ef9 commit 7bf40f2
Showing 6 changed files with 31 additions and 19 deletions.
4 changes: 2 additions & 2 deletions docs/search.json

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions docs/sitemap.xml
@@ -110,11 +110,11 @@
</url>
<url>
<loc>https://iml.numbat.space/week10/tutorialsol.html</loc>
<lastmod>2024-05-03T00:04:26.272Z</lastmod>
<lastmod>2024-05-03T00:12:27.655Z</lastmod>
</url>
<url>
<loc>https://iml.numbat.space/week10/tutorial.html</loc>
<lastmod>2024-05-03T00:04:26.272Z</lastmod>
<lastmod>2024-05-03T00:12:27.655Z</lastmod>
</url>
<url>
<loc>https://iml.numbat.space/week12/index.html</loc>
8 changes: 4 additions & 4 deletions docs/week10/tutorial.html
@@ -339,7 +339,7 @@ <h4 class="anchored" data-anchor-id="how-would-you-cluster-this-data">1. How wou
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="tutorial_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:100.0%"></p>
<p><img src="tutorial_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:60.0%"></p>
</figure>
</div>
</div>
@@ -388,7 +388,7 @@ <h4 class="anchored" data-anchor-id="clustering-spotify-data-with-k-means">2. Cl
<div class="cell-output-display">
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="tutorial_files/figure-html/unnamed-chunk-8-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:100.0%"></p>
<p><img src="tutorial_files/figure-html/unnamed-chunk-8-1.png" class="img-fluid quarto-figure quarto-figure-center figure-img" style="width:60.0%"></p>
</figure>
</div>
</div>
@@ -400,8 +400,8 @@ <h4 class="anchored" data-anchor-id="clustering-spotify-data-with-k-means">2. Cl
<li>Divide the data into 11 clusters, and examine the number of songs in each. Using plotly, mouse over the resulting plot and explore songs belonging to a cluster. (I don’t know much about these songs, but if you are a music fan maybe discussing with other class members and your tutor about the groupings, like which ones are grouped in clusters with high <code>liveness</code>, high <code>tempo</code> or <code>danceability</code> could be fun.)</li>
</ol>
</section>
<section id="clustering-several-simulated-data-sets-with-know-cluster-structure" class="level4">
<h4 class="anchored" data-anchor-id="clustering-several-simulated-data-sets-with-know-cluster-structure">3. Clustering several simulated data sets with know cluster structure</h4>
<section id="clustering-several-simulated-data-sets-with-known-cluster-structure" class="level4">
<h4 class="anchored" data-anchor-id="clustering-several-simulated-data-sets-with-known-cluster-structure">3. Clustering several simulated data sets with known cluster structure</h4>
<ol type="a">
<li>In tutorial of week 3 you used the tour to visualise the data sets <code>c1</code> and <code>c3</code> provided with the <code>mulgar</code> package. Review what you said about the structure in these data sets, and write down your expectations for how a cluster analysis would divide the data.</li>
</ol>
8 changes: 7 additions & 1 deletion docs/week10/tutorial.qmd
@@ -90,6 +90,7 @@ a. How would you cluster this data?

```{r}
#| echo: false
#| out-width: 60%
set.seed(840)
challenge <- sphere.hollow(p=2, n=200)$points |> as_tibble()
challenge[1:100, ] <- challenge[1:100, ] * 2.2
@@ -110,6 +111,7 @@ My distance will use the radial distance from $(0, 0)$. You first convert each p


```{r}
#| out-width: 60%
mydist <- function(x1, x2) {
d <- abs(sqrt(sum(x1^2)) - sqrt(sum(x2^2)))
return(d)
@@ -154,6 +156,7 @@ Many of the variables have skewed distributions. For cluster analysis, why might
::: unilur-solution

```{r}
#| out-width: 60%
p <- spotify_std |>
pivot_longer(danceability:artist_popularity,
names_to="var", values_to="value") |>
@@ -173,6 +176,7 @@ Yes, for example "Free Bird" and "Sparkle" could be found by simply examining a
b. Transform the skewed variables to be as symmetric as possible, and then fit a $k=3$-means clustering. Extract and report these metrics: `totss`, `tot.withinss`, `betweenss`. What is the ratio of within to between SS?

```{r}
#| out-width: 60%
# Transforming some variables: imperfect
spotify_tf <- spotify |>
mutate(speechiness = log10(speechiness),
@@ -196,6 +200,7 @@ spotify_tf |>
::: unilur-solution

```{r}
#| out-width: 60%
# Check that it clusters
set.seed(131)
spotify_km3 <- kmeans(spotify_tf[,5:15], 3)
@@ -220,6 +225,7 @@ c. Now the algorithm $k=1, ..., 20$. Extract the metrics, and plot the ratio of
::: unilur-solution

```{r}
#| out-width: 60%
# Run many k
spotify_km <-
tibble(k = 1:20) %>%
@@ -267,7 +273,7 @@ spotify_assign_df |>
```
:::

#### 3. Clustering several simulated data sets with know cluster structure
#### 3. Clustering several simulated data sets with known cluster structure

a. In tutorial of week 3 you used the tour to visualise the data sets `c1` and `c3` provided with the `mulgar` package. Review what you said about the structure in these data sets, and write down your expectations for how a cluster analysis would divide the data.

18 changes: 9 additions & 9 deletions docs/week10/tutorialsol.html

Large diffs are not rendered by default.

8 changes: 7 additions & 1 deletion week10/tutorial.qmd
@@ -90,6 +90,7 @@ a. How would you cluster this data?

```{r}
#| echo: false
#| out-width: 60%
set.seed(840)
challenge <- sphere.hollow(p=2, n=200)$points |> as_tibble()
challenge[1:100, ] <- challenge[1:100, ] * 2.2
@@ -110,6 +111,7 @@ My distance will use the radial distance from $(0, 0)$. You first convert each p


```{r}
#| out-width: 60%
mydist <- function(x1, x2) {
d <- abs(sqrt(sum(x1^2)) - sqrt(sum(x2^2)))
return(d)
@@ -154,6 +156,7 @@ Many of the variables have skewed distributions. For cluster analysis, why might
::: unilur-solution

```{r}
#| out-width: 60%
p <- spotify_std |>
pivot_longer(danceability:artist_popularity,
names_to="var", values_to="value") |>
@@ -173,6 +176,7 @@ Yes, for example "Free Bird" and "Sparkle" could be found by simply examining a
b. Transform the skewed variables to be as symmetric as possible, and then fit a $k=3$-means clustering. Extract and report these metrics: `totss`, `tot.withinss`, `betweenss`. What is the ratio of within to between SS?

```{r}
#| out-width: 60%
# Transforming some variables: imperfect
spotify_tf <- spotify |>
mutate(speechiness = log10(speechiness),
@@ -196,6 +200,7 @@ spotify_tf |>
::: unilur-solution

```{r}
#| out-width: 60%
# Check that it clusters
set.seed(131)
spotify_km3 <- kmeans(spotify_tf[,5:15], 3)
@@ -220,6 +225,7 @@ c. Now the algorithm $k=1, ..., 20$. Extract the metrics, and plot the ratio of
::: unilur-solution

```{r}
#| out-width: 60%
# Run many k
spotify_km <-
tibble(k = 1:20) %>%
@@ -267,7 +273,7 @@ spotify_assign_df |>
```
:::

#### 3. Clustering several simulated data sets with know cluster structure
#### 3. Clustering several simulated data sets with known cluster structure

a. In tutorial of week 3 you used the tour to visualise the data sets `c1` and `c3` provided with the `mulgar` package. Review what you said about the structure in these data sets, and write down your expectations for how a cluster analysis would divide the data.

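The `mydist` hunk in `tutorial.qmd` shows only the distance function itself. As a rough illustration of how such a radial distance might be used for clustering, here is a minimal sketch in base R. The toy data (two concentric rings with radii 1 and 2.2), the helper `radial_dist_matrix`, and the choice of single linkage are all assumptions made for this example, not taken from the tutorial.

```r
# Radial distance between two points, as defined in the tutorial hunk:
# the absolute difference of their distances from the origin.
mydist <- function(x1, x2) {
  abs(sqrt(sum(x1^2)) - sqrt(sum(x2^2)))
}

# Hypothetical helper (not in the tutorial): pairwise radial distances
# between the rows of a numeric matrix, returned as a "dist" object.
radial_dist_matrix <- function(X) {
  n <- nrow(X)
  D <- matrix(0, n, n)
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      D[i, j] <- mydist(X[i, ], X[j, ])
    }
  }
  as.dist(D)
}

# Toy data (assumption): 20 points on each of two concentric rings.
theta <- seq(0, 2 * pi, length.out = 21)[-21]
inner_ring <- cbind(cos(theta), sin(theta))
outer_ring <- 2.2 * inner_ring
X <- rbind(inner_ring, outer_ring)

# Hierarchical clustering on the radial distances, cut into two clusters.
hc <- hclust(radial_dist_matrix(X), method = "single")
cl <- cutree(hc, k = 2)

# Cross-tabulate cluster labels against the ring each point came from.
table(cluster = cl, ring = rep(c("inner", "outer"), each = 20))
```

Because points on the same ring are at (near) zero radial distance from each other, the two rings should separate cleanly here, which Euclidean distance with k-means would not be expected to achieve on this shape.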
