From d59e6e6ab232ef5a0b25b500dee25c7cb76f57b0 Mon Sep 17 00:00:00 2001 From: Luis Valente Date: Fri, 26 Jan 2024 13:31:57 +0100 Subject: [PATCH] trying to fix quarto --- .github/workflows/publish.yml | 2 +- _freeze/DAISIE/execute-results/html.json | 4 +- renv.lock | 56 +++++++++--------------- 3 files changed, 24 insertions(+), 38 deletions(-) diff --git a/.github/workflows/publish.yml b/.github/workflows/publish.yml index 8bc441c..5aa293b 100644 --- a/.github/workflows/publish.yml +++ b/.github/workflows/publish.yml @@ -19,7 +19,7 @@ jobs: with: use-public-rspm: true - - uses: r-lib/actions/setup-r-dependencies@v2 + - uses: r-lib/actions/setup-renv@v2 with: packages: any::rmarkdown diff --git a/_freeze/DAISIE/execute-results/html.json b/_freeze/DAISIE/execute-results/html.json index e592b85..198ba4c 100644 --- a/_freeze/DAISIE/execute-results/html.json +++ b/_freeze/DAISIE/execute-results/html.json @@ -1,7 +1,7 @@ { - "hash": "5340f2e979342273cbbc0054e4fb2959", + "hash": "15365cd7cbc1c79d7c8f161a68047080", "result": { - "markdown": "---\ntitle: \"DAISIE\"\nauthor: Luis Valente\n---\n\n\n## Fitting DAISIE models using the Galápagos birds as example {#sec-daisie}\n\n![Phylogenies of Galápagos birds](images/Galapagos_picture.png)\n\nIn this part of the practical, we will learn how to use DAISIE. As a demonstraction, we are going to fit the DAISIE model to phylogenetic data for species of native land birds of the Galápagos islands (data from Valente, Phillimore and Etienne 2015 *Ecology Letters*; and Valente et al 2020 *Nature*). We will use software DAISIE to estimate rates of speciation, colonisation and extinction in the Galápagos. We will simulate islands with these estimated rates to see how species diversity has varied through time in the Galápagos.\n\n### Prepare the R environment\n\nEmpty workspace of existing previous objects, just in case.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nrm(list=ls())\n```\n:::\n\n\n### Load the required packages:\n\n\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nlibrary(DAISIE)\nlibrary(ape)\n```\n:::\n\n\n### Load and visualise Galapágos bird data\n\nThe [dataset](data/galapagos_datalist.Rdata) of Galápagos birds includes colonisation and branching times for a total of 27 species of terrestrial birds of the Galápagos islands, distributed across 8 lineages. Some of the lineages have only 1 species, e.g. the Galápagos dove *Zenaida*. Others have radiated, including the Darwin's finch radiation (16 species) and the mockingbirds *Mimus* radiation (4 species). **Note**: the dataset we will use is slightly different from the example dataset that is included as data in the DAISIE R package.\n\n#### Load Galápagos DAISIE datalist\n\n**Download the Galápagos dataset file: [galapagos_datalist.Rdata](data/galapagos_datalist.Rdata)**\n\nStore it locally.\n\nLoad it, by seeting the path to the location of the file on your computer:\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nload(file=\"PATH_TO_YOUR_DATALIST_FILE\")\n```\n:::\n\n::: {.cell layout-align=\"center\"}\n\n:::\n\n\n#### View Galápagos datalist\n\nJust type:\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\ngalapagos_datalist\n```\n:::\n\n\n**This datalist is of the same format that DAISIEprep produces (e.g., same format as the `data_list` object that you just produced in the DAISIEprep tutorial).**\n\nView your Galápagos datalist by scrolling up and down in R. Or use function `View()`\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nView(galapagos_datalist)\n```\n:::\n\n\nYou can find detailed information using the R documentation file:\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\n? Galapagos_datalist \n```\n:::\n\n\nTo view the top of the list\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\ngalapagos_datalist[[1]]\n#> $island_age\n#> [1] 4\n#> \n#> $not_present\n#> [1] 992\n```\n:::\n\n\nThis element contains the island age as well as all the mainland species that do not have extant descendants on the island. These are important pieces of information for DAISIE.\n\nTo view just the *Mimus* colonisation:\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\ngalapagos_datalist[[4]]\n#> $colonist_name\n#> [1] \"Mimus\"\n#> \n#> $branching_times\n#> [1] 4.00000 3.99999 3.68000 2.93000 0.29000\n#> \n#> $stac\n#> [1] 6\n#> \n#> $missing_species\n#> [1] 0\n#> \n#> $type1or2\n#> [1] 1\n```\n:::\n\n\nThe first element of the `$branching_times` item is the island age, followed by the colonisation time and the branching times (if any). The `$stac` item is a code that tells DAISIE whether the lineage is endemic, non-endemic, and whether or not a precise age or a maximum-age should be used.\n\nOr to view just the Darwin's finches lineage:\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\ngalapagos_datalist[[3]]\n#> $colonist_name\n#> [1] \"Finches\"\n#> \n#> $branching_times\n#> [1] 4.0000 3.0280 1.3227 0.8223 0.4286 0.3462 0.2450 0.1180 0.0808 0.0756\n#> [11] 0.0527 0.0525 0.0327 0.0322 0.0221 0.0118\n#> \n#> $stac\n#> [1] 2\n#> \n#> $missing_species\n#> [1] 1\n#> \n#> $type1or2\n#> [1] 1\n```\n:::\n\n\nAs you can see, there is 1 missing species in the Darwin's finch lineage - this species was not sampled in the phylogeny, so does not have a corresponding branching time in the `$branching_times` item. However, `$missing_species` item tells DAISIE that it needs to consider one extra species in its computations.\n\n#### Visualise Galápagos data\n\nYou can visualize the data in a different way using:\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nDAISIE_plot_island(galapagos_datalist)\n```\n\n::: {.cell-output-display}\n![](DAISIE_files/figure-html/unnamed-chunk-7-1.png){fig-align='center' width=672}\n:::\n:::\n\n\nThe above plot shows the 8 colonisation events (independent lineages), the time of colonisation, number of species sampled in each lineage (n), number of missing species (m) and the times of cladogenetic speciation events (horizontal lines). The dashed horizontal line is the island age. Unfilled symbols indicate that the time of colonisation was set as \"max-age\" (not precise), that is, the colonisation time could have happened any time between the time plotted and the present. This applies for instance to lineages with stem ages older than the island age, whose age is then set to be the age of the island (as was the cases of genus *Mimus*, whose stem age is older than the island age of 4 million years).\n\n#### Plot age versus diversity\n\nTo visualise the diversity of each lineage according to their age:\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nDAISIE_plot_age_diversity(galapagos_datalist)\n```\n\n::: {.cell-output-display}\n![](DAISIE_files/figure-html/unnamed-chunk-8-1.png){fig-align='center' width=672}\n:::\n:::\n\n\nThe above plot shows the number of species per lineage plotted against their colonisation time. You can see that one lineage - Darwin's finches - stands out, it is not the oldest lineage, but it has 16 species, many more than the other lineages. One of the lineages, *Mimus*, is not plotted, because it's age is set as max-age, that is why the number of species is 23 (27-4 species=23).\\\n\n\\\n\n### Fitting DAISIE models\n\nWe will now fit three different DAISIE models using maximum likelihood, using the function `DAISIE_ML`. The DAISIE function searches for the parameters that maximise the likelihood of the data given the specified model. We will then compare the likelihoods of the 3 different models to see which one is preferred\n\nWe will fit the 3 different DAISIE models to the phylogenetic data contained in the object `galapagos_datalist`.\n\n\\\n\n#### Fit a full DAISIE model (M1)\n\nThe first model (M1) we will fit is a full DAISIE model with 5 parameters. The parameters in DAISIE are always placed in this order:\n\n1. Cladogenesis rate (lambda_c) (unit: cladogenesis events per island lineage per time unit)\n2. Extinction rate (mu) (unit: extinction events per island lineage per time unit)\n3. Carrying capacity (K) (unit: number of species)\n4. Colonisation rate (gamma) (unit: colonisation events per mainland lineage per time unit)\n5. Anagenesis rate (lambda_a) (unit: anagenesis events per island lineage per time unit)\n\nImportant settings of the `DAISIE_ML` function:\n\n- `initparsopt` refers to the starting parameters of the maximum likelihood optimisation (the search for the optimal parameters). 1.5 is the starting cladogenesis rate, 1.1 is the starting extinction rate, 20 is the initial carrying capacity (20 species), 0.009 is the colonisation rate; 1.1 is the initial anagenesis rate.\n\n- `idparsopt` shows the parameters to optimise, in this case all 5, so 1:5\n\n- `parsfix` and `idparsfix` are set to NULL because we are not fixing any parameters this time.\n\n**The code below takes a while to run.**\n\n\n::: {.cell layout-align=\"center\" hash='DAISIE_cache/html/ML_DAISIE_M1_9eb4703f5545cea53974e8fe74693e00'}\n\n```{.r .cell-code}\nDAISIE_ML(\n datalist = galapagos_datalist,\n initparsopt = c(1.5,1.1,20,0.009,1.1),\n ddmodel = 11,\n idparsopt = 1:5,\n parsfix = NULL,\n idparsfix = NULL\n)\n#> lambda_c mu K gamma lambda_a loglik df conv\n#> 1 1.258226 1.136924 9968506 0.004957378 1.251793e-06 -84.78145 5 0\n```\n:::\n\n\nThe output gives the ML parameters values, the loglikelihood, and the number of parameters (df).\\\n\n#### Fit model with no carrying-capacity (M2)\n\nAs you can see above, the value of K in M1 is very high (hundreds of thousands of species). This suggest that there may be no upper bound to diversity (so K is in fact infinite). So let's try a model without K as a free parameter.\n\nThe second model (M2) is just like M1, but we will fix K to be infinite. In comparison to M1, we now remove the 3rd parameter (carrying capacity) from the `initparsopt` and `idparsopt` settings. So we now have only parameters 1,2,4 and 5. We fix the parameter number 3 to `Inf` (carrying capacity K fixed to infinite):\n\n\n::: {.cell layout-align=\"center\" hash='DAISIE_cache/html/ML_DAISIE_M2_f93d819a0a592bae2d30b65da2b0bb1d'}\n\n```{.r .cell-code}\nDAISIE_ML(\n datalist = galapagos_datalist,\n initparsopt = c(1.5,1.1,0.009,1.1),\n idparsopt = c(1,2,4,5),\n parsfix = Inf,\n idparsfix = 3,\n ddmodel=0\n)\n#> lambda_c mu K gamma lambda_a loglik df conv\n#> 1 1.264389 1.149378 Inf 0.00505558 1.662578e-05 -84.78088 4 0\n```\n:::\n\n\n\\\n\n#### Fit model with no carrying capacity AND no anagenesis (M3)\n\nNow, it appears that the parameter of anagenesis is close to 0 in M2 (see above). Perhaps a model without anagenesis will perform well.\n\nThe third model (M3) is like M2, but with no carrying-capacity and no anagenesis. This time we switch off the 3rd and 5th parameters (carrying-capacity and anagenesis). We fix the 3rd parameter (K) to infinite; and the 5th parameter (anagenesis) to zero:\n\n\n::: {.cell layout-align=\"center\" hash='DAISIE_cache/html/ML_DAISIE_M3_8ab5814e7f8402e370f8cc211d94afb5'}\n\n```{.r .cell-code}\nDAISIE_ML(\n datalist = galapagos_datalist,\n initparsopt = c(1.5,1.1,0.009),\n idparsopt = c(1,2,4),\n parsfix = c(Inf,0),\n idparsfix = c(3,5),\n ddmodel=0\n)\n#> lambda_c mu K gamma lambda_a loglik df conv\n#> 1 1.263034 1.146225 Inf 0.005040353 0 -84.78082 3 0\n```\n:::\n\n\n\\\n\n#### Select the best model using AIC\n\nNow compare the log-likelihoods of the 3 models. They all have similar likelihoods, but one has fewer parameters, so it is preferred. Which model is it?\n\nA more proper way to compare the models is using AIC (or other information criteria), which takes into account the number of parameters and the loglikelihood.\n\nCreate new function to compute AIC values base on the likelihood values and number of parameters for each model.\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\n\nAIC_compare<- function(LogLik,k){\n aic <- (2*k)-(2*LogLik)\n return(aic)\n}\n```\n:::\n\n\nFill in the values from your data in this format:\n\n`AIC_compare(c(likelihood_M1,likelihood_M2,likelihood_M3), c(n_parameters_M1,n_parameters_M2,n_parameters_M3))`.\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nAICs<-AIC_compare(c(-84.78145 ,-84.78088, -84.78082),c(5,4,3))\n\nnames(AICs)<-c('M1','M2','M3')\nAICs\n#> M1 M2 M3 \n#> 179.5629 177.5618 175.5616\n```\n:::\n\n\nThe model with the **lowest** AIC value is the preferred model. (In this case, M3)\n\n\\\n\n### Simulate islands\n\nNow let's use the ML parameters we estimated from the Galápagos bird data to simulate islands. Choose the parameters of the best model. We will simulate for 4 million years (time = 4). M is the mainland pool (keep it at 1000).\n\n#### Simulate islands with the parameters estimated from the best model for the Galápagos bird data (takes a while to run, you can reduce number of replicates if you want)\n\n\n::: {.cell layout-align=\"center\" hash='DAISIE_cache/html/DAISIE_sim_a72e1df60291d4a58285413d4b8da095'}\n\n```{.r .cell-code}\nGalapagos_sims<-DAISIE_sim(\n time=4,\n M=1000,\n pars=c(1.26, 1.146, Inf, 0.005,0),\n replicates= 100,\n plot_sims = FALSE)\n```\n:::\n\n\n#### Plot the species-through-time plots resulting from the simulations\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nDAISIE_plot_sims(Galapagos_sims)\n```\n\n::: {.cell-output-display}\n![](DAISIE_files/figure-html/unnamed-chunk-9-1.png){fig-align='center' width=672}\n:::\n:::\n\n\n#### Play with different simulations settings\n\nPlay with different parameter settings in `DAISIE_sim` (e.g. changing extinction rate, colonisation rate, island age, mainland pool size, replicates). Is equilibrium reached?\n\n\\\n\n#### Comparison with bird data from the Azores islands\n\nNow let's brielfy compare with simulatiosn for a completely different dataset, of birds from the Azores archipelago (Atlantic Ocean). Load datatable containing Azores bird data from Valente et al 2017 (*Current Biology*)\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\ndata(Macaronesia_datalist)\nAzores<-Macaronesia_datalist[[1]]\n```\n:::\n\n\n##### Visualise Azores data\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nDAISIE_plot_island(Azores)\n```\n\n::: {.cell-output-display}\n![](DAISIE_files/figure-html/unnamed-chunk-11-1.png){fig-align='center' width=672}\n:::\n:::\n\n\nThree of the species are extinct and only known from fossils.\n\n##### Simulate Azores with pre-identified ML parameters\n\n\n::: {.cell layout-align=\"center\" hash='DAISIE_cache/html/DAISIE_sim_Azores_b105bcf16127a120f9ce0641c19374c9'}\n\n```{.r .cell-code}\nAzores_sims<-DAISIE_sim(\n time=6.3,\n M=300,\n pars=c(0,1.053151832,Inf,0.052148979,0.512939011),\n replicates= 100,\n plot_sims=FALSE)\n```\n:::\n\n\n\\\n\n##### Plot the species-through-time plot for the Azores from the resulting from the simulations\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nDAISIE_plot_sims(Azores_sims)\n```\n\n::: {.cell-output-display}\n![](DAISIE_files/figure-html/unnamed-chunk-12-1.png){fig-align='center' width=672}\n:::\n:::\n\n\nHow does the Azores plot differ from that of the Galápagos?\n\n\\\nThe end of the DAISIE practical. Well done! Now you are ready to work on the *Insula* exercise\n", + "markdown": "---\ntitle: \"DAISIE\"\nauthor: Luis Valente\n---\n\n\n## Fitting DAISIE models using the Galápagos birds as example {#sec-daisie}\n\n![Phylogenies of Galápagos birds](images/Galapagos_picture.png)\n\nIn this part of the practical, we will learn how to use DAISIE. As a demonstraction, we are going to fit the DAISIE model to phylogenetic data for species of native land birds of the Galápagos islands (data from Valente, Phillimore and Etienne 2015 *Ecology Letters*; and Valente et al 2020 *Nature*). We will use software DAISIE to estimate rates of speciation, colonisation and extinction in the Galápagos. We will simulate islands with these estimated rates to see how species diversity has varied through time in the Galápagos.\n\n### Prepare the R environment\n\nEmpty workspace of existing previous objects, just in case.\n\n\n::: {.cell}\n\n```{.r .cell-code}\nrm(list=ls())\n```\n:::\n\n\n### Load the required packages:\n\n\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nlibrary(DAISIE)\nlibrary(ape)\n```\n:::\n\n\n### Load and visualise Galapágos bird data\n\nThe [dataset](data/galapagos_datalist.Rdata) of Galápagos birds includes colonisation and branching times for a total of 27 species of terrestrial birds of the Galápagos islands, distributed across 8 lineages. Some of the lineages have only 1 species, e.g. the Galápagos dove *Zenaida*. Others have radiated, including the Darwin's finch radiation (16 species) and the mockingbirds *Mimus* radiation (4 species). **Note**: the dataset we will use is slightly different from the example dataset that is included as data in the DAISIE R package.\n\n#### Load Galápagos DAISIE datalist\n\n**Download the Galápagos dataset file: [galapagos_datalist.Rdata](data/galapagos_datalist.Rdata)**\n\nStore it locally.\n\nLoad it into R, by setting the path to the location of the file on your computer:\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nload(file=\"PATH_TO_YOUR_DATALIST_FILE\")\n```\n:::\n\n::: {.cell layout-align=\"center\"}\n\n:::\n\n\n#### View Galápagos datalist\n\nJust type:\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\ngalapagos_datalist\n```\n:::\n\n\n**This datalist is of the same format that DAISIEprep produces (e.g., same format as the `data_list` object that you just produced in the DAISIEprep tutorial).**\n\nView your Galápagos datalist by scrolling up and down in R. Or use function `View()`\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nView(galapagos_datalist)\n```\n:::\n\n\nYou can find detailed information using the R documentation file:\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\n? Galapagos_datalist \n```\n:::\n\n\nTo view the top of the list\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\ngalapagos_datalist[[1]]\n#> $island_age\n#> [1] 4\n#> \n#> $not_present\n#> [1] 992\n```\n:::\n\n\nThis element contains the island age as well as all the mainland species that do not have extant descendants on the island. These are important pieces of information for DAISIE.\n\nTo view just the *Mimus* colonisation:\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\ngalapagos_datalist[[4]]\n#> $colonist_name\n#> [1] \"Mimus\"\n#> \n#> $branching_times\n#> [1] 4.00000 3.99999 3.68000 2.93000 0.29000\n#> \n#> $stac\n#> [1] 6\n#> \n#> $missing_species\n#> [1] 0\n#> \n#> $type1or2\n#> [1] 1\n```\n:::\n\n\nThe first element of the `$branching_times` item is the island age, followed by the colonisation time and the branching times (if any). The `$stac` item is a code that tells DAISIE whether the lineage is endemic, non-endemic, and whether or not a precise age or a maximum-age should be used.\n\nOr to view just the Darwin's finches lineage:\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\ngalapagos_datalist[[3]]\n#> $colonist_name\n#> [1] \"Finches\"\n#> \n#> $branching_times\n#> [1] 4.0000 3.0280 1.3227 0.8223 0.4286 0.3462 0.2450 0.1180 0.0808 0.0756\n#> [11] 0.0527 0.0525 0.0327 0.0322 0.0221 0.0118\n#> \n#> $stac\n#> [1] 2\n#> \n#> $missing_species\n#> [1] 1\n#> \n#> $type1or2\n#> [1] 1\n```\n:::\n\n\nAs you can see, there is 1 missing species in the Darwin's finch lineage - this species was not sampled in the phylogeny, so does not have a corresponding branching time in the `$branching_times` item. However, `$missing_species` item tells DAISIE that it needs to consider one extra species in its computations.\n\n#### Visualise Galápagos data\n\nYou can visualize the data in a different way using:\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nDAISIE_plot_island(galapagos_datalist)\n```\n\n::: {.cell-output-display}\n![](DAISIE_files/figure-html/unnamed-chunk-7-1.png){fig-align='center' width=672}\n:::\n:::\n\n\nThe above plot shows the 8 colonisation events (independent lineages), the time of colonisation, number of species sampled in each lineage (n), number of missing species (m) and the times of cladogenetic speciation events (horizontal lines). The dashed horizontal line is the island age. Unfilled symbols indicate that the time of colonisation was set as \"max-age\" (not precise), that is, the colonisation time could have happened any time between the time plotted and the present. This applies for instance to lineages with stem ages older than the island age, whose age is then set to be the age of the island (as was the cases of genus *Mimus*, whose stem age is older than the island age of 4 million years).\n\n#### Plot age versus diversity\n\nTo visualise the diversity of each lineage according to their age:\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nDAISIE_plot_age_diversity(galapagos_datalist)\n```\n\n::: {.cell-output-display}\n![](DAISIE_files/figure-html/unnamed-chunk-8-1.png){fig-align='center' width=672}\n:::\n:::\n\n\nThe above plot shows the number of species per lineage plotted against their colonisation time. You can see that one lineage - Darwin's finches - stands out, it is not the oldest lineage, but it has 16 species, many more than the other lineages. One of the lineages, *Mimus*, is not plotted, because it's age is set as max-age, that is why the number of species is 23 (27-4 species=23).\\\n\n\\\n\n### Fitting DAISIE models\n\nWe will now fit three different DAISIE models using maximum likelihood, using the function `DAISIE_ML`. The DAISIE function searches for the parameters that maximise the likelihood of the data given the specified model. We will then compare the likelihoods of the 3 different models to see which one is preferred\n\nWe will fit the 3 different DAISIE models to the phylogenetic data contained in the object `galapagos_datalist`.\n\n\\\n\n#### Fit a full DAISIE model (M1)\n\nThe first model (M1) we will fit is a full DAISIE model with 5 parameters. The parameters in DAISIE are always placed in this order:\n\n1. Cladogenesis rate (lambda_c) (unit: cladogenesis events per island lineage per time unit)\n2. Extinction rate (mu) (unit: extinction events per island lineage per time unit)\n3. Carrying capacity (K) (unit: number of species)\n4. Colonisation rate (gamma) (unit: colonisation events per mainland lineage per time unit)\n5. Anagenesis rate (lambda_a) (unit: anagenesis events per island lineage per time unit)\n\nImportant settings of the `DAISIE_ML` function:\n\n- `initparsopt` refers to the starting parameters of the maximum likelihood optimisation (the search for the optimal parameters). 1.5 is the starting cladogenesis rate, 1.1 is the starting extinction rate, 20 is the initial carrying capacity (20 species), 0.009 is the colonisation rate; 1.1 is the initial anagenesis rate.\n\n- `idparsopt` shows the parameters to optimise, in this case all 5, so 1:5\n\n- `parsfix` and `idparsfix` are set to NULL because we are not fixing any parameters this time.\n\n**The code below takes a while to run.**\n\n\n::: {.cell layout-align=\"center\" hash='DAISIE_cache/html/ML_DAISIE_M1_9eb4703f5545cea53974e8fe74693e00'}\n\n```{.r .cell-code}\nDAISIE_ML(\n datalist = galapagos_datalist,\n initparsopt = c(1.5,1.1,20,0.009,1.1),\n ddmodel = 11,\n idparsopt = 1:5,\n parsfix = NULL,\n idparsfix = NULL\n)\n#> lambda_c mu K gamma lambda_a loglik df conv\n#> 1 1.258226 1.136924 9968506 0.004957378 1.251793e-06 -84.78145 5 0\n```\n:::\n\n\nThe output gives the ML parameters values, the loglikelihood, and the number of parameters (df).\\\n\n#### Fit model with no carrying-capacity (M2)\n\nAs you can see above, the value of K in M1 is very high (hundreds of thousands of species). This suggest that there may be no upper bound to diversity (so K is in fact infinite). So let's try a model without K as a free parameter.\n\nThe second model (M2) is just like M1, but we will fix K to be infinite. In comparison to M1, we now remove the 3rd parameter (carrying capacity) from the `initparsopt` and `idparsopt` settings. So we now have only parameters 1,2,4 and 5. We fix the parameter number 3 to `Inf` (carrying capacity K fixed to infinite):\n\n\n::: {.cell layout-align=\"center\" hash='DAISIE_cache/html/ML_DAISIE_M2_f93d819a0a592bae2d30b65da2b0bb1d'}\n\n```{.r .cell-code}\nDAISIE_ML(\n datalist = galapagos_datalist,\n initparsopt = c(1.5,1.1,0.009,1.1),\n idparsopt = c(1,2,4,5),\n parsfix = Inf,\n idparsfix = 3,\n ddmodel=0\n)\n#> lambda_c mu K gamma lambda_a loglik df conv\n#> 1 1.264389 1.149378 Inf 0.00505558 1.662578e-05 -84.78088 4 0\n```\n:::\n\n\n\\\n\n#### Fit model with no carrying capacity AND no anagenesis (M3)\n\nNow, it appears that the parameter of anagenesis is close to 0 in M2 (see above). Perhaps a model without anagenesis will perform well.\n\nThe third model (M3) is like M2, but with no carrying-capacity and no anagenesis. This time we switch off the 3rd and 5th parameters (carrying-capacity and anagenesis). We fix the 3rd parameter (K) to infinite; and the 5th parameter (anagenesis) to zero:\n\n\n::: {.cell layout-align=\"center\" hash='DAISIE_cache/html/ML_DAISIE_M3_8ab5814e7f8402e370f8cc211d94afb5'}\n\n```{.r .cell-code}\nDAISIE_ML(\n datalist = galapagos_datalist,\n initparsopt = c(1.5,1.1,0.009),\n idparsopt = c(1,2,4),\n parsfix = c(Inf,0),\n idparsfix = c(3,5),\n ddmodel=0\n)\n#> lambda_c mu K gamma lambda_a loglik df conv\n#> 1 1.263034 1.146225 Inf 0.005040353 0 -84.78082 3 0\n```\n:::\n\n\n\\\n\n#### Select the best model using AIC\n\nNow compare the log-likelihoods of the 3 models. They all have similar likelihoods, but one has fewer parameters, so it is preferred. Which model is it?\n\nA more proper way to compare the models is using AIC (or other information criteria), which takes into account the number of parameters and the loglikelihood.\n\nCreate new function to compute AIC values base on the likelihood values and number of parameters for each model.\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\n\nAIC_compare<- function(LogLik,k){\n aic <- (2*k)-(2*LogLik)\n return(aic)\n}\n```\n:::\n\n\nFill in the values from your data in this format:\n\n`AIC_compare(c(likelihood_M1,likelihood_M2,likelihood_M3), c(n_parameters_M1,n_parameters_M2,n_parameters_M3))`.\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nAICs<-AIC_compare(c(-84.78145 ,-84.78088, -84.78082),c(5,4,3))\n\nnames(AICs)<-c('M1','M2','M3')\nAICs\n#> M1 M2 M3 \n#> 179.5629 177.5618 175.5616\n```\n:::\n\n\nThe model with the **lowest** AIC value is the preferred model. (In this case, M3)\n\n\\\n\n### Simulate islands\n\nNow let's use the ML parameters we estimated from the Galápagos bird data to simulate islands. Choose the parameters of the best model. We will simulate for 4 million years (time = 4). M is the mainland pool (keep it at 1000).\n\n#### Simulate islands with the parameters estimated from the best model for the Galápagos bird data (takes a while to run, you can reduce number of replicates if you want)\n\n\n::: {.cell layout-align=\"center\" hash='DAISIE_cache/html/DAISIE_sim_a72e1df60291d4a58285413d4b8da095'}\n\n```{.r .cell-code}\nGalapagos_sims<-DAISIE_sim(\n time=4,\n M=1000,\n pars=c(1.26, 1.146, Inf, 0.005,0),\n replicates= 100,\n plot_sims = FALSE)\n```\n:::\n\n\n#### Plot the species-through-time plots resulting from the simulations\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nDAISIE_plot_sims(Galapagos_sims)\n```\n\n::: {.cell-output-display}\n![](DAISIE_files/figure-html/unnamed-chunk-9-1.png){fig-align='center' width=672}\n:::\n:::\n\n\n#### Play with different simulations settings\n\nPlay with different parameter settings in `DAISIE_sim` (e.g. changing extinction rate, colonisation rate, island age, mainland pool size, replicates). Is equilibrium reached?\n\n\\\n\n#### Comparison with bird data from the Azores islands\n\nNow let's brielfy compare with simulatiosn for a completely different dataset, of birds from the Azores archipelago (Atlantic Ocean). Load datatable containing Azores bird data from Valente et al 2017 (*Current Biology*)\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\ndata(Macaronesia_datalist)\nAzores<-Macaronesia_datalist[[1]]\n```\n:::\n\n\n##### Visualise Azores data\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nDAISIE_plot_island(Azores)\n```\n\n::: {.cell-output-display}\n![](DAISIE_files/figure-html/unnamed-chunk-11-1.png){fig-align='center' width=672}\n:::\n:::\n\n\nThree of the species are extinct and only known from fossils.\n\n##### Simulate Azores with pre-identified ML parameters\n\n\n::: {.cell layout-align=\"center\" hash='DAISIE_cache/html/DAISIE_sim_Azores_b105bcf16127a120f9ce0641c19374c9'}\n\n```{.r .cell-code}\nAzores_sims<-DAISIE_sim(\n time=6.3,\n M=300,\n pars=c(0,1.053151832,Inf,0.052148979,0.512939011),\n replicates= 100,\n plot_sims=FALSE)\n```\n:::\n\n\n\\\n\n##### Plot the species-through-time plot for the Azores from the resulting from the simulations\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nDAISIE_plot_sims(Azores_sims)\n```\n\n::: {.cell-output-display}\n![](DAISIE_files/figure-html/unnamed-chunk-12-1.png){fig-align='center' width=672}\n:::\n:::\n\n\nHow does the Azores plot differ from that of the Galápagos?\n\n\\\nThe end of the DAISIE practical. Well done! Now you are ready to work on the *Insula* exercise\n", "supporting": [], "filters": [ "rmarkdown/pagebreak.lua" diff --git a/renv.lock b/renv.lock index 0285fd3..f534708 100644 --- a/renv.lock +++ b/renv.lock @@ -1,7 +1,27 @@ { "R": { - "Version": "4.2.3", + "Version": "4.2.1", "Repositories": [ + { + "Name": "BioCsoft", + "URL": "https://bioconductor.org/packages/3.16/bioc" + }, + { + "Name": "BioCann", + "URL": "https://bioconductor.org/packages/3.16/data/annotation" + }, + { + "Name": "BioCexp", + "URL": "https://bioconductor.org/packages/3.16/data/experiment" + }, + { + "Name": "BioCworkflows", + "URL": "https://bioconductor.org/packages/3.16/workflows" + }, + { + "Name": "BioCbooks", + "URL": "https://bioconductor.org/packages/3.16/books" + }, { "Name": "CRAN", "URL": "https://packagemanager.posit.co/cran/latest" @@ -713,28 +733,6 @@ ], "Hash": "4657d320971a330ecffd0e1260f58214" }, - "ggimage": { - "Package": "ggimage", - "Version": "0.3.3", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "R", - "ggfun", - "ggplot2", - "ggplotify", - "grid", - "jsonlite", - "magick", - "methods", - "scales", - "tibble", - "tools", - "utils", - "withr" - ], - "Hash": "18c57347f13a654ef8683a6676042624" - }, "ggplot2": { "Package": "ggplot2", "Version": "3.4.3", @@ -1027,18 +1025,6 @@ ], "Hash": "001cecbeac1cff9301bdc3775ee46a86" }, - "magick": { - "Package": "magick", - "Version": "2.8.0", - "Source": "Repository", - "Repository": "CRAN", - "Requirements": [ - "Rcpp", - "curl", - "magrittr" - ], - "Hash": "88a1be45614781633df69ab666d66143" - }, "magrittr": { "Package": "magrittr", "Version": "2.0.3",