-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
130 lines (90 loc) · 7.38 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
---
title: "samplelim : MCMC sampling algorithms for linear inverse models in R"
output: github_document
always_allow_html: true
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, eval = TRUE)
```
![](https://img.shields.io/badge/lifecycle-maturing-blue.svg)
[![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)
[![License: LGPL v3](https://img.shields.io/badge/License-LGPL%20v3-blue.svg)](https://www.gnu.org/licenses/lgpl-3.0)
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/regexplain)](https://cran.r-project.org/package=samplelim)
The package `{samplelim}` for the R statistical software provides efficient implementations (C++ encoded) of Monte Carlo Markov Chains (MCMC) algorithms for uniformly sampling high-dimensional polytopes.
It is particularly aimed at handling linear inverse models (LIM) in metabolic (trophic, biochemical or urban) networks.
Some support functions are also included to facilitate its use by practitioners.
## Objective
`{samplelim}` provides efficient implementations of two MCMC algorithms for sampling high-dimensional polytopes, the Mirror Walk (MiW) introduced by Van Oevelen _et al._ (2010) and the Billard Walk (BiW) introduced by Polyak and Gryazina (2014).
Thanks to the inclusion of easy to use support functions for linear inverse modeling of metabolic networks, `{samplelim}` can be viewed as an updated, extended and low-level-encoded version of the R package [`{limsolve}`](https://cran.r-project.org/web/packages/limSolve/index.html).
`{samplelim}` is built upon the C++ library [`{volesti}`](https://github.com/GeomScale/volesti): the source code of the R package `{volesti}` 1.1.2-6 has been forked from its [GitHub repository](https://github.com/GeomScale/volesti/releases/tag/v1.1.2-6) as a basis for developing `{samplelim}`.
The C++ library `{volesti}` provides efficient implementations of different MCMC algorithms for sampling high-dimensional polytopes and estimating their volumes. Its R package counterpart proposes a subset of these algorithms among which the BiW.
`{samplelim}` aims at combining the performance of `{volesti}` and the convenient features of `{limsolve}`. Precisely :
* the MiW is implemented. It is an optimized, C++ encoded version of the algorithm coded in pure R programming language in `{limsolve}`;
* `{samplelim}` includes and slightly modifies the BiW from `{volesti}` 1.1.2-6 -- the uniform distribution of the path length in `{volesti}` is replaced by the exponential distribution, as originally suggested by Polyak and Gryazina (2014);
* support functions allowing an easy handling by users are included. These functions are updated/modified versions of the ones present in `{limsolve}`.
## Installation
The package `{samplelim}` is not yet available on CRAN mirrors. Its source code is available on GitHub. It can be installed from its repository by executing the following chunk in any R console.
```{r eval = FALSE}
# Install package remotes if not already installed
if (! "remotes" %in% installed.packages()[,"Package"]) {install.packages("remotes")}
# Install package samplelim from its GitHub repo
remotes::install_github("https://github.com/pregnault/samplelim")
```
## Typical workflow
The workflow of `{samplelim}` is greatly inspired by the one of `{limsolve}`. The main front-end function of `{samplelim}` is `rlim()`, performing uniform sampling in the polytope associated to LIM, by means of MCMC algorithm. Its main, mandatory, argument is `lim`, a list or an object of class `lim` (introduced in `{limsolve}`) encompassing the description of the polytope to be sampled. This list or lim object can be defined manually or, more suitably, from a description file, as illustrated in the following chunk.
```{r readdf}
# Load package
library("samplelim")
# Find path to the example of declaration file (DF) included in samplelim
DF <- system.file("extdata", "DeclarationFileBOWF-short.txt", package = "samplelim")
# Read DF and create a lim object
BOWF <- df2lim(DF)
```
Then, sampling is performed by a simple call to the function `rlim()`, as follows.
```{r}
sample <- rlim(lim = BOWF,
seed = 123, # Set the seed of PRNG
nsamp = 5000) # Number of points in the returned sample
# The points are presented in an N*n matrix, where
# N is the number of sampled points (here, 5000, default = 3000)
# n is the number of flows (ambiant space of the polytope)
dim(sample)
```
Diagnostics on sampling performances can then be performed using package `{coda}`, e.g., the Raftery and Lewis diagnostics, as illustrated below.
```{r}
coda::raftery.diag(data = sample)
```
A complete user guide will soon be included in the package, in the form of a vignette.
A comparison study of computation time and sampling quality, between the implementations of the MiW of `{samplelim}`, the BiW of `{volesti}` (1.1.2-6) and the Coordinate Hit-and-Run with Rounding (CHRR) of the MatLab library `{COBRA}` is available in Girardin _et al._ (2024).
## Credits
The modifications of the core C++ implementation of the BiW and the C++ implementation of the MiW, have been performed by Matthieu DIEN and Théo GRENTE.
The R packaging has been performed by Théo GRENTE and Philippe REGNAULT.
The Declaration File in `inst/extdata` has been produced by Quentin NOGUÈS; see Noguès _at al._ (2020) for details on the ecological network it relies on.
We refer to the [`credits.md` file](https://github.com/GeomScale/volesti/blob/v1.1.1/doc/credits.md) of the R package `{volesti}` for additional information on the development of the core C++ implementation of the BiW (and other algorithms implemented in `{volesti}`).
The function `df2lim()` (reading and formatting a declaration file) is a wrapped copy of functions `Read()` and `Setup()` from package `{LIM}`, developped by Karline SOETAERT. It has been added to the present package to limit its dependency tree.
## Licensing
You may redistribute or modify the software under the [GNU Lesser General Public License](LICENSE.md) as published by Free Software Foundation, either version 3 of the License, or any later version. It is distributed in the hope that it may be helpful to the community, but WITHOUT ANY WARRANTY.
## References
V. Girardin, T. Grente, N. Niquil and P. Regnault,
_Comparing and updating R packages using MCMC Algorithms
for Linear Inverse Modeling of Metabolic Networks_,
[hal-04455831](https://hal.science/hal-04455831)
(2024).
Q. Noguès, A. Raoux, E. Araignous, T. Hattab, B. Leroy, F. Ben Rais Lasram, F. Le Loc’h,
J. Dauvin and N. Niquil,
_Cumulative effects of marine renewable energy and climate change on
ecosystem properties: Sensitivity of ecological network analysis_,
Ecological Indicators __121__, 107128
(2020).
L. Cales, A. Chalkis, I.Z. Emiris and V. Fisikopoulos,
_Practical volume computation of structured convex bodies, and an application to modeling portfolio dependencies and financial crises_,
Proc. of Symposium on Computational Geometry, Budapest, Hungary
(2018).
B.T. Polyak and E.N. Gryazina,
_Billiard walk - a new sampling algorithm for control and optimization_,
IFAC Proceedings Volumes, __47(3)__, 6123-6128
(2014).
D. Van Oevelen, K. Van den Meersche, F. J. R. Meysman, K. Soetaert, J. J. Middelburg and
A. F. Vézina,
_Quantifying Food Web Flows Using Linear Inverse Models_, Ecosystems __13__, 32-45
(2010).