Met4j command-line toolbox for metabolic networks
git clone https://forgemia.inra.fr/metexplore/met4j.git;
cd met4j;
mvn clean install
cd met4j-toolbox
mvn clean package
The executable jar can also be downloaded from the met4j GitLab registry.
The toolbox can be launched using
java -jar target/met4j-toolbox-<version>.jar
which will list all the contained applications that can be called using
java -cp target/met4j-toolbox-<version>.jar <Package>.<App name> -h
Log4j output from JSBML can be very verbose. You can silence it by adding this option:
java -Dlog4j.configuration= -cp target/met4j-toolbox-<version>.jar ...
You need at least singularity v3.5.
singularity pull met4j-toolbox.sif oras://registry.forgemia.inra.fr/metexplore/met4j/met4j-singularity:latest
If you want a specific version:
singularity pull met4j-toolbox.sif oras://registry.forgemia.inra.fr/metexplore/met4j/met4j-singularity:x.y.z
If you want the latest development version:
singularity pull met4j-toolbox.sif oras://registry.forgemia.inra.fr/metexplore/met4j/met4j-singularity:develop
To build the Singularity image yourself:
cd met4j-toolbox
mvn package
cd ../
singularity build met4j-toolbox.sif met4j.singularity
Pulling or building produces a Singularity container, met4j-toolbox.sif, that you can launch directly.
To list all the apps:
met4j-toolbox.sif
To launch a specific app, prefix its name with the last component of its package name. For instance:
met4j-toolbox.sif convert.Tab2Sbml -h -in fic.tsv -sbml fic.sbml
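
The naming rule above can be sketched in Python. The helper `singularity_app_name` is hypothetical and not part of met4j; it only applies the stated rule (last package component, a dot, then the app name) to a fully qualified package name.

```python
def singularity_app_name(package: str, app: str) -> str:
    """Return '<last package component>.<app>', as expected by the container."""
    # Hypothetical helper: keep only what follows the last dot of the package.
    last_component = package.rsplit(".", 1)[-1]
    return f"{last_component}.{app}"


if __name__ == "__main__":
    print(singularity_app_name(
        "fr.inrae.toulouse.metexplore.met4j_toolbox.convert", "Tab2Sbml"))
    # prints convert.Tab2Sbml
```

The resulting string is what is passed as the first argument to met4j-toolbox.sif.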
By default, Singularity does not see directories that are not descendants of your home directory. To access directories outside your home directory, set the SINGULARITY_BIND environment variable. At a minimum, to access the data in the default reference directory, specify the following.
In bash:
export SINGULARITY_BIND=/db
In csh or tcsh:
setenv SINGULARITY_BIND /db
First install Docker.
Pull the latest met4j image:
sudo docker pull metexplore/met4j:latest
If you want a specific version:
sudo docker pull metexplore/met4j:x.y.z
If you want the develop version:
sudo docker pull metexplore/met4j:develop
To build the Docker image yourself:
cd met4j-toolbox
mvn package
cd ../
sudo docker build -t metexplore/met4j:myversion .
To list all the apps:
sudo docker run metexplore/met4j:latest met4j.sh
Don't forget to map volumes when you want to process local files. Example:
sudo docker run -v /home/lcottret/work:/work \
metexplore/met4j:latest met4j.sh convert.Sbml2Tab \
-in /work/toy_model.xml -out /work/toy_model.tsv
If you change the working directory, you have to specify "sh /usr/bin/met4j.sh":
sudo docker run -w /work -v /home/lcottret/work:/work \
metexplore/met4j:latest sh /usr/bin/met4j.sh convert.Sbml2Tab \
-in toy_model.xml -out toy_model.tsv
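
The two Docker invocations above differ only in whether the working directory is changed. A minimal Python sketch that assembles either command line is given below; the function `docker_met4j_command` and its parameters are hypothetical, and it uses only the Docker options and paths shown in the examples above.

```python
import shlex


def docker_met4j_command(host_dir, app, app_args,
                         image="metexplore/met4j:latest",
                         mount_point="/work", set_workdir=False):
    """Build the argument list for running a met4j app on files in host_dir."""
    cmd = ["sudo", "docker", "run"]
    if set_workdir:
        # Make the mounted volume the working directory inside the container.
        cmd += ["-w", mount_point]
    # Map the local directory to the mount point inside the container.
    cmd += ["-v", f"{host_dir}:{mount_point}", image]
    # When the working directory is changed, the script is called explicitly.
    cmd += ["sh", "/usr/bin/met4j.sh"] if set_workdir else ["met4j.sh"]
    return cmd + [app] + list(app_args)


if __name__ == "__main__":
    cmd = docker_met4j_command(
        "/home/lcottret/work", "convert.Sbml2Tab",
        ["-in", "toy_model.xml", "-out", "toy_model.tsv"], set_workdir=True)
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run` as is; with `set_workdir=False`, file arguments must use paths under the mount point (for instance /work/toy_model.xml).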
Galaxy wrappers for met4j-toolbox apps are available in the Galaxy toolshed (master version) and in the Galaxy test toolshed (develop version). The wrappers launch the met4j Singularity container, so the server hosting your Galaxy instance must have Singularity installed.
Package fr.inrae.toulouse.metexplore.met4j_toolbox | |
---|---|
GenerateGalaxyFiles | Create the Galaxy file tree containing met4j-toolbox app wrappers. Creates one directory per app, containing the app's Galaxy XML wrapper.
|
Package fr.inrae.toulouse.metexplore.met4j_toolbox.attributes | |
---|---|
DecomposeSBML | Parse an SBML file and list its component entities: metabolites, reactions, genes, pathways and compartments. The output file is a TSV with two columns, one with entity identifiers and one with the entity type. If no entity type is selected, all of them are included by default. Only identifiers are written; attributes can be extracted with dedicated apps or with Sbml2Tab.
|
ExtractPathways | Extract pathway(s) from a genome-scale metabolic network (GSMN): from an SBML file, create a sub-network SBML file including only a selection of pathways.
|
ExtractSbmlAnnot | Extract database references from SBML annotations or notes. The references are exported as a tabulated file with one column containing the SBML compound, reaction or gene identifiers, and one column containing the corresponding database identifiers. The name of the targeted database must be given in the same form as the one used in the notes field or in the identifiers.org URI.
|
GetGenesFromReactions | Get the list of associated genes from a list of reactions and a GSMN. Parses the GSMN's GPR annotations and outputs a tab-separated file with one row per gene: the associated reaction identifier from the input file in the first column, and the gene identifier in the second column.
|
GetReactantsFromReactions | Get reactant lists from a list of reactions and a GSMN. Outputs a tab-separated file with one row per reactant: reaction identifiers in the first column, reactant identifiers in the second column. It can provide substrates, products, or both (the default). For reversible reactions, all reactants are considered both substrates and products.
|
SbmlSetChargesFromFile | Set the charge of network metabolites from a tabulated file containing the metabolite ids and the charges. The charge must be a number. The ids must correspond between the tabulated file and the SBML file. If the prefix or suffix is different in the SBML file, use the -p or the -s options. The charge will be written in the SBML file in two locations: in the reaction notes (e.g. charge: -1) and as an fbc attribute (e.g. fbc:charge="1").
|
SbmlSetEcsFromFile | Set EC numbers on reactions from a tabulated file containing the reaction ids and the EC numbers. The ids must correspond between the tabulated file and the SBML file. If the prefix R_ is present in the ids in the SBML file and not in the tabulated file, use the -p option. The EC number will be written in the SBML file in two locations: in the reaction notes (e.g. EC_NUMBER: 2.4.2.14) and as a reaction annotation.
|
SbmlSetFormulasFromFile | Set the formula of network metabolites from a tabulated file containing the metabolite ids and the formulas. The ids must correspond between the tabulated file and the SBML file. If the prefix or suffix is different in the SBML file, use the -p or the -s options. The formula will be written in the SBML file in two locations: in the metabolite notes (e.g. formula: C16H29O2) and as an fbc attribute (e.g. fbc:chemicalFormula="C16H29O2").
|
SbmlSetGprsFromFile | Create a new SBML file from an original SBML file and a tabulated file containing reaction ids and gene associations written in the COBRA style. The ids must correspond between the tabulated file and the SBML file. If the prefix R_ is present in the ids in the SBML file and not in the tabulated file, use the -p option. GPRs must be written in the COBRA style in the tabulated file, as described in Schellenberger et al. 2011, Nature Protocols 6(9):1290-307. The GPR will be written in the SBML file in two locations: in the reaction notes (e.g. GENE_ASSOCIATION: ( XC_0401 ) OR ( XC_3282 )) and as an fbc gene product association.
|
SbmlSetIdsFromFile | Set new ids on network objects from a tabulated file containing the old ids and the new ids. The ids must correspond between the tabulated file and the SBML file. If the prefix or suffix is different in the SBML file, use the -p or the -s options.
|
SbmlSetNamesFromFile | Set names on network objects from a tabulated file containing the object ids and the names. The ids must correspond between the tabulated file and the SBML file. If the prefix or suffix is different in the SBML file, use the -p or the -s options.
|
SbmlSetPathwaysFromFile | Set pathways on reactions in a network from a tabulated file containing the reaction ids and the pathways. The ids must correspond between the tabulated file and the SBML file. If the prefix R_ is present in the ids in the SBML file and not in the tabulated file, use the -p option. Pathways will be written in the SBML file in two ways: as a reaction note (e.g. SUBSYSTEM: purine_biosynthesis) and as an SBML group: ...
|
SbmlSetRefsFromFile | Add references to network objects from a tabulated file containing the object ids and the reference identifiers. The reference name given as parameter (-ref) must correspond to an existing id in the identifiers.org registry (https://registry.identifiers.org/registry). The corresponding key:value pair will be written as a metabolite or reaction annotation.
|
SbmlToMetaboliteTable | Create a tabulated file with metabolite attributes from an SBML file.
|
Package fr.inrae.toulouse.metexplore.met4j_toolbox.bigg | |
---|---|
GetModelProteome | Get the proteome, in FASTA format, of a model present in BiGG.
|
Package fr.inrae.toulouse.metexplore.met4j_toolbox.convert | |
---|---|
FbcToNotes | Convert FBC package annotations to SBML notes.
|
Kegg2Sbml | Build an SBML file from KEGG organism-specific pathways, using the KEGG API. Errors returned by this program may be due to KEGG API malfunctions or limitations; try again later if this occurs.
|
SBMLwizard | General SBML model processing, including compound removal (such as side compounds or isolated compounds), reaction removal (e.g. blocked or exchange reactions), and compartment merging.
|
Sbml2Graph | Create a graph representation of the content of an SBML file, and export it in a graph file format. The graph can be either a compound graph or a bipartite graph, and can be exported in GML or tabulated file format.
|
Sbml2Tab | Create a tabulated file from an SBML file.
|
Tab2Sbml | Create an SBML file from a tabulated file that contains the reaction ids and the formulas.
|
Package fr.inrae.toulouse.metexplore.met4j_toolbox.mapping | |
---|---|
NameMatcher | Metabolic models and metabolomics data often refer to compounds only by their common names, which vary greatly according to the source, thus impeding interoperability between models, databases and experimental data. This requires a tedious step of manual mapping. Fuzzy matching is a range of methods that can help speed up this process by allowing the search for near-similar names. Fuzzy matching is primarily designed for common-language search engines and is frequently based on edit distance, i.e. the number of edits needed to transform one character string into another, effectively managing typos, case and special character variations, and allowing auto-completion. However, edit-distance based search falls short when mapping chemical names: for example, alpha-D-Glucose and Glucose are separated by more edits than Fructose and Glucose. This tool runs edit-distance based fuzzy matching to perform near-similar name matching between a metabolic model and a list of chemical names in a dataset. Chemical names are first harmonized by substituting common patterns among synonyms, in order to create aliases on which classical fuzzy matching can be run efficiently.
|
ORApathwayEnrichment | Perform Over-Representation Analysis (ORA) for pathway enrichment, using a one-tailed Fisher exact test. The Fisher exact test computes the probability p of randomly obtaining a given set of values. This version computes the probability of obtaining at least the observed overlap between the given set and the given modality, by summing the hypergeometric probabilities over increasing target/query intersection cardinality. The hypergeometric probability is computed from the entries of a 2x2 contingency table, where each cell holds the number of elements in the corresponding intersection: a = Target and Query, b = Target and !Query, c = !Target and Query, d = !Target and !Query. The probability of obtaining a given table is: p = ((a+b)!(c+d)!(a+c)!(b+d)!)/(a!b!c!d!(a+b+c+d)!). The obtained p-value is then adjusted for multiple testing using one of the following methods: Bonferroni: adjusted p-value = p*n; Benjamini-Hochberg: adjusted p-value = p*n/k; Holm-Bonferroni: adjusted p-value = p*(n+1-k); where n is the number of tests and k is the rank of the p-value.
|
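
The test described for ORApathwayEnrichment can be sketched in a few lines of Python. This is an illustration of the formulas given above, not the toolbox's code: the function names are hypothetical, the cap of adjusted p-values at 1 is an added assumption, and no monotonicity step is applied to the adjusted values.

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of one 2x2 table with fixed margins."""
    # Equal to ((a+b)!(c+d)!(a+c)!(b+d)!) / (a! b! c! d! (a+b+c+d)!)
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)


def fisher_right_tail(a, b, c, d):
    """P(overlap >= a): sum table probabilities over increasing overlap."""
    p = 0.0
    for i in range(min(b, c) + 1):
        # Moving i elements into the overlap keeps all margins unchanged.
        p += table_probability(a + i, b - i, c - i, d + i)
    return p


def adjust(pvalues, method="bonferroni"):
    """Adjust p-values with the formulas above (n tests, k = p-value rank)."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    for k, i in enumerate(order, start=1):
        if method == "bonferroni":
            value = pvalues[i] * n
        elif method == "benjamini-hochberg":
            value = pvalues[i] * n / k
        elif method == "holm":
            value = pvalues[i] * (n + 1 - k)
        else:
            raise ValueError(f"unknown method: {method}")
        adjusted[i] = min(1.0, value)  # assumption: cap at 1
    return adjusted


if __name__ == "__main__":
    print(fisher_right_tail(3, 1, 1, 3))
    print(adjust([0.01, 0.04, 0.03], "holm"))
```

Here a, b, c and d follow the contingency table of the description, with a being the size of the target/query overlap.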
Package fr.inrae.toulouse.metexplore.met4j_toolbox.networkAnalysis | |
---|---|
BipartiteDistanceMatrix | Create a compound-to-reaction distance matrix. The distance between two nodes (metabolite or reaction) is computed as the length of the shortest path connecting them in the bipartite graph. A bipartite graph is composed of two distinct sets of nodes, and two nodes can be linked only if they belong to distinct sets; here, a metabolite node is linked to a reaction node if the metabolite is a substrate or product of the reaction. An optional custom edge weighting can be used, turning the distances into the sum of edge weights in the lightest path rather than the length of the shortest path. Custom weights can be provided in a file; in that case, edges without a weight are ignored during path search. If no edge weighting is set, it is recommended to provide a list of side compounds to ignore during network traversal.
|
CarbonSkeletonNet | Create a carbon skeleton graph representation of the content of an SBML file, using a GSAM atom-mapping file (see https://forgemia.inra.fr/metexplore/gsam). Metabolic networks used for quantitative analysis often contain links that are irrelevant for graph-based structural analysis, for example because of side compounds or modelling artifacts such as 'biomass' nodes. Focusing on links between compounds that share parts of their carbon skeleton avoids many transitions involving side compounds, and removes entities without a defined chemical structure. This app produces a Carbon Skeleton Network relevant for graph-based analysis of metabolism, in GML or matrix format, from an SBML file and a GSAM atom-mapping file. GSAM (see https://forgemia.inra.fr/metexplore/gsam) performs atom mapping at genome scale using the Reaction Decoder Tool (https://github.com/asad/ReactionDecoder) and computes the number of conserved atoms of a given type between reactants. This app also enables Markov-chain based analysis of metabolic networks by computing reaction-normalized transition probabilities on the Carbon Skeleton Network.
|
ChemSimilarityWeighting | Provides a tabulated compound graph edge list, with one column containing the chemical similarity of each reactant pair. Chemical similarity has been proposed as an edge weight for finding meaningful paths in metabolic networks, using shortest (lightest) path search. See McShan et al. 2003 (https://doi.org/10.1093/bioinformatics/btg217), Rahman et al. 2005 (https://doi.org/10.1093/bioinformatics/bti116) and Pertusi et al. 2014 (https://doi.org/10.1093/bioinformatics/btu760).
|
ChokePoint | Compute the choke points of a metabolic network. Choke points are reactions that are required to consume or produce one compound. Targeting a choke point can lead to the accumulation or the loss of some metabolites; choke points therefore constitute an indicator of lethality and can help identify drug targets. See: Syed Asad Rahman, Dietmar Schomburg; Observing local and global properties of metabolic pathways: ‘load points’ and ‘choke points’ in the metabolic networks. Bioinformatics 2006; 22 (14): 1767-1774. doi: 10.1093/bioinformatics/btl181
|
CompoundNet | Advanced creation of a compound graph representation of the content of an SBML file. Metabolic networks used for quantitative analysis often contain links that are irrelevant for graph-based structural analysis, for example because of side compounds or modelling artifacts such as 'biomass' nodes. While the Carbon Skeleton Graph offers a relevant alternative topology for graph-based analysis, it requires information on compound structures, which is usually not provided in models and is difficult to retrieve for models with sparse cross-reference annotations. Unlike the Sbml2Graph app, which performs a raw conversion of the SBML content, this app offers a fine-tuned creation of the compound graph, using a predefined list of side compounds and degree² weighting to obtain a relevant structure without structural data. This app also enables Markov-chain based analysis of metabolic networks by computing reaction-normalized transition probabilities on the network.
|
DegreeWeighting | Provides a tabulated compound graph edge list, with one column containing the degree of the edge's target. Degree has been proposed as an edge weight for finding meaningful paths in metabolic networks, using shortest (lightest) path search. See Croes et al. 2006 (https://doi.org/10.1016/j.jmb.2005.09.079) and Croes et al. 2005 (https://doi.org/10.1093/nar/gki437).
|
DistanceMatrix | Create a compound-to-compound distance matrix. The distance between two compounds is computed as the length of the shortest path connecting them in the compound graph, where two compounds are linked if they are respectively substrate and product of the same reaction. An optional edge weighting can be used, turning the distances into the sum of edge weights in the lightest path rather than the length of the shortest path. The default weighting uses the squared degree of the edge's target. Alternatively, custom weights can be provided in a file; in that case, edges without a weight are ignored during path search. If no edge weighting is set, it is recommended to provide a list of side compounds to ignore during network traversal.
|
ExtractSubBipNetwork | Create a subnetwork from a GSMN in SBML format and two files containing lists of ids of compounds and/or reactions of interest, one per row, plus one file in the same format containing side compound ids. The subnetwork corresponds to the part of the network that connects reactions and compounds from the first list to reactions and compounds from the second list. The source and target lists can have elements in common. The connecting part can be defined as the union of shortest or k-shortest paths between sources and targets, or as the Steiner tree connecting them. Unlike the compound graph, the bipartite graph often lacks a weighting policy for edge relevance. To ensure appropriate network density, a list of side compounds and blocked reactions to ignore during path building must be provided. An optional edge weight file, if available, can also be used.
|
ExtractSubNetwork | Create a subnetwork from a GSMN in SBML format and two files containing lists of ids of compounds of interest, one per row. The subnetwork corresponds to the part of the network that connects compounds from the first list to compounds from the second list. The source and target lists can have elements in common. The connecting part can be defined as the union of shortest or k-shortest paths between sources and targets, or as the Steiner tree connecting them. The relevance of the considered paths can be increased by weighting the edges using degree squared, chemical similarity (requires InChI or SMILES annotations) or any provided weighting. See previous works on subnetwork extraction for parameter recommendations: Frainay, C., & Jourdan, F. Computational methods to identify metabolic sub-networks based on metabolomic profiles. Briefings in Bioinformatics 2016;1–14. https://doi.org/10.1093/bib/bbv115 Faust, K., Croes, D., & van Helden, J. Prediction of metabolic pathways from genome-scale metabolic networks. Bio Systems 2011;105(2), 109–121. https://doi.org/10.1016/j.biosystems.2011.05.004 Croes D, Couche F, Wodak SJ, et al. Metabolic PathFinding: inferring relevant pathways in biochemical networks. Nucleic Acids Res 2005;33:W326–30. Croes D, Couche F, Wodak SJ, et al. Inferring meaningful pathways in weighted metabolic networks. J Mol Biol 2006; 356:222–36. Rahman SA, Advani P, Schunk R, et al. Metabolic pathway analysis web service (Pathway Hunter Tool at CUBIC). Bioinformatics 2005;21:1189–93. Pertusi DA, Stine AE, Broadbelt LJ, et al. Efficient searching and annotation of metabolic networks using chemical similarity. Bioinformatics 2014;1–9. McShan DC, Rao S, Shah I. PathMiner: predicting metabolic pathways by heuristic search. Bioinformatics 2003;19:1692–8.
|
ExtractSubReactionNetwork | Create a subnetwork from a GSMN in SBML format and two files containing lists of ids of reactions of interest, one per row, plus one file in the same format containing side compound ids. The subnetwork corresponds to the part of the network that connects reactions from the first list to reactions from the second list. The source and target lists can have elements in common. The connecting part can be defined as the union of shortest or k-shortest paths between sources and targets, or as the Steiner tree connecting them. Unlike the compound graph, the reaction graph often lacks a weighting policy for edge relevance. To ensure appropriate network density, a list of side compounds to ignore when linking reactions must be provided. An optional edge weight file, if available, can also be used.
|
LoadPoint | Compute the load points of a metabolic network. Load points constitute an indicator of lethality and can help identify drug targets. From Rahman et al., Observing local and global properties of metabolic pathways: ‘load points’ and ‘choke points’ in the metabolic networks, Bioinformatics (2006): for a given metabolic network, the load L on metabolite m can be defined as $L_m = \ln\left[\dfrac{p_m / k_m}{\sum_{i=1}^{M} P_i / \sum_{i=1}^{M} K_i}\right]$, where p is the number of shortest paths passing through a metabolite m; k is the number of nearest-neighbour links for m in the network; P is the total number of shortest paths; K is the sum of links in the metabolic network of M metabolites (where M is the number of metabolites in the network). Use of the logarithm makes the relevant values more distinguishable.
|
MetaboRank | Compute the MetaboRank, a custom personalized PageRank for metabolic networks. The MetaboRank takes a metabolic network and a list of compounds of interest, and provides a relevance score for all the other compounds in the network. Starting from metabolomics results, the MetaboRank can be used to fuel a recommender system that highlights interesting compounds to investigate, helps retrieve missing identifications and drives literature mining. It is a two-dimensional centrality computed from personalized PageRank and CheiRank, with special transition probabilities and normalization to handle the specificities of metabolic networks. For convenience, a one-dimensional centrality rank is also computed, taking the highest rank from PageRank or CheiRank and using the lowest rank as a tie-breaker. See the publication for more information: Frainay et al. MetaboRank: network-based recommendation system to interpret and enrich metabolomics results, Bioinformatics (35-2), https://doi.org/10.1093/bioinformatics/bty577
|
NetworkSummary | Create a report summarizing several graph measures characterising the structure of the network. Takes a metabolic network in an SBML file and an optional list of side compounds. The measures include (non-exhaustive list): size and order, connectivity, density, degree distribution, shortest path lengths, top centrality nodes...
|
PathwayNet | Create a Pathway Network representation of the content of an SBML file. Genome-scale metabolic networks are often partitioned into metabolic pathways. Pathways are frequently considered independently, despite frequent coupling in their activity due to shared metabolites. To decipher the interconnections linking overlapping pathways, this app creates a "Pathway Network", where two pathways are linked if they share compounds.
|
PrecursorNetwork | Perform a network expansion from a set of compound targets to create a precursor network. The precursor network of a set of compounds (targets) refers to the sub-part of a metabolic network from which a target can be reached. The network expansion process consists of adding a reaction to the network if any of its products is either a target or a substrate of a previously added reaction.
|
ReactionDistanceMatrix | Create a reaction-to-reaction distance matrix. The distance between two reactions is computed as the length of the shortest path connecting them in the reaction graph, where two reactions are linked if one produces a metabolite consumed by the other. An optional edge weighting can be used, turning the distances into the sum of edge weights in the lightest path rather than the length of the shortest path. The default weighting uses the squared degree of the edge's target. Alternatively, custom weights can be provided in a file; in that case, edges without a weight are ignored during path search. If no edge weighting is set, it is recommended to provide a list of side compounds to ignore during network traversal.
|
ScopeNetwork | Perform a network expansion from a set of compound seeds to create a scope network. The scope of a set of compounds (seeds) refers to the maximal metabolic network that can be extended from them, where the extension process consists of adding a reaction to the network if and only if all of its substrates are either a seed or a product of a previously added reaction. For more information, see Handorf, Ebenhöh and Heinrich (2005). *Expanding metabolic networks: scopes of compounds, robustness, and evolution.* Journal of molecular evolution, 61(4), 498-512. (https://doi.org/10.1007/s00239-005-0027-1)
|
SeedsAndTargets | Identify exogenously acquired compounds, producible compounds exogenously available and/or dead-end metabolites from metabolic network topology. Metabolic seeds and targets are useful for identifying medium requirements and metabolic capability, and thus enable analysis of metabolic ties within communities of organisms. This application can use the seed definition and SCC-based detection algorithm by Borenstein et al. or, alternatively, degree-based sink and source detection with compartment adjustment. The first method (see Borenstein et al. 2008, Large-scale reconstruction and phylogenetic analysis of metabolic environments, https://doi.org/10.1073/pnas.0806162105) considers strongly connected components (SCC) rather than individual nodes; thus, members of cycles can be considered as seeds. A sink from an external compartment can however be connected to a non-sink internal counterpart, thus highlighting what could end up in the external compartment rather than what must be exported. The second approach is neighborhood-based and identifies sources and sinks. Since "real" sinks and sources in intracellular compartment(s) may be involved in transport/exchange reactions that are reversible by default, thus not allowing extracellular sources or sinks, an option allows taking the degree (minus extracellular neighbors) of intracellular counterparts.
|
SideCompoundsScan | Scan a network to identify side compounds. Side compounds are metabolites of small relevance for topological analysis. Their definition can be quite subjective and varies between sources. Side compounds tend to be ubiquitous and not specific to a particular biochemical or physiological process. Compounds usually considered as side compounds include water, ATP or carbon dioxide. Being involved in many reactions, and thus connected to many compounds, they tend to significantly lower the average shortest path distances beyond expected metabolic relatedness. This tool attempts to propose a list of side compounds according to specific criteria: - *Degree*: Compounds with an uncommonly high number of neighbors can betray a lack of process specificity. High-degree compounds typically include water and most main cofactors (CoA, ATP, NADPH...) but can also include central compounds such as pyruvate or acetyl-CoA. - *Neighbor Coupling*: Similar to degree, this criterion assumes that side compounds are involved in many reactions, but in pairs with other side compounds. For instance, the transition from ATP to ADP appears multiple times in the network, creating redundant 'parallel edges' between these two neighbors. Being tightly coupled to another compound through a high number of redundant edges can point out cofactors, while leaving aside high-degree products of converging pathways such as pyruvate. - *Carbon Count*: Metabolic "waste", or degradation end-products such as ammonia or carbon dioxide, are usually considered as side compounds. Most of them are inorganic compounds, another ill-defined concept, sometimes defined as compounds lacking C-C or C-H bonds. Since chemical structure is rarely available in SBML models beyond the chemical formula, we use a less restrictive criterion and flag compounds with one or no carbons. This covers most inorganic compounds, but includes a few compounds, such as methane, that are usually considered organic. - *Chemical Formula*: Metabolic networks often contain 'artifacts' that serve modelling purposes (to define a composite objective function, for example). Such entities can be considered as 'side entities'. Since they are not actual chemical compounds, they can be detected by their lack of a valid chemical formula. However, this can also flag main compounds with erroneous or missing annotations.
|
TopologicalPathwayAnalysis | Run a Topological Pathway Analysis (TPA) to identify key pathways based on topological properties of their mapped compounds. From a list of compounds of interest, the app computes their betweenness centrality, which quantifies how often a compound acts as an intermediary along the shortest paths between pairs of other compounds in the network; a high value suggests a critical role in the overall flow within the network. Each pathway is scored according to the summed centrality of its metabolites found in the dataset. As an alternative to betweenness, the out-degree (the number of outgoing links, i.e. the number of direct metabolic products) can be used as the criterion of importance. TPA is complementary to statistical enrichment analysis and supports a more meaningful interpretation of the data by taking into account the influence of identified compounds on the structure of the pathways.
|
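
The two network expansion rules described for ScopeNetwork and PrecursorNetwork can be sketched as fixed-point iterations in Python. This is an illustration under simplifying assumptions, not the toolbox's code: the function names and the dictionary-based reaction encoding are hypothetical, and every reaction is treated as irreversible.

```python
def scope_network(reactions, seeds):
    """Forward expansion: add a reaction once ALL its substrates are available."""
    available = set(seeds)
    added = []
    changed = True
    while changed:
        changed = False
        for name, (substrates, products) in reactions.items():
            if name not in added and set(substrates) <= available:
                added.append(name)
                available |= set(products)  # products become available
                changed = True
    return added, available


def precursor_network(reactions, targets):
    """Backward expansion: add a reaction once ANY of its products is needed."""
    needed = set(targets)
    added = []
    changed = True
    while changed:
        changed = False
        for name, (substrates, products) in reactions.items():
            if name not in added and set(products) & needed:
                added.append(name)
                needed |= set(substrates)  # substrates become needed in turn
                changed = True
    return added, needed


if __name__ == "__main__":
    # Toy network: reaction id -> (substrates, products)
    toy = {
        "r1": (["A"], ["B"]),
        "r2": (["B", "C"], ["D"]),
        "r3": (["B"], ["C"]),
        "r4": (["X"], ["A"]),
    }
    print(scope_network(toy, {"A"}))
    print(precursor_network(toy, {"D"}))
```

In the toy network, r4 never enters the scope of A because its substrate X is never produced, but it does enter the precursor network of D because it produces A.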
Package fr.inrae.toulouse.metexplore.met4j_toolbox.reconstruction | |
---|---|
SbmlCheckBalance | Check the balance of all the reactions in an SBML file. A reaction is balanced if all its reactants have a syntactically valid chemical formula and if the quantity of each atom is the same on both sides of the reaction. For each reaction, the app reports whether the reaction is balanced, the list of atoms with the sum of their quantities, and the list of metabolites that lack a correct chemical formula.
|
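
The balance criterion described for SbmlCheckBalance can be illustrated with a short Python sketch. It is not the toolbox's code: the function names are hypothetical, and the formula syntax is deliberately restricted to element symbols followed by optional counts (no parentheses, charges or generic groups).

```python
import re
from collections import Counter

# Assumption: a valid formula is a sequence of element symbols with optional counts.
_FORMULA = re.compile(r"(?:[A-Z][a-z]?\d*)+")
_ELEMENT = re.compile(r"([A-Z][a-z]?)(\d*)")


def atom_counts(formula):
    """Return a Counter of atoms, or None if the formula syntax is invalid."""
    if not formula or not _FORMULA.fullmatch(formula):
        return None
    counts = Counter()
    for element, number in _ELEMENT.findall(formula):
        counts[element] += int(number) if number else 1
    return counts


def check_balance(substrates, products):
    """Each side is a list of (formula, stoichiometric coefficient) pairs."""
    invalid = []
    totals = []
    for side in (substrates, products):
        total = Counter()
        for formula, coeff in side:
            counts = atom_counts(formula)
            if counts is None:
                invalid.append(formula)
                continue
            for element, n in counts.items():
                total[element] += n * coeff
        totals.append(total)
    # Balanced only if every formula is valid and atom totals match on both sides.
    balanced = not invalid and totals[0] == totals[1]
    return balanced, totals[0], totals[1], invalid


if __name__ == "__main__":
    # Glucose oxidation: C6H12O6 + 6 O2 -> 6 CO2 + 6 H2O
    print(check_balance([("C6H12O6", 1), ("O2", 6)], [("CO2", 6), ("H2O", 6)]))
```

The returned tuple mirrors the three pieces of information listed in the description: the balance status, the per-atom totals for each side, and the formulas that could not be parsed.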