tmVar Variant Normalization Instructions

One of the key innovations described in the publication "tmVar: A text mining approach for extracting sequence variants in biomedical literature" is a method for normalizing extracted variant mentions to unique identifiers (dbSNP RSIDs). However, it is unclear how to use this feature, and running the tmVar model out of the box does not produce this behaviour. To normalize extracted variants, GNormPlus must first be run on the input data, and its output must then be fed into tmVar.

Getting Started

Download GNormPlus from the NCBI website and decompress the folder.
Download tmVar from the NCBI website and extract it alongside GNormPlus, in the same parent directory.

Directory Structure

project
│
├───gnormplus_input
├───gnormplus_output
├───tmvar_output
│
├───tmVar
│   │   corpus
│   │   CRF
│   │   ...
│
└───GNormPlus
    │   Corpus
    │   CRF
    │   ...
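
The three data folders in the layout above are not part of either download, so they have to be created by hand. A minimal sketch of that step, assuming a hypothetical `PROJECT_DIR` variable that names the `project` folder (it defaults to `project` under the current directory):

```shell
#!/bin/sh
# Sketch: create the three data folders next to the extracted tools.
# Assumption: PROJECT_DIR is a hypothetical variable naming the `project` folder.
set -eu

PROJECT_DIR="${PROJECT_DIR:-project}"

mkdir -p "$PROJECT_DIR/gnormplus_input" \
         "$PROJECT_DIR/gnormplus_output" \
         "$PROJECT_DIR/tmvar_output"

# Warn (without failing) if either tool folder has not been extracted yet.
for tool in tmVar GNormPlus; do
    if [ ! -d "$PROJECT_DIR/$tool" ]; then
        echo "warning: $PROJECT_DIR/$tool not found" >&2
    fi
done
```

The input documents then go into `gnormplus_input`; the other two folders start empty and are filled by the two tools.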

Example Run

java -Xmx10G -Xms10G -jar GNormPlus.jar gnormplus_input gnormplus_output setup.txt
java -Xmx10G -Xms10G -jar tmVar.jar gnormplus_output tmvar_output
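
The two stages can also be chained in a small wrapper that follows the order given in the introduction (GNormPlus first, then tmVar on its output) and stops if the first stage fails. This is only a sketch under several assumptions. `PROJECT_DIR`, `DRY_RUN`, and `run_step` are hypothetical names introduced here. Each jar is assumed to be started from inside its own folder, and both tools are assumed to accept absolute folder paths. By default the script only prints the commands; set `DRY_RUN=0` to execute them.

```shell
#!/bin/sh
# Sketch: run GNormPlus first, then tmVar on the GNormPlus output.
# Assumptions: each jar is launched from inside its own folder, and both
# tools accept absolute paths for their input and output folders.
set -eu

PROJECT_DIR="${PROJECT_DIR:-$(pwd)}"  # hypothetical: path of the `project` folder
DRY_RUN="${DRY_RUN:-1}"               # hypothetical: 1 prints commands, 0 runs them

run_step() {
    # $1 = tool folder; remaining arguments = command to run inside it.
    dir="$1"; shift
    if [ "$DRY_RUN" = "1" ]; then
        echo "(cd $dir && $*)"
    else
        (cd "$dir" && "$@")
    fi
}

# Stage 1: GNormPlus on the raw input.
run_step "$PROJECT_DIR/GNormPlus" java -Xmx10G -Xms10G -jar GNormPlus.jar \
    "$PROJECT_DIR/gnormplus_input" "$PROJECT_DIR/gnormplus_output" setup.txt

# Stage 2: tmVar on the output of stage 1.
run_step "$PROJECT_DIR/tmVar" java -Xmx10G -Xms10G -jar tmVar.jar \
    "$PROJECT_DIR/gnormplus_output" "$PROJECT_DIR/tmvar_output"
```

Because of `set -e`, a non-zero exit from the GNormPlus stage aborts the script before tmVar is started, so tmVar never runs on a partial `gnormplus_output`.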

Acknowledgments

Thanks to Chih-Hsuan Wei for clarifying this process.
