Instructions for variant normalization with tmVar

r-tinn/tmvar

tmVar Variant Normalization Instructions

One of the key innovations described in the publication "tmVar: A text mining approach for extracting sequence variants in biomedical literature" is a method for normalizing extracted variant mentions to unique identifiers (dbSNP RSIDs). However, it is unclear how to use this feature, and running tmVar out of the box does not produce normalized output. To normalize extracted variants, first run GNormPlus on the input data, then feed its output into tmVar.

Getting Started

  • Download GNormPlus from the NCBI website and decompress the folder.
  • Download tmVar from the NCBI website and extract it alongside the GNormPlus folder, in the same parent directory.

Directory Structure

project
│
├─── gnormplus_input
├─── gnormplus_output
├─── tmvar_output
│
├─── tmVar
│    │   corpus
│    │   CRF
│    │   ...
│
└─── GNormPlus
     │   Corpus
     │   CRF
     │   ...
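
The layout above can be prepared with a short script. This is only a sketch: the helper name `prepare_layout` is hypothetical, and it assumes the three data folders may be created empty while the `tmVar` and `GNormPlus` folders come from the extracted downloads.

```python
from pathlib import Path

# Folder names taken from the directory structure above.
DATA_DIRS = ["gnormplus_input", "gnormplus_output", "tmvar_output"]
TOOL_DIRS = ["GNormPlus", "tmVar"]


def prepare_layout(project):
    """Create the data folders and return the tool folders that are missing.

    Hypothetical helper: it does not download or extract anything.
    """
    project = Path(project)
    for name in DATA_DIRS:
        (project / name).mkdir(parents=True, exist_ok=True)
    # The tool folders must already exist from the extracted archives.
    return [name for name in TOOL_DIRS if not (project / name).is_dir()]
```

An empty return value means both tool folders were found next to the data folders.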

Example Run

java -Xmx10G -Xms10G -jar GNormPlus.jar gnormplus_input gnormplus_output setup.txt
java -Xmx10G -Xms10G -jar tmVar.jar gnormplus_output tmvar_output
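
The two steps can also be chained from a script, so that GNormPlus always runs before tmVar. The following is a minimal sketch, not part of either tool. The helpers `build_commands` and `run_pipeline` are hypothetical, and the sketch assumes that each JAR sits in its own folder (`GNormPlus/GNormPlus.jar`, `tmVar/tmVar.jar`), that each tool is launched from that folder, and that `setup.txt` lives in the GNormPlus folder.

```python
import subprocess
from pathlib import Path

# Heap settings copied from the example run.
JAVA_OPTS = ["-Xmx10G", "-Xms10G"]


def build_commands(project):
    """Return (working_dir, argv) pairs: GNormPlus first, then tmVar."""
    project = Path(project).resolve()
    gnorm_in = project / "gnormplus_input"
    gnorm_out = project / "gnormplus_output"
    tmvar_out = project / "tmvar_output"
    return [
        # Step 1: GNormPlus reads the raw input and writes its results.
        (project / "GNormPlus",
         ["java", *JAVA_OPTS, "-jar", "GNormPlus.jar",
          str(gnorm_in), str(gnorm_out), "setup.txt"]),
        # Step 2: tmVar reads the GNormPlus output and writes the final results.
        (project / "tmVar",
         ["java", *JAVA_OPTS, "-jar", "tmVar.jar",
          str(gnorm_out), str(tmvar_out)]),
    ]


def run_pipeline(project):
    """Run both steps in order, stopping if either one fails."""
    for cwd, argv in build_commands(project):
        subprocess.run(argv, cwd=cwd, check=True)
```

Calling `run_pipeline("project")` would run both tools in sequence. Absolute paths are passed for the data folders so that the commands do not depend on the working directory chosen for each tool.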

Acknowledgments

  • Chih-Hsuan Wei for clarifying this process.
