Using medaCy: Tutorials and Workflows

This directory contains common workflows for using medaCy.

How medaCy Works

MedaCy leverages the text-processing power of spaCy with state-of-the-art research tools and techniques in medical text mining. MedaCy consists of a set of lightning-fast pipelines that are specialized for learning specific types of medical entities and relations. A pipeline consists of a stackable and interchangeable set of PipelineComponents - these are bite-sized code blocks that each overlay a feature onto the text being processed.

Pipeline Components

Custom PipelineComponents and Pipelines can be developed by implementing the BaseOverlayer and BasePipeline classes, respectively. Alternatively, use the components already implemented in medaCy. Some more powerful components require outside software; an example is the MetaMapOverlayer, which interfaces with MetaMap to overlay rich medical concept information onto text. Components are chained or stacked in pipelines and can depend on the outputs of previous components to function. In the underlying implementation, a medaCy PipelineComponent is a wrapper over a spaCy component that includes a number of utilities for facilitating the training, use, and distribution of medical domain text processing models.
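The stacking and dependency behaviour described above can be illustrated with a minimal plain-Python sketch. It does not use medaCy's or spaCy's actual classes: `Doc`, `Component`, `Pipeline`, `TokenOverlayer`, and `UnitOverlayer` are hypothetical names chosen for illustration. Only the idea comes from the text: each component overlays one feature, and a component may depend on features produced earlier in the chain.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical container: raw text plus features overlaid by components."""
    text: str
    features: dict = field(default_factory=dict)


class Component:
    """Hypothetical base class for illustration; NOT medaCy's BaseOverlayer."""
    name = "component"
    dependencies = ()  # names of components that must already be in the pipeline

    def __call__(self, doc):
        raise NotImplementedError


class TokenOverlayer(Component):
    """Overlays a naive whitespace tokenization."""
    name = "tokens"

    def __call__(self, doc):
        doc.features["tokens"] = doc.text.split()
        return doc


class UnitOverlayer(Component):
    """Overlays tokens that look like dosage forms or units; needs tokens first."""
    name = "units"
    dependencies = ("tokens",)
    UNITS = {"mg", "ml", "capsule", "tablet"}  # toy vocabulary, an assumption

    def __call__(self, doc):
        doc.features["units"] = [
            t for t in doc.features["tokens"] if t.lower() in self.UNITS
        ]
        return doc


class Pipeline:
    """Runs components in the order they were added."""

    def __init__(self):
        self.components = []

    def add(self, component):
        present = {c.name for c in self.components}
        missing = [d for d in component.dependencies if d not in present]
        if missing:
            raise ValueError(f"{component.name} requires {missing} to be added first")
        self.components.append(component)
        return self  # allow chained .add() calls

    def __call__(self, text):
        doc = Doc(text)
        for component in self.components:
            doc = component(doc)
        return doc


pipeline = Pipeline().add(TokenOverlayer()).add(UnitOverlayer())
doc = pipeline("Take 1 capsule of Advil")
print(doc.features["units"])  # ['capsule']
```

Checking dependencies when a component is added, not when the pipeline runs, surfaces a mis-ordered pipeline before any text is processed.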

Utilizing Pre-trained NER models

To run a medaCy pre-trained model over your own data, simply install the package associated with the model by following the links below. Models officially supported by medaCy all start with the prefix medacy_model. For example, assuming you have medaCy installed:

Run:

pip install git+https://github.com/NLPatVCU/medaCy_model_clinical_notes.git

then the code snippet

import medacy_model_clinical_notes
model = medacy_model_clinical_notes.load()
model.predict("The patient was prescribed 1 capsule of Advil for 5 days.")

will output:

[
    ('Drug', 40, 45, 'Advil'),
    ('Dosage', 27, 28, '1'), 
    ('Form', 29, 36, 'capsule'), 
    ('Duration', 46, 56, 'for 5 days')
]
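
Each tuple in this output appears to follow the layout (label, start, end, text), with character offsets into the input string and an exclusive end offset. That reading is inferred from the example itself, not taken from medaCy's documentation. Under that assumption, a short sketch can check the offsets against the sentence and group the spans by label. The helper name `group_by_label` is hypothetical.

```python
from collections import defaultdict

text = "The patient was prescribed 1 capsule of Advil for 5 days."

# The prediction list shown above, copied verbatim.
predictions = [
    ("Drug", 40, 45, "Advil"),
    ("Dosage", 27, 28, "1"),
    ("Form", 29, 36, "capsule"),
    ("Duration", 46, 56, "for 5 days"),
]


def group_by_label(entities):
    """Group entity texts by label, visiting entities in order of start offset."""
    grouped = defaultdict(list)
    for label, start, end, span in sorted(entities, key=lambda e: e[1]):
        grouped[label].append(span)
    return dict(grouped)


# Each (start, end) pair should slice the original text to the reported span.
for label, start, end, span in predictions:
    assert text[start:end] == span, (label, text[start:end], span)

print(group_by_label(predictions))
# {'Dosage': ['1'], 'Form': ['capsule'], 'Drug': ['Advil'], 'Duration': ['for 5 days']}
```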

NOTE: If you are doing bulk prediction over many files at once, it is advisable to utilize the bulk prediction functionality.

List of medaCy pre-trained models

Application	Dataset Trained Over	Entities
Clinical Notes	N2C2 2018	Drug, Form, Route, ADE, Reason, Frequency, Duration, Dosage, Strength
EPA Systematic Reviews	TAC SRIE 2018	Species, Celline, Dosage, Group, etc.
Nanomedicine Drug Labels	END	Nanoparticle, Company, Adverse Reaction, Active Ingredient, Surface Coating, etc.

Sharing your medaCy models

MedaCy models can be packaged and shared with anyone (or no one!) with ease. See this example for details.

How medaCy uses spaCy

SpaCy is an open-source Python package built with Cython that allows for lightning-fast text processing. MedaCy combines spaCy's memory-efficient text processing architecture with tools, ideas, and principles from both machine learning and medical computational linguistics to provide a unified framework for researchers and practitioners alike to advance medical text mining.
