# Clone the benchmark repository and install the package in editable mode
# so local changes are picked up without reinstalling.
git clone https://github.com/WadoodAbdul/clinical_ner_benchmark.git
cd clinical_ner_benchmark
pip install -e .
from clinical_ner.models import SpanExtractor
from clinical_ner.evaluation import Evaluator
from clinical_ner.benchmarks import NCER

# Hugging Face model id of a predefined NER model.
model_name = "alvaroalon2/biobert_diseases_ner"
benchmark = NCER

# The config below is model- and dataset-specific. It should contain an entry
# for every dataset in the loaded benchmark; the label_normalization_map maps
# model-emitted labels onto the benchmark's canonical entity types.
dataset_wise_config = {
    "NCBI": {"label_normalization_map": {"DISEASE": "condition"}}
}

# Load a predefined model (for a custom implementation see
# https://github.com/WadoodAbdul/clinical_ner_benchmark/blob/master/docs/custom_model_implementation.md)
model = SpanExtractor.from_predefined(model_name)

evaluator = Evaluator(model, benchmark=benchmark, dataset_wise_config=dataset_wise_config)
# Fix: the original called `evaluation.run()`, but no `evaluation` name exists —
# the Evaluator instance created above is bound to `evaluator`.
evaluator.run()
Advanced Usage (click to unfold)
Models should inherit from the `GenericSpanExtractor` or `SpanExtractor` abstract classes.
from clinical_ner.models import GenericSpanExtractor
from clinical_ner.models.span_dataclasses import NERSpans
class MyCustomModel(GenericSpanExtractor):
    """Example custom model implementing the GenericSpanExtractor interface.

    Subclasses must implement `extract_spans_from_chunk` to produce NER spans
    for a piece of text.
    """

    # Fix: the original signature omitted `self`, so calling the method on an
    # instance would raise a TypeError.
    def extract_spans_from_chunk(self, text: str, **kwargs) -> NERSpans:
        """
        Extracts spans from sequences of any length.

        Args:
            text: The text from which spans should be extracted.
            **kwargs: Additional arguments to pass to the encoder.

        Returns:
            The NERSpans.
        """
        pass
# Fix: the original instantiated `MyModel()`, but the class defined above is
# named `MyCustomModel` — `MyModel` is undefined.
model = MyCustomModel()
benchmark = NCER

# The config below is model- and dataset-specific; one entry per dataset in
# the loaded benchmark.
dataset_wise_config = {
    "dataset_name": {"label_normalization_map": {"DISEASE": "condition"}}
}

evaluator = Evaluator(model, benchmark=benchmark, dataset_wise_config=dataset_wise_config)
# Fix: the original called `evaluation.run()`; the instance is bound to `evaluator`.
evaluator.run()
More information on custom model implementation can be found here.
| Documentation | |
|---|---|
| 📋 Datasets | Overview of available Datasets |
| 📋 Metrics | Overview of available Metrics |
| 📈 Leaderboard | The interactive leaderboard of the benchmark |
| 🤖 Submit to leaderboard | Information related to how to submit a model to the leaderboard |
| 👩‍🔬 Reproducing results | Information related to how to reproduce the results on the leaderboard |
| 👩‍💻 Custom model implementation | How to add a custom model to run the evaluation pipeline |
@misc{abdul2024namedclinicalentityrecognition,
title={Named Clinical Entity Recognition Benchmark},
author={Wadood M Abdul and Marco AF Pimentel and Muhammad Umar Salman and Tathagata Raha and Clément Christophe and Praveen K Kanithi and Nasir Hayat and Ronnie Rajan and Shadab Khan},
year={2024},
eprint={2410.05046},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2410.05046},
}