SMTM aims to automate systematic mapping studies by applying topic modeling with Latent Dirichlet Allocation (LDA), making literature reviews more efficient and comprehensive.
Clone the repository.
Requires Python 3.8 or higher.
Dependencies are listed in the requirements.txt file:
pip install -r requirements.txt
The project is modular and structured as follows:
- data/: Data storage.
- src/: Source code, divided into submodules based on their role in the pipeline.
- docs/: Documentation.
The submodules are:
- data_extraction/: Crawls data from the unArXive source.
- preprocessing/: Sanitizes the input for topic modeling.
- topic_modeling/: Implements the Latent Dirichlet Allocation (LDA) model.
- selection/: Matching algorithm.
- mapping/: Maps the results from topic modeling and selection.
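As a rough illustration of how the preprocessing and topic modeling steps could fit together, here is a minimal standard-library sketch. It is not the project's actual code: the names `STOP_WORDS`, `preprocess`, `build_vocab`, and `lda_gibbs` are hypothetical, the stop-word list is a placeholder, and collapsed Gibbs sampling is only one common way to fit LDA, chosen here because it needs no external dependencies.

```python
import random
import re
from typing import Dict, List, Tuple

# Hypothetical stop-word list; a real pipeline would likely use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "for", "in", "on", "to", "is", "are", "with"}


def preprocess(text: str) -> List[str]:
    """Lowercase, keep alphabetic tokens, drop stop words and tokens of 1-2 letters."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS and len(t) > 2]


def build_vocab(docs: List[List[str]]) -> Dict[str, int]:
    """Map each distinct token to an integer id, assigned in sorted order."""
    return {w: i for i, w in enumerate(sorted({t for d in docs for t in d}))}


def lda_gibbs(
    docs: List[List[int]],
    num_topics: int,
    vocab_size: int,
    alpha: float = 0.1,
    beta: float = 0.01,
    iters: int = 100,
    seed: int = 0,
) -> Tuple[List[List[int]], List[List[int]]]:
    """Fit LDA by collapsed Gibbs sampling.

    Returns (doc-topic counts, topic-word counts).
    """
    rng = random.Random(seed)
    ndk = [[0] * num_topics for _ in docs]                 # tokens per (document, topic)
    nkw = [[0] * vocab_size for _ in range(num_topics)]    # tokens per (topic, word)
    nk = [0] * num_topics                                  # tokens per topic
    z: List[List[int]] = []                                # topic assignment of each token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        assignments = []
        for w in doc:
            k = rng.randrange(num_topics)
            assignments.append(k)
            ndk[d][k] += 1
            nkw[k][w] += 1
            nk[k] += 1
        z.append(assignments)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                ndk[d][k] -= 1
                nkw[k][w] -= 1
                nk[k] -= 1
                # Resample its topic from the (unnormalized) conditional distribution.
                weights = [
                    (ndk[d][t] + alpha) * (nkw[t][w] + beta) / (nk[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1
                nkw[k][w] += 1
                nk[k] += 1
    return ndk, nkw
```

A possible usage: run `preprocess` on each abstract, encode the tokens with `build_vocab`, then pass the encoded documents to `lda_gibbs`. Normalizing each row of the returned counts (after adding `alpha` or `beta`) gives estimates of the document-topic and topic-word distributions.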
- Tipu