parallelized-stemmer
This project reports how using threads improves the running time of a stemming algorithm.
This repository uses Python 3.6.4.
Other dependencies used by this project are listed in /requirements.txt (please run `pip freeze` after adding new dependencies).
- Based on Sastrawi: `pip install PySastrawi`
- Uses the multiprocessing package: https://docs.python.org/3.4/library/multiprocessing.html?highlight=process
- Always use virtualenv so this project won't interfere with your machine.
- On macOS/Linux, run `source /bin/activate`
- On Windows, run `\Scripts\activate`
- To exit the virtualenv, close the terminal or run `deactivate`.
- Run `pip install -r requirements.txt` (first time only).
- Run `python startup.py`.
- (Optional) Update requirements.txt using `pip freeze > requirements.txt`.
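
The dependency list above names the multiprocessing package. As a rough illustration of the idea being measured, here is a minimal sketch that splits a word list into chunks and stems them with a pool of 3 workers. It is not the code in startup.py. `toy_stem`, `split_chunks`, `stem_chunk` and `stem_parallel` are hypothetical names, and `toy_stem` is a placeholder so the example runs without PySastrawi. The sketch uses `multiprocessing.pool.ThreadPool`; `multiprocessing.Pool` offers the same `map` call if worker processes are preferred.

```python
from multiprocessing.pool import ThreadPool


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips a few suffixes."""
    for suffix in ("nya", "kan", "lah"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def split_chunks(words, n_chunks):
    """Split words into n_chunks contiguous slices of near-equal size."""
    size, extra = divmod(len(words), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks each take one additional word.
        end = start + size + (1 if i < extra else 0)
        chunks.append(words[start:end])
        start = end
    return chunks


def stem_chunk(chunk, stem=toy_stem):
    """Stem every word in one chunk, preserving order."""
    return [stem(word) for word in chunk]


def stem_parallel(words, workers=3):
    """Stem words using a pool of workers, one chunk per worker."""
    chunks = split_chunks(words, workers)
    with ThreadPool(workers) as pool:
        # map() returns results in the same order as the input chunks.
        results = pool.map(stem_chunk, chunks)
    return [word for chunk in results for word in chunk]


if __name__ == "__main__":
    print(stem_parallel(["bukunya", "makan", "jalankan", "pergilah", "api"]))
```

To try it with the real stemmer, pass a function that wraps the PySastrawi stemmer's `stem` method in place of `toy_stem`.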
All tests processed 87440 words; elapsed times are measured in seconds.
# | serial_stemmer | multi-thread (3) |
---|---|---|
1 | 172.53443098068237 | 138.93437695503235 |
2 | 181.88903880119324 | 133.10081505775452 |
3 | 181.69096302986145 | 114.8126060962677 |
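
To make the table easier to read, the speedup of each run can be computed as serial time divided by multi-thread time. The sketch below only does that arithmetic on the numbers above; the names `serial`, `parallel`, `speedups` and `mean` are chosen here for illustration.

```python
# Elapsed times in seconds, copied from the table above.
serial = [172.53443098068237, 181.88903880119324, 181.69096302986145]
parallel = [138.93437695503235, 133.10081505775452, 114.8126060962677]


def speedups(serial_times, parallel_times):
    """Per-run speedup: serial time divided by parallel time."""
    return [s / p for s, p in zip(serial_times, parallel_times)]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


per_run = speedups(serial, parallel)
# Overall speedup: ratio of the mean serial time to the mean parallel time.
overall = mean(serial) / mean(parallel)

for run, ratio in enumerate(per_run, start=1):
    print(f"run {run}: {ratio:.2f}x")
print(f"overall: {overall:.2f}x")
```

On these three runs the speedup is roughly 1.24x, 1.37x and 1.58x, or about 1.39x overall.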