Question Search Engine

Project Setup

Local setup

The project requires git and Python >= 3.6 with pip and virtualenv (virtualenvwrapper is optional).

Install Python 3.6
Install pip
Install virtualenv (virtualenvwrapper optionally)
Install the system libraries that the Python libraries depend on

Clone the repository:

git clone https://github.com/ivanazeljkovic/question_search_engine.git
cd question_search_engine/

Create a virtual environment with:

virtualenv -p python3.6 venv

Or, if you are using virtualenvwrapper instead of virtualenv:

mkvirtualenv -p python3.6 venv

Requirements

With the virtual environment activated, install the requirements:

pip install -r requirements.txt

Project Running

Dataset

Inside the root directory, create a directory named data with a nested directory named raw. Store the question corpus file as data/raw/questions.json (relative to the root directory). The corpus file should contain one JSON object per line, structured as in the example below:

{"id": 1, "question": "what is TF-IDF?", "tags": "<nlp>"}
{"id": 2, "question": "should I ignore poetry.lock?", "tags": "<python>"}
{"id": 3, "question": "How to use pytest?", "tags": "<python><pytest>"}
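
As an illustration of the format above, here is a minimal sketch of how such a file could be parsed. It assumes one JSON object per line and tags wrapped in angle brackets, as in the example. The names parse_corpus and parse_tags are hypothetical and are not taken from the project code.

```python
import json
import re


def parse_tags(raw_tags):
    """Split a tag string such as "<python><pytest>" into ["python", "pytest"]."""
    return re.findall(r"<([^<>]+)>", raw_tags)


def parse_corpus(lines):
    """Parse an iterable of JSON lines into a list of dicts, skipping blank lines."""
    questions = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Replace the raw tag string with a list of tag names.
        record["tags"] = parse_tags(record["tags"])
        questions.append(record)
    return questions


sample = [
    '{"id": 1, "question": "what is TF-IDF?", "tags": "<nlp>"}',
    '{"id": 3, "question": "How to use pytest?", "tags": "<python><pytest>"}',
]
questions = parse_corpus(sample)
print(questions[1]["tags"])  # ['python', 'pytest']
```

An open file object is also an iterable of lines, so the same function can be given the result of open("data/raw/questions.json", encoding="utf-8").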

Running

From the root directory, run:

python run.py

Wait until the corpus has been loaded and the TF-IDF vectorizer has been fitted to it. When the interactive prompt opens, enter a question of interest:

>>> Error handling in Java?

The output should have the same structure as the example below, with each line holding a score, the question id, and the question text:

0.8318 43953635 How do I use Error handling in Java
0.7683 38835571 Error Handling in Swift 3
0.6029 47684377 Java BufferedReader error
0.5649 38936305 If block error handling in bash
0.5519 52513360 java ATM program simulation with exception handling - no error neither full output.
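
To show how a ranking like the one above can come about, here is a minimal standard-library sketch of TF-IDF retrieval with cosine similarity. It is not the project's implementation: the tokenizer, the smoothed IDF formula, and all function names are assumptions chosen for illustration, so the scores it prints will differ from those above.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fit_idf(tokenized_docs):
    """Compute a smoothed inverse document frequency for every corpus term."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter(term for doc in tokenized_docs for term in set(doc))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1 for term, df in doc_freq.items()}


def tf_idf_vector(tokens, idf):
    """Build an L2-normalised sparse TF-IDF vector; unknown terms are ignored."""
    counts = Counter(token for token in tokens if token in idf)
    vector = {term: count * idf[term] for term, count in counts.items()}
    norm = math.sqrt(sum(weight * weight for weight in vector.values()))
    return {term: weight / norm for term, weight in vector.items()} if norm else {}


def search(query, questions, top_n=5):
    """Return (score, id, question) tuples sorted by descending cosine similarity."""
    tokenized = [tokenize(q["question"]) for q in questions]
    idf = fit_idf(tokenized)
    doc_vectors = [tf_idf_vector(tokens, idf) for tokens in tokenized]
    query_vector = tf_idf_vector(tokenize(query), idf)
    scored = []
    for question, doc_vector in zip(questions, doc_vectors):
        # Both vectors are unit length, so the dot product is the cosine similarity.
        score = sum(w * doc_vector.get(term, 0.0) for term, w in query_vector.items())
        scored.append((score, question["id"], question["question"]))
    scored.sort(key=lambda row: row[0], reverse=True)
    return scored[:top_n]


questions = [
    {"id": 1, "question": "How do I use error handling in Java"},
    {"id": 2, "question": "Error handling in Swift 3"},
    {"id": 3, "question": "How to use pytest?"},
]
for score, question_id, text in search("Error handling in Java?", questions):
    print(f"{score:.4f} {question_id} {text}")
```

In this sketch the IDF table and document vectors are rebuilt on every call to keep the code short. A real engine would fit them once at startup, which matches the waiting step described before the prompt opens.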

Testing

1. Particular group of Unit tests

From the root directory, run one of the following commands to execute a particular group of unit tests:

python -m tests.test_preprocessor
python -m tests.test_question_search_engine
python -m tests.test_tf_idf_vectorizer
python -m tests.test_utils
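
The python -m invocation above suggests that each test module can be executed directly. With the standard unittest library, that usually means the module ends with a unittest.main() guard. The sketch below shows that layout. It is an assumption about how the modules are organised, and normalize_whitespace is a hypothetical stand-in for a function under test, not a function from this repository.

```python
import unittest


def normalize_whitespace(text):
    """Hypothetical helper standing in for a function under test."""
    return " ".join(text.split())


class NormalizeWhitespaceTest(unittest.TestCase):
    def test_collapses_repeated_spaces(self):
        self.assertEqual(normalize_whitespace("how  to   use pytest"), "how to use pytest")

    def test_strips_leading_and_trailing_spaces(self):
        self.assertEqual(normalize_whitespace("  what is TF-IDF?  "), "what is TF-IDF?")


if __name__ == "__main__":
    # Runs the tests in this module when it is executed with `python -m`.
    unittest.main()
```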

2. All Unit tests

From the root directory, run the following commands to make the shell script executable and run it:

chmod +x tests/run_all_tests.sh
tests/run_all_tests.sh
