ivanazeljkovic/question_search_engine

Question Search Engine

Project Setup

Local setup

The project requires git and Python >= 3.6 with pip and virtualenv (optionally virtualenvwrapper).

  1. Install Python 3.6
  2. Install pip
  3. Install virtualenv (virtualenvwrapper optionally)
  4. Install any system libraries that the Python libraries depend on

Clone the repository:

git clone https://github.com/ivanazeljkovic/question_search_engine.git
cd question_search_engine/

Create a virtual environment with:

virtualenv -p python3.6 venv

Or, if you are using virtualenvwrapper instead of virtualenv:

mkvirtualenv -p python3.6 venv

Requirements

With the virtual environment activated, install the requirements:

pip install -r requirements.txt

Project Running

Dataset

Inside the root directory, create a directory named data with a nested directory named raw. Store the question corpus in data/raw as a file named questions.json. The file holds one JSON object per line, with the same structure as the example below:

{"id": 1, "question": "what is TF-IDF?", "tags": "<nlp>"}
{"id": 2, "question": "should I ignore poetry.lock?", "tags": "<python>"}
{"id": 3, "question": "How to use pytest?", "tags": "<python><pytest>"}
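
The corpus format above can be read line by line. The following is a minimal sketch, not the project's own loader: the names CORPUS_PATH, parse_tags and load_questions are hypothetical, and it assumes that every line is a standalone JSON object and that tags are wrapped in angle brackets as in the example.

```python
import json
import re
from pathlib import Path

# Hypothetical constant: mirrors the data/raw/questions.json layout described above.
CORPUS_PATH = Path("data") / "raw" / "questions.json"


def parse_tags(raw_tags):
    """Split a tag string such as "<python><pytest>" into ["python", "pytest"]."""
    return re.findall(r"<([^<>]+)>", raw_tags)


def load_questions(path=CORPUS_PATH):
    """Read one JSON object per line, skipping blank lines."""
    questions = []
    with open(path, encoding="utf-8") as corpus_file:
        for line in corpus_file:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            # Replace the raw tag string with a list of tag names.
            record["tags"] = parse_tags(record.get("tags", ""))
            questions.append(record)
    return questions
```

Calling load_questions() from the root directory would return a list of dictionaries with the keys id, question and tags.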

Running

From the root directory, run:

python run.py

Wait until the corpus has been loaded and the TF-IDF vectorizer has been fitted on it. When the interactive prompt opens, enter a question of interest:

>>> Error handling in Java?

The output should have the same structure as the example below, where each line lists a similarity score, a question id and the question text:

0.8318 43953635 How do I use Error handling in Java
0.7683 38835571 Error Handling in Swift 3
0.6029 47684377 Java BufferedReader error
0.5649 38936305 If block error handling in bash
0.5519 52513360 java ATM program simulation with exception handling - no error neither full output.
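
To illustrate how a ranking like the one above can arise, here is a minimal sketch of TF-IDF weighting combined with cosine similarity, using only the standard library. It is not the project's implementation: the function names, the tokenizer, the smoothed IDF formula and the choice of cosine similarity are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphanumeric runs as tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fit_idf(documents):
    """Compute a smoothed inverse document frequency for every term."""
    document_count = len(documents)
    document_frequency = Counter()
    for document in documents:
        document_frequency.update(set(tokenize(document)))
    return {
        term: math.log((1 + document_count) / (1 + frequency)) + 1
        for term, frequency in document_frequency.items()
    }


def vectorize(text, idf):
    """Map text to a sparse {term: tf * idf} vector, ignoring unseen terms."""
    counts = Counter(token for token in tokenize(text) if token in idf)
    return {term: count * idf[term] for term, count in counts.items()}


def cosine(left, right):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * right.get(term, 0.0) for term, weight in left.items())
    left_norm = math.sqrt(sum(weight * weight for weight in left.values()))
    right_norm = math.sqrt(sum(weight * weight for weight in right.values()))
    if left_norm == 0 or right_norm == 0:
        return 0.0
    return dot / (left_norm * right_norm)


def search(query, questions, top_k=5):
    """Rank (id, text) pairs by similarity to the query, best match first."""
    idf = fit_idf([text for _, text in questions])
    query_vector = vectorize(query, idf)
    scored = [
        (cosine(query_vector, vectorize(text, idf)), question_id, text)
        for question_id, text in questions
    ]
    scored.sort(key=lambda row: row[0], reverse=True)
    return scored[:top_k]
```

Each returned tuple can be printed as f"{score:.4f} {question_id} {text}" to match the layout of the example output. The sketch refits the IDF table on every call for brevity; a real engine would fit it once after loading the corpus.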

Testing

1. A particular group of unit tests

From the root directory, run the command for the group of unit tests you want:

python -m tests.test_preprocessor
python -m tests.test_question_search_engine
python -m tests.test_tf_idf_vectorizer
python -m tests.test_utils

2. All unit tests

From the root directory, run the following commands to execute the shell script:

chmod +x tests/run_all_tests.sh
tests/run_all_tests.sh
