A search tool for Python developers to find the right software to match their needs

alvinzhou66/ToolFinder

ToolFinder

Software Metadata Classification Project

Team members: Xihao Zhou, Ruohan Gao, Gan Xin, Hao Yang, Yifan Li, Dongsheng Yang

Documentation

https://alvinzhou66.github.io/ToolFinder/

Citation and Dataset

All datasets used for this project are in the /dataset folder.

Installation (If you want to retrain our model)

To run the scripts in the project, you can set up your environment in one of three ways: create a virtual environment from requirements.txt, install the same packages directly on your local machine, or build a Docker container from our Dockerfile.

  1. Virtual environment (requires Python 3.7.x).
    First, create your virtual environment and activate it:
python3 -m venv your_venv_name
. ./your_venv_name/bin/activate

Then use pip to install the packages

pip install -r requirements.txt
  2. Local (make sure you have Python 3.7.x).
    Download the ZIP or clone the repository:
git clone git@github.com:alvinzhou66/ToolFinder.git

Move into the repository and install the packages using the requirements.txt file:

pip install -r requirements.txt
  3. Docker.
    Install Docker first.

In the directory that contains our Dockerfile, build the Docker image:

docker build -t coss .

Then run it:

docker run -p 5006:5006 -it coss
  4. Possible error while using Docker.

If you see the error "failed to solve with frontend dockerfile.v0" (it happened on 2 machines in our team), check your Docker server version and make sure it is up to date, or purge your current Docker server and try again.

The "requirements.txt" file does exist in that directory, so the error should be caused by the Docker server.
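
The virtual-environment and local options above both call for Python 3.7.x. As a minimal sketch (the helper name `is_supported_python` is hypothetical and not part of this repository), the interpreter version can be checked before installing the requirements:

```python
import sys

# Major and minor version stated in the installation steps (Python 3.7.x).
REQUIRED = (3, 7)


def is_supported_python(version_info=None):
    """Return True if the given version tuple is a Python 3.7.x release.

    Falls back to the running interpreter when no version is passed.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == REQUIRED


# Warn (without aborting) when the current interpreter is not 3.7.x.
if not is_supported_python():
    found = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Warning: Python 3.7.x is expected, found {found}")
```

Running this with the interpreter inside your virtual environment shows whether the environment was created with the expected Python version.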


For the binary classifiers, run the 4 .ipynb notebooks in the "/binary_classifier" folder.

For the functional classifier, move to "/functional_classifier" and run des_fuc.ipynb first, then func_class.ipynb.
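
If you prefer to run the two functional-classifier notebooks from a script instead of opening them by hand, the order above can be sketched as follows. This is only an illustration: it assumes `jupyter` and `nbconvert` are available in your environment (not confirmed by this README), and the helper names `build_commands` and `run_in_order` are hypothetical. Note that `--inplace` overwrites each notebook with its executed version.

```python
import subprocess

# Order taken from the instructions: des_fuc.ipynb first, then func_class.ipynb.
FUNCTIONAL_NOTEBOOKS = ["des_fuc.ipynb", "func_class.ipynb"]


def build_commands(notebooks):
    """Build one `jupyter nbconvert` command per notebook, preserving order."""
    return [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", nb]
        for nb in notebooks
    ]


def run_in_order(notebooks, cwd="functional_classifier"):
    """Execute the notebooks one after another and stop at the first failure."""
    for cmd in build_commands(notebooks):
        # check=True raises CalledProcessError if a notebook fails to execute.
        subprocess.run(cmd, cwd=cwd, check=True)
```

Calling `run_in_order(FUNCTIONAL_NOTEBOOKS)` from the repository root would execute both notebooks in sequence; nothing runs until that call is made.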

Usage (interactive visualization)

  1. Functional classifier. An interactive Bokeh visualization that accepts a URL as input (any URL with a description; it does not need to be a .md file) and returns the function prediction result. It also visualizes our training results and compares them with SOMEF. After installing the virtual environment or Docker container as described above, activate the virtual environment to run the visualization:
. ./your_venv_name/bin/activate

Go to your local copy of the repository, cd into the visualization directory, and start the Bokeh server application:

cd visualization
bokeh serve --show interactive_ui.py

Then open localhost:5006 in your browser to see the visualization.

To use the functional classifier, enter the URL in the input box and click the predict button. The result is shown in the pie chart, which contains the probabilities of the input project belonging to each type of scientific software. The result may take several seconds to appear because the website has to be crawled and the model has to run inference.
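
To make the pie chart easier to read, it may help to see how a set of class probabilities maps to wedges. The following is only an illustrative sketch of that mapping, not the code used in interactive_ui.py; the function name `wedge_angles`, the category names, and the probabilities are all made up for the example.

```python
import math


def wedge_angles(probabilities):
    """Map {label: probability} to {label: (start, end)} angles in radians.

    Values are normalized by their sum, so the wedges always cover a full circle.
    """
    total = sum(probabilities.values())
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    angles = {}
    start = 0.0
    for label, p in probabilities.items():
        # Each wedge spans a share of the circle proportional to its probability.
        end = start + 2 * math.pi * p / total
        angles[label] = (start, end)
        start = end
    return angles


# Hypothetical prediction for three made-up software categories.
example = {"category_a": 0.5, "category_b": 0.25, "category_c": 0.25}
for label, (start, end) in wedge_angles(example).items():
    print(f"{label}: {math.degrees(start):.0f} to {math.degrees(end):.0f} degrees")
```

A larger wedge therefore means a higher predicted probability that the project belongs to that category.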

  2. Binary classifiers. Binder

To use the binary classifiers, you can either use the Binder badge above or run the notebook locally. For a local run, first:

cd binary_classifier

Then open "SOMEF_BIN_classifier.ipynb" in Jupyter Notebook and run it to see the result.