This repository contains code for identifying repositories that use the GAP programming language, using a combination of machine learning, NLP, and deep learning techniques. It also supports analysis of the collected data.
- Retrieve raw file links for repositories from GitHub
- Preprocess the data so that the techniques below can be applied to it
- Develop an ML/DL approach to determine whether a file is written in the GAP programming language
- Experiment with different models using ML, NLP and Deep Learning techniques
- Compare the performance of the models
- Analyze the filtered repositories
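
The retrieval and classification steps above can be illustrated with a minimal sketch. It is not the code from this repository: `to_raw_url`, `tokenize`, `NaiveBayes`, and the tiny `TRAINING_SAMPLES` set are hypothetical names and data chosen for illustration, and a hand-written multinomial naive Bayes stands in for whichever models the notebooks actually compare. The example owner and repository in the URL are placeholders.

```python
import math
import re
from collections import Counter

def to_raw_url(blob_url):
    """Turn a GitHub 'blob' page URL into the matching raw-content URL."""
    prefix = "https://github.com/"
    if not blob_url.startswith(prefix):
        raise ValueError("not a github.com URL")
    # owner / repo / "blob" / "<branch>/<path inside the repo>"
    owner, repo, kind, rest = blob_url[len(prefix):].split("/", 3)
    if kind != "blob":
        raise ValueError("expected a /blob/ URL")
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{rest}"

# Assumed tokenizer: the ':=' operator, identifiers, and a few punctuation marks.
TOKEN_RE = re.compile(r":=|[A-Za-z_]+|[;{}()]")

def tokenize(source):
    return TOKEN_RE.findall(source)

class NaiveBayes:
    """Multinomial naive Bayes over token counts with add-one smoothing."""

    def __init__(self):
        self.token_counts = {}       # label -> Counter of tokens
        self.doc_counts = Counter()  # label -> number of training files
        self.vocab = set()

    def fit(self, samples):
        for text, label in samples:
            tokens = tokenize(text)
            self.token_counts.setdefault(label, Counter()).update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.token_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Toy training data, invented for this sketch only.
TRAINING_SAMPLES = [
    ("f := function(x) local y; y := x^2; return y; end;", "gap"),
    ("for i in [1..10] do Print(i); od;", "gap"),
    ("def f(x):\n    return x ** 2", "other"),
    ("int main() { return 0; }", "other"),
]

model = NaiveBayes().fit(TRAINING_SAMPLES)
raw_url = to_raw_url(
    "https://github.com/example-owner/example-repo/blob/main/lib/demo.gi"
)
```

With real data, the training samples would be the preprocessed file contents fetched from the raw links, and `model.predict(file_text)` would be one baseline among the models being compared.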
We keep the scripts separate from the notebooks, and each has its own requirements.txt.
pip install -r requirements.txt
cd scripts/
python *script_name.py*
cd notebooks/
pip install -r requirements.txt
jupyter notebook
- Check the issues section to find what to work on
- If new ideas come up, add them to the issues section as enhancements
- If any bugs are found, raise an issue
- When working on something, create a new branch with an appropriate name and open a PR once finished