An implementation for SemEval-2016 Task 1.
Given two sentences, participating systems are asked to return a continuous valued similarity score on a scale from 0 to 5, with 0 indicating that the semantics of the sentences are completely independent and 5 signifying semantic equivalence.
```
cd {project_folder/}
python ensemble.py
```
Task participants are allowed to use all of the datasets released in prior years (2012-2015) as training data.
There are five sources of testing data: Headline, Plagiarism, Postediting, Question-Question, and Answer-Answer.
We use two NLP features to capture useful information.
We compute a similarity score from the character n-grams extracted from the two sentences.
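A minimal sketch of this feature, assuming set-based (Jaccard) overlap of character trigrams; the exact n and overlap formula are not specified in this README, so both are assumptions:

```python
def char_ngrams(text, n=3):
    """Extract the set of character n-grams from a sentence (n=3 is an assumption)."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def ngram_overlap(s1, s2, n=3):
    """Jaccard overlap between the character n-gram sets of two sentences."""
    a, b = char_ngrams(s1, n), char_ngrams(s2, n)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

Identical sentences score 1.0, and sentences sharing no n-grams score 0.0.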
Each sentence is represented as a bag of words (BOW), with each word weighted by its IDF value. The cosine similarity between the two sentence vectors is then used as a feature, giving one BOW feature.
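The BOW feature can be sketched as follows; the corpus used to estimate IDF and the exact IDF variant (plain `log(N/df)` here) are assumptions:

```python
import math
from collections import Counter

def idf_weights(corpus):
    """Estimate IDF from a list of tokenized sentences (the corpus is an assumption)."""
    n = len(corpus)
    df = Counter()
    for sent in corpus:
        df.update(set(sent))
    return {w: math.log(n / df[w]) for w in df}

def bow_cosine(s1, s2, idf):
    """Cosine similarity between IDF-weighted bag-of-words vectors."""
    v1, v2 = Counter(s1), Counter(s2)
    dot = sum(v1[w] * v2[w] * idf.get(w, 0.0) ** 2 for w in v1.keys() & v2.keys())
    norm1 = math.sqrt(sum((c * idf.get(w, 0.0)) ** 2 for w, c in v1.items()))
    norm2 = math.sqrt(sum((c * idf.get(w, 0.0)) ** 2 for w, c in v2.items()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0
```

Words that appear in every document receive an IDF of zero and therefore contribute nothing to the similarity.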
There are two identical LSTM networks (a Siamese architecture). Each LSTM is fed the word-vector representations of one sentence and outputs a hidden state encoding its semantic meaning; the similarity between the two hidden states is measured with the Manhattan distance.
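The similarity head of this Siamese model can be sketched as the MaLSTM scoring function `exp(-||h1 - h2||_1)` from Mueller and Thyagarajan (2016), which maps the Manhattan distance between the two final hidden states into (0, 1]; the hidden states themselves would come from the shared LSTM encoder, which is omitted here:

```python
import math

def manhattan_similarity(h1, h2):
    """MaLSTM similarity: exp of the negative L1 distance between hidden states."""
    l1 = sum(abs(a - b) for a, b in zip(h1, h2))
    return math.exp(-l1)
```

Identical hidden states give a similarity of exactly 1.0, and the score decays toward 0 as the states diverge.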
We add a CNN model to strengthen the ensemble.
We train Random Forests (RF), Gradient Boosting (GB), and XGBoost (XGB) on the traditional features, alongside the LSTM model. We average the scores from the four models to achieve better performance.
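The combination step can be sketched as an unweighted average of the per-model predictions; clipping to the task's [0, 5] scale is an assumption, since the README does not state how out-of-range averages are handled:

```python
def ensemble_score(scores):
    """Average the predictions of the four models (RF, GB, XGB, LSTM),
    clipped to the task's 0-5 similarity scale (clipping is an assumption)."""
    avg = sum(scores) / len(scores)
    return min(5.0, max(0.0, avg))
```

For example, model predictions of 4.0, 4.0, 3.0, and 5.0 yield an ensemble score of 4.0.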
Features / Models | Headline | Plagiarism | Postediting | Ans - Ans | Ques - Ques | All |
---|---|---|---|---|---|---|
Ngram Overlap | 0.7519 | 0.3726 | 0.4819 | 0.7942 | 0.5949 | 0.6327 |
BOW similarity | 0.7228 | 0.3408 | 0.3666 | 0.7335 | 0.5669 | 0.5635 |
Overlap + BOW | 0.7409 | 0.3628 | 0.4339 | 0.7928 | 0.5855 | 0.6112 |
LSTM | 0.6112 | 0.7058 | 0.6172 | 0.4786 | 0.4308 | 0.5805 |
CNN | 0.6281 | 0.4503 | 0.6094 | 0.4429 | 0.5099 | 0.5092 |
Ensemble | 0.7244 | 0.7823 | 0.8119 | 0.5560 | 0.4626 | 0.6755 |
More features and further model tuning will be added later.
- J. Tian, Z. Zhou, M. Lan, and Y. Wu. ECNU at SemEval-2017 Task 1: Leverage kernel-based traditional NLP features and neural networks to build a universal model for multilingual and cross-lingual semantic textual similarity. In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), pages 191-197, 2017.
- J. Mueller and A. Thyagarajan. Siamese recurrent architectures for learning sentence similarity. In Proceedings of the 30th AAAI Conference on Artificial Intelligence, 2016.