A project that aims to improve rapid earthquake response by analyzing casualty information collected from social media.
The following libraries are required:
- numpy
- pandas
- word2number
- nltk
- language_tool_python
- folium
- mordecai
- spacy
- wordsegment
- keras
- Clone the repository.
- Run the following command:
python runner.py "input_path" "country_code"
- The file 'casualtyMap.html' will be generated; it is an interactive map of the casualty information. 'processed.csv' contains the more detailed data behind the map.
Using the 'example.csv' input file, the program should output the map below:
Tweets collected during earthquakes are cleaned by removing emoji, URLs, and mentions, and then applying word segmentation and spelling correction.
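The removal part of this cleaning step can be sketched with regular expressions alone. This is a minimal illustration, not the project's actual code: the function name `clean_tweet` and the patterns are assumptions, the emoji ranges are approximate rather than exhaustive, and the word segmentation and spelling correction steps are left out.

```python
import re

# Hypothetical patterns for illustration; the project's own rules may differ.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
# Rough emoji matcher covering the main emoji code point blocks (not exhaustive).
EMOJI_RE = re.compile(
    "["
    "\U0001F300-\U0001FAFF"  # pictographs, emoticons, transport, extended symbols
    "\U00002600-\U000027BF"  # miscellaneous symbols and dingbats
    "\U0001F1E6-\U0001F1FF"  # regional indicator letters (flags)
    "\uFE0F\u200D"           # variation selector and zero-width joiner
    "]+"
)


def clean_tweet(text):
    """Strip URLs, @mentions, and emoji, then collapse leftover whitespace."""
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = EMOJI_RE.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(clean_tweet("@user 5 dead in Kathmandu \U0001F622 https://t.co/abc"))
```

URLs are removed before mentions so that an `@` inside a link is not treated as a mention.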
Deep learning models are trained to identify earthquake and casualty information. The default model is an LSTM (best performance in testing); the other available models are RNN, GRU, Bi-Directional LSTM, and Bi-Directional LSTM with Attention (see the models folder). All models are trained on the HumanAID dataset.
A custom named entity recognizer trained with StanfordNERTagger is used to identify the numbers of deaths, injuries, and missing people. The extracted number strings are then parsed into integers.
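The number-parsing step can be illustrated with a small standard-library stand-in. The project lists word2number for this purpose; the sketch below is not that library and not the project's code. The helper name `parse_count` is hypothetical, and it only handles plain digits, comma-grouped digits, and simple English number words.

```python
import re

UNITS = {
    "zero": 0, "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
    "eleven": 11, "twelve": 12, "thirteen": 13, "fourteen": 14,
    "fifteen": 15, "sixteen": 16, "seventeen": 17, "eighteen": 18,
    "nineteen": 19,
}
TENS = {
    "twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
    "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90,
}
SCALES = {"thousand": 1_000, "million": 1_000_000}


def parse_count(text):
    """Convert a number string to an int, or return None if it cannot be parsed."""
    s = text.strip().lower()
    # Digit forms such as "35" or "1,200".
    if re.fullmatch(r"\d+(,\d{3})*", s):
        return int(s.replace(",", ""))
    total, current, seen = 0, 0, False
    for word in re.split(r"[\s-]+", s):
        if word == "and":
            continue
        if word in UNITS:
            current += UNITS[word]
        elif word in TENS:
            current += TENS[word]
        elif word == "hundred":
            current = max(current, 1) * 100
        elif word in SCALES:
            # Close the current group, e.g. "three thousand" -> 3000.
            total += max(current, 1) * SCALES[word]
            current = 0
        else:
            return None  # unknown token: refuse to guess
        seen = True
    return total + current if seen else None
```

Returning `None` for unparseable input, rather than raising or guessing, lets the caller drop or flag tweets whose casualty figure is not a clean number.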
Mordecai, a Python library for full-text geoparsing, is used to extract locations. Only tweets that contain a location in the specified country are kept.
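The country filter can be sketched as follows. This is an assumption-laden illustration, not the project's code: it assumes each tweet has already been geoparsed into a list of place dictionaries carrying a three-letter country code under `geo` → `country_code3`, and that the `country_code` argument uses the same three-letter form. The function names are hypothetical.

```python
def mentions_country(places, country_code):
    """True if any geoparsed place resolves to the given country code."""
    for place in places:
        geo = place.get("geo") or {}  # a place may have no resolved location
        if geo.get("country_code3") == country_code:
            return True
    return False


def filter_by_country(geoparsed_tweets, country_code):
    """Keep only (text, places) pairs that mention a place in the country."""
    return [
        (text, places)
        for text, places in geoparsed_tweets
        if mentions_country(places, country_code)
    ]


if __name__ == "__main__":
    # Made-up sample records in the assumed shape.
    sample = [
        ("5 dead in Kathmandu", [{"word": "Kathmandu", "geo": {"country_code3": "NPL"}}]),
        ("Quake felt in Delhi", [{"word": "Delhi", "geo": {"country_code3": "IND"}}]),
        ("Stay safe everyone", []),
    ]
    print(filter_by_country(sample, "NPL"))
```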
Folium is used to generate an interactive map in which each location is labeled with its name and the numbers of deaths, injuries, and missing people.
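A minimal sketch of the mapping step is shown below. The record fields (`name`, `lat`, `lon`, `deaths`, `injuries`, `missing`) and the function names are assumptions made for illustration, and the sample coordinates are approximate; only `folium.Map`, `folium.Marker`, and `Map.save` are taken from folium itself.

```python
def popup_text(record):
    """Build the marker label from a casualty record (hypothetical field names)."""
    return "{name}: {deaths} deaths, {injuries} injuries, {missing} missing".format(**record)


def build_map(records, out_path="casualtyMap.html"):
    """Write an interactive map with one labeled marker per location."""
    import folium  # imported here so popup_text can be used without folium installed

    # Center the map on the mean coordinate of all records.
    center = [
        sum(r["lat"] for r in records) / len(records),
        sum(r["lon"] for r in records) / len(records),
    ]
    fmap = folium.Map(location=center, zoom_start=6)
    for r in records:
        folium.Marker(location=[r["lat"], r["lon"]], popup=popup_text(r)).add_to(fmap)
    fmap.save(out_path)
    return out_path


# Example call (requires folium; coordinates are approximate sample values):
# build_map([{"name": "Kathmandu", "lat": 27.7, "lon": 85.3,
#             "deaths": 5, "injuries": 12, "missing": 3}])
```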
- PAGER System
- NLP
- Text Cleaning
- Named Entity Recognizer
- Geoparsing
The geoparser used in this project is Mordecai.
For further inquiries, please contact [email protected]