My Final Year Project for Integrated Computer Science. Written in Python, it uses the Stanford POS tagger and SentiWordNet 3.0.
- NLTK 3.0
- Beautiful Soup 4.3.2
- PPR CSV files, which must be placed in the folder "../PPR" relative to the repository root
- SentiWordNet 3.0 text file (although the code could be modified to use the NLTK API, which would be preferable; see the sketch after this list)
- Stanford POS tagger 3.4.1, which also requires the "english-bidirectional-distsim.tagger" model
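
For reference, the NLTK corpus API mentioned above exposes the same scores as the raw text file; a minimal sketch, assuming the `sentiwordnet` and `wordnet` corpora have been fetched once via `nltk.download`:

```python
# Minimal sketch of reading SentiWordNet through NLTK instead of the raw
# text file. Assumes nltk.download('sentiwordnet') and
# nltk.download('wordnet') have been run once.
from nltk.corpus import sentiwordnet as swn

# Look up a specific synset: "good" as an adjective, first sense.
good = swn.senti_synset('good.a.01')
print(good.pos_score(), good.neg_score(), good.obj_score())

# Or iterate over every sense of a word.
for sense in swn.senti_synsets('good'):
    print(sense)
```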
The following are guidelines for running the different scripts in the project. Make sure to install all of the dependencies listed above before continuing.
- Install Python 2.6 or later (NLTK 3.0 does not support earlier versions)
- Download Beautiful Soup 4 (instructions found here: http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-beautiful-soup). Make sure to get BeautifulSoup4-4.3.2!
- Install MySQL
- Set up the SQL tables "Threads" and "Posts" as described in the backup.sql files in the common folder of this repository
- Change the settings in common/datamanager.py to work with your SQL database
- Change the constant URL values in postscraper.py to the Property Pin pages that you want to scrape (see the sketch after this list)
- Run the command "python postscraper.py"
- Note that logs are printed to "logs/scraper.log"
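
For orientation only, here is a stripped-down sketch of the fetch-parse-store loop the steps above imply. The URL, CSS class, table column, and credentials are all placeholders rather than the values postscraper.py and common/datamanager.py actually use:

```python
# Orientation-only sketch of the scrape loop; everything named here is a
# placeholder -- the real values live in postscraper.py and
# common/datamanager.py.
import urllib2  # urllib.request on Python 3

import MySQLdb
from bs4 import BeautifulSoup

THREAD_URL = 'http://example.com/some-property-pin-thread'  # placeholder

html = urllib2.urlopen(THREAD_URL).read()
soup = BeautifulSoup(html)

db = MySQLdb.connect(host='localhost', user='user', passwd='pass', db='fyp')
cursor = db.cursor()

# The post container class is a placeholder -- inspect the real markup.
for post in soup.find_all('div', class_='post'):
    cursor.execute('INSERT INTO Posts (content) VALUES (%s)',
                   (post.get_text(),))
db.commit()
db.close()
```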
- Download all of the Property Price Register CSV files and place them in "../PPR" relative to this repository's root folder
- Run locationGen.py after scraping at least one thread
- Output is stored in adressIndex.json and addressLookupTable.json
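
As a rough illustration of what locationGen.py's pass over the PPR data involves, the sketch below indexes addresses from the CSVs. The column names follow the published Property Price Register format, but the JSON structure shown is illustrative only, not the project's actual layout:

```python
# Illustrative sketch of indexing the PPR CSVs; the real logic is in
# locationGen.py, and the JSON structure below is an assumption.
import csv
import glob
import json

index = {}
for path in glob.glob('../PPR/*.csv'):
    with open(path, 'rb') as f:  # open(path, newline='') on Python 3
        for row in csv.DictReader(f):
            # "Address" and "County" follow the published PPR headers;
            # verify them against your downloaded files.
            address = row['Address'].strip().lower()
            index.setdefault(address, []).append(row['County'])

with open('addressLookupTable.json', 'w') as out:
    json.dump(index, out)
```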
- Run locationMatcher.py after completing the previous two tasks
- Output is stored in addressMatches.json
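
The matching strategy itself lives in locationMatcher.py; as a hypothetical illustration of the general idea, a fuzzy lookup of post text against the address table could look like this (`match_addresses` and the four-word window are inventions for the example):

```python
# Hypothetical illustration of matching post text against the PPR address
# index -- locationMatcher.py implements the project's actual strategy,
# which may differ.
import difflib
import json

with open('addressLookupTable.json') as f:
    addresses = list(json.load(f))

def match_addresses(post_text, cutoff=0.8):
    """Return PPR addresses that closely resemble phrases in a post."""
    words = post_text.lower().split()
    hits = set()
    for i in range(len(words)):
        phrase = ' '.join(words[i:i + 4])  # sliding 4-word window
        hits.update(difflib.get_close_matches(phrase, addresses,
                                              n=3, cutoff=cutoff))
    return hits
```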
- Download StanfordPosTagger.jar
- Run sentimentAnalysis.py after completing the previous three tasks (make sure at least one entry exists in addressMatches.json, otherwise nothing will happen)
- Output is stored in sentimentAnalysis.csv
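
Schematically, this step tags tokens with the Stanford tagger and scores them against SentiWordNet. A hedged sketch with placeholder paths follows; note that NLTK 3.0 exposes the wrapper as nltk.tag.stanford.POSTagger, while later NLTK releases rename it StanfordPOSTagger:

```python
# Hedged sketch of the tag-then-score pipeline; sentimentAnalysis.py is
# the authoritative version. Model and jar paths are placeholders.
from nltk.corpus import sentiwordnet as swn
from nltk.tag.stanford import POSTagger  # StanfordPOSTagger in later NLTK

tagger = POSTagger('english-bidirectional-distsim.tagger',  # model path
                   'stanford-postagger.jar')                # jar path

# Map Penn Treebank tag prefixes to WordNet parts of speech.
PENN_TO_WN = {'JJ': 'a', 'RB': 'r', 'NN': 'n', 'VB': 'v'}

def sentence_score(tokens):
    """Sum SentiWordNet pos-neg scores over the first sense of each token."""
    score = 0.0
    for word, tag in tagger.tag(tokens):
        wn_pos = PENN_TO_WN.get(tag[:2])
        if wn_pos is None:
            continue
        senses = list(swn.senti_synsets(word.lower(), wn_pos))
        if senses:
            score += senses[0].pos_score() - senses[0].neg_score()
    return score

print(sentence_score('lovely house in a great area'.split()))
```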
- Run aggregatePriceSent.py
- Output is stored in aggregatedData.csv
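
aggregatePriceSent.py defines the real join; purely as a sketch, averaging sentiment per matched address before pairing it with price data might look like this (the two-column CSV layout is an assumption):

```python
# Placeholder sketch of averaging sentiment per address;
# aggregatePriceSent.py holds the real join with the PPR prices.
import csv
from collections import defaultdict

scores = defaultdict(list)
with open('sentimentAnalysis.csv', 'rb') as f:  # drop the 'b' on Python 3
    for address, score in csv.reader(f):        # assumed two-column layout
        scores[address].append(float(score))

with open('aggregatedData.csv', 'wb') as out:
    writer = csv.writer(out)
    for address, values in scores.items():
        writer.writerow([address, sum(values) / len(values)])
```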
- Run aggregateData.py without any arguments
- Output is stored in aggregatedData.csv
- Run aggregateData.py with the argument "bigrams"
- Output is stored in aggregatedData.csv
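
The "bigrams" switch suggests a simple command-line toggle; a hypothetical sketch of how aggregateData.py might branch on it (the token stream here is a stand-in):

```python
# Hypothetical sketch of the "bigrams" toggle; check aggregateData.py
# for the actual argument handling.
import sys

def bigrams(tokens):
    """[a, b, c] -> [(a, b), (b, c)]"""
    return zip(tokens, tokens[1:])

words = 'great location but damp walls'.split()  # stand-in token stream
use_bigrams = len(sys.argv) > 1 and sys.argv[1] == 'bigrams'
units = list(bigrams(words)) if use_bigrams else words
print(units)
```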