This is a rough attempt at text extraction from PDF documents.

requirements:
  python2.7
  pdfminer: https://github.com/euske/pdfminer
    tested with version 20140328
  shapely: http://toblerity.org/shapely/project.html
    tested with 1.3 (from Debian)