Further build out data analysis steps #37

xconnieex · 2021-08-10T03:46:07Z

Currently doing tf-idf.

I have earlier code in the Text-analysis folder on GitHub, as well as some code based on Anju's Colab code that does some text summarization and text modeling but needs refinement. One dependency is how we read and clean up the initial text from the PDF.

swotai · 2021-08-20T02:43:51Z

If we try to clarify what we are aiming to do (Anju's term: what's the "ask"):

Given a PDF file of a memorandum, addendum, or decision, we want to summarize it into the following fields (think of these as the additional data columns that we can add to the Legistar table for each agenda item #):

Filename (given, no need to extract from PDF)
Keywords (comma separated)
Other items:
- e.g. referred address, related organization/government departments, others?
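
To make the "Keywords (comma separated)" column concrete, here is a minimal sketch of how tf-idf could produce it. This is not the code in the Text-analysis folder or the Colab notebook; it assumes the text has already been extracted from each PDF and cleaned, and the function names (`tokenize`, `tfidf_keywords`), the `top_k` parameter, and the sample filenames are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens of 3+ letters."""
    return re.findall(r"[a-z]{3,}", text.lower())


def tfidf_keywords(docs, top_k=5):
    """Map each filename to a comma-separated string of its top tf-idf terms.

    docs: dict of filename -> already-extracted plain text (assumption:
    PDF reading and cleanup happen upstream of this function).
    """
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for toks in tokens.values():
        df.update(set(toks))

    keywords = {}
    for name, toks in tokens.items():
        tf = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty text
        # Plain (unsmoothed) tf-idf: terms found in every document score 0.
        scores = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        keywords[name] = ", ".join(ranked[:top_k])
    return keywords


if __name__ == "__main__":
    # Hypothetical sample input standing in for extracted PDF text.
    sample = {
        "a.pdf": "zoning zoning permit council",
        "b.pdf": "budget budget council",
    }
    for filename, kws in tfidf_keywords(sample, top_k=2).items():
        print(filename, "->", kws)
```

The resulting dictionary is keyed by filename, so each value could be joined onto the Legistar table as a new column. The "Other items" (addresses, organizations, departments) are a different kind of extraction and would likely need something other than tf-idf.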

xconnieex added the enhancement, help wanted, and data science labels · Aug 10, 2021
