Identify novusagenda.com subdomains in AHP Parser #9

Open
krammy19 opened this issue Feb 24, 2021 · 0 comments

This is borrowed from biglocalnews/civic-scraper#55

A number of local governments in the Bay Area and in other parts of the country post their meeting minutes, agendas, and related documents on subdomains of novusagenda.com. These websites typically look something like this or this and follow the web address convention PLACE.novusagenda.com/agendapublic, where PLACE is a custom field.

Your task is to add a novusagenda function to the html-request scraper2 so that it also grabs as many *.novusagenda.com subdomains as possible. This will allow us to evaluate how many government agencies are using this website format, which in turn will help us decide which scrapers to build next.
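As a starting point, here is a minimal sketch of what such a function might look like. It assumes the scraper already has a page's raw HTML as a string; the function name `find_novusagenda_places`, the regex, and the choice to skip `www` are all hypothetical and not taken from the existing scraper code.

```python
import re
from urllib.parse import urlparse

# Assumption: links of interest appear in href attributes of the fetched HTML.
HREF_RE = re.compile(r'href=["\']([^"\']+)["\']', re.IGNORECASE)
NOVUS_SUFFIX = ".novusagenda.com"


def find_novusagenda_places(html):
    """Return the sorted, de-duplicated PLACE values found in
    PLACE.novusagenda.com links inside an HTML string.

    Hypothetical helper: the name and return type are illustrative only.
    """
    places = set()
    for url in HREF_RE.findall(html):
        # hostname is None for relative links, so fall back to "".
        host = (urlparse(url).hostname or "").lower()
        if host.endswith(NOVUS_SUFFIX):
            place = host[: -len(NOVUS_SUFFIX)]
            # Skip the vendor's own www site; it is not an agency subdomain.
            if place and place != "www":
                places.add(place)
    return sorted(places)
```

Calling it on a page that links to `https://placea.novusagenda.com/agendapublic` would yield `["placea"]`. Matching on the parsed hostname rather than the raw URL string avoids false positives when "novusagenda.com" only appears in a path or query string.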

@xconnieex added the "good first issue" label Feb 24, 2021
@krammy19 self-assigned this Feb 25, 2021
@krammy19 changed the title from "Identify novusagenda.com subdomains in html-request scraper2" to "Identify novusagenda.com subdomains in AHP Parser" Mar 12, 2021
@krammy19 added the "On Hold" label and removed the "good first issue" label Sep 6, 2021

2 participants