This repository has been archived by the owner on Sep 20, 2024. It is now read-only.

Improve scraper for Office of Finance and Operations (Phase 2) #169

Open
5 tasks
higorspinto opened this issue May 27, 2020 · 0 comments

@higorspinto
Contributor

During phase 1, we built a working scraper that crawls and parses data from this office. The scraped data was successfully ingested into the data portal.

For phase 2, we need to improve the quality of metadata and data-content for the datasets being generated by the scraper.

https://www2.ed.gov/about/offices/list/ofo/index.html

Acceptance Criteria

  • The datasets produced by the scraper show a marked improvement in the quality of their metadata and data-content.
  • The improved datasets are visible on the data portal.
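
One possible way to make "marked improvement" measurable is a simple metadata completeness score, computed before and after the phase 2 changes. The sketch below is only an illustration: the field names in `REQUIRED_FIELDS` and the idea of representing each dataset as a plain dict are assumptions, not the scraper's actual data model.

```python
from typing import Any, Iterable, Mapping, Sequence

# Hypothetical list of fields to check; adjust to the portal's real schema.
REQUIRED_FIELDS = ("title", "description", "publisher")


def completeness(
    datasets: Iterable[Mapping[str, Any]],
    required: Sequence[str] = REQUIRED_FIELDS,
) -> float:
    """Return the fraction of required metadata fields that are non-empty."""
    datasets = list(datasets)
    if not datasets or not required:
        return 0.0
    filled = sum(
        1
        for dataset in datasets
        for field in required
        # None, missing keys, and whitespace-only strings count as empty.
        if str(dataset.get(field) or "").strip()
    )
    return filled / (len(datasets) * len(required))
```

Running this on the phase 1 output and again on the phase 2 output would give two numbers to compare, which could back up the first acceptance criterion.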

Tasks

  • Ensure datasets produced have description metadata
  • Ensure datasets have publisher metadata
  • Improve other metadata (use defaults where available)
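
The tasks above could be sketched roughly as a post-processing step that fills in missing metadata. This is a minimal sketch under stated assumptions: the dict-based dataset shape, the field names, the `fill_metadata` helper, and the choice to fall back to the title for a missing description are all hypothetical. Only the office name comes from this issue.

```python
from typing import Any, Dict, Mapping

# Assumed default; the office name is taken from the issue title.
DEFAULT_PUBLISHER = "Office of Finance and Operations"


def _is_blank(value: Any) -> bool:
    """Treat None and whitespace-only strings as missing."""
    return value is None or (isinstance(value, str) and not value.strip())


def fill_metadata(dataset: Mapping[str, Any]) -> Dict[str, Any]:
    """Return a copy of the dataset with publisher and description filled in."""
    result = dict(dataset)
    if _is_blank(result.get("publisher")):
        result["publisher"] = DEFAULT_PUBLISHER
    # Hypothetical fallback: reuse the title when no description was scraped.
    if _is_blank(result.get("description")) and not _is_blank(result.get("title")):
        result["description"] = result["title"].strip()
    return result
```

Values that the scraper already extracted are left untouched, so the defaults only apply where the source page gave nothing usable.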

Jira Card

higorspinto self-assigned this May 27, 2020