This repository has been archived by the owner on Sep 20, 2024. It is now read-only.

data.json Schema changes #114

Open
6 tasks
nightsh opened this issue Apr 14, 2020 · 0 comments

nightsh commented Apr 14, 2020

Currently we are using a validation schema for the data harvested into the portal: https://project-open-data.cio.gov/v1.1/schema/#accessLevel

However, this schema does not support a number of metadata properties we need, such as:

  • collections
  • sources
  • level of data
  • dataset documentation
  • empty contactPoint.fn values

At the data.json level, we can adjust the validation so that it allows the properties we need. Because the files are generated by our own datajson transformers as part of the CivicActions/edscrapers process flow, we can easily change the final transformation steps to reflect our needs.

Analysis

Two options for this:

1. Remove schema validation

The benefit is flexibility: anything we need to add to the structure of the data.json file would just work, without touching other parts of the flow.

The caveat is, of course, that we would lose validation, which increases the risk of introducing bad data and leaves the datajson transformer to make the final calls.

2. Fork the schema to add the missing features

Best of both worlds: we keep validation, but bend the rules so we can accommodate the properties we want, the way we want them.

We will have to copy the source schema and host the modified copy, then use it as part of the generated datajson files.
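A minimal sketch of what forking could look like, assuming the schema is handled as a plain JSON Schema dictionary. `BASE_DATASET_SCHEMA` is a tiny stand-in and not the real upstream file, and the property names and shapes in `EXTRA_PROPERTIES` (`collections`, `sources`, `levelOfData`, `datasetDocumentation`) are hypothetical placeholders until the specs are final.

```python
import copy

# Tiny stand-in for the upstream dataset schema (assumption: the real file is
# much larger and structured differently).
BASE_DATASET_SCHEMA = {
    "type": "object",
    "required": ["title", "contactPoint"],
    "properties": {
        "title": {"type": "string", "minLength": 1},
        "contactPoint": {
            "type": "object",
            "required": ["fn"],
            "properties": {"fn": {"type": "string", "minLength": 1}},
        },
    },
}

# Hypothetical names and shapes for the properties listed in this issue.
EXTRA_PROPERTIES = {
    "collections": {"type": "array", "items": {"type": "string"}},
    "sources": {"type": "array", "items": {"type": "string"}},
    "levelOfData": {"type": "array", "items": {"type": "string"}},
    "datasetDocumentation": {"type": "string"},
}


def fork_schema(base, extra_properties):
    """Return a modified copy of `base`; the original is left untouched."""
    forked = copy.deepcopy(base)
    # Add the properties the upstream schema does not know about.
    forked["properties"].update(copy.deepcopy(extra_properties))
    # Allow empty contactPoint.fn values by dropping the length constraint.
    fn_schema = forked["properties"]["contactPoint"]["properties"]["fn"]
    fn_schema.pop("minLength", None)
    return forked


forked_schema = fork_schema(BASE_DATASET_SCHEMA, EXTRA_PROPERTIES)
```

Working on a deep copy keeps the upstream schema available unchanged, which makes it easier to diff the fork against the source later.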

Recommendation:

Use option 2, i.e. a forked version of the schema whose structure is altered to match our data specs.

Based on this recommendation, specs for this are provided here

Tasks:

  • implement the new specs in a fork of the currently used validation schema
  • host the new file in a public location
  • adjust the datajson transformer to use the new schema
  • test by adding previously unsupported properties
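For the transformer task, a rough sketch of the final step, assuming the generated catalog follows the Project Open Data v1.1 layout, where the catalog-level `describedBy` field holds the URL of the JSON Schema file. `FORKED_SCHEMA_URL` and `point_catalog_at_schema` are hypothetical names, and the URL is a placeholder since the hosting location is not decided yet.

```python
import json

# Placeholder URL (assumption): replace with the public location of the fork.
FORKED_SCHEMA_URL = "https://example.org/schema/catalog.json"


def point_catalog_at_schema(catalog, schema_url=FORKED_SCHEMA_URL):
    """Return a shallow copy of the catalog that references the forked schema."""
    updated = dict(catalog)
    updated["describedBy"] = schema_url
    return updated


# Example catalog header as the transformer might emit it today.
catalog = {
    "conformsTo": "https://project-open-data.cio.gov/v1.1/schema",
    "describedBy": "https://project-open-data.cio.gov/v1.1/schema/catalog.json",
    "dataset": [],
}

print(json.dumps(point_catalog_at_schema(catalog), indent=2))
```

Whether `conformsTo` should also change once the schema diverges from v1.1 is left open here.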

Acceptance criteria:

  • having any of the newly specified properties in the final data.json output doesn't break the harvesting process
  • the defined & implemented specs are visible in the portal
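
The first criterion could be smoke-tested along these lines. This is only a sketch: `validate` is a hand-rolled checker covering a small subset of JSON Schema, both schemas are stand-ins (including the assumption that the upstream-like one rejects unknown properties), and `collections` is a hypothetical property name. A real check would run a full validator, for example the `jsonschema` package, against the hosted fork.

```python
TYPE_MAP = {"object": dict, "array": list, "string": str}


def validate(instance, schema, path="$"):
    """Return a list of error strings; supports only a few schema keywords."""
    expected = schema.get("type")
    if expected in TYPE_MAP and not isinstance(instance, TYPE_MAP[expected]):
        return [f"{path}: expected {expected}"]
    errors = []
    if isinstance(instance, str) and len(instance) < schema.get("minLength", 0):
        errors.append(f"{path}: shorter than minLength")
    if isinstance(instance, dict):
        properties = schema.get("properties", {})
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required property {key!r}")
        for key, value in instance.items():
            if key in properties:
                errors.extend(validate(value, properties[key], f"{path}.{key}"))
            elif schema.get("additionalProperties") is False:
                errors.append(f"{path}: unexpected property {key!r}")
    if isinstance(instance, list) and "items" in schema:
        for index, item in enumerate(instance):
            errors.extend(validate(item, schema["items"], f"{path}[{index}]"))
    return errors


# Stand-in for the current schema (assumption: it rejects unknown properties).
UPSTREAM_LIKE = {
    "type": "object",
    "required": ["title", "contactPoint"],
    "additionalProperties": False,
    "properties": {
        "title": {"type": "string", "minLength": 1},
        "contactPoint": {
            "type": "object",
            "required": ["fn"],
            "properties": {"fn": {"type": "string", "minLength": 1}},
        },
    },
}

# Stand-in for the fork: one extra property, and empty fn values allowed.
FORKED = {
    "type": "object",
    "required": ["title", "contactPoint"],
    "additionalProperties": False,
    "properties": {
        "title": {"type": "string", "minLength": 1},
        "contactPoint": {
            "type": "object",
            "required": ["fn"],
            "properties": {"fn": {"type": "string"}},
        },
        "collections": {"type": "array", "items": {"type": "string"}},
    },
}

record = {
    "title": "Example dataset",
    "contactPoint": {"fn": ""},
    "collections": ["Example collection"],
}

upstream_errors = validate(record, UPSTREAM_LIKE)
forked_errors = validate(record, FORKED)
```

With these stand-ins, the record fails the upstream-like schema on both the empty `fn` and the unknown `collections` property, and passes the fork.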