All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Added export format markdown: `datacontract export --format markdown` (#545)
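  A minimal usage sketch; the output redirection assumes the export is written to stdout:

  ```bash
  # Render the data contract as Markdown documentation
  datacontract export --format markdown datacontract.yaml > datacontract.md
  ```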
- When importing in dbt format, add the dbt `unique` information as a datacontract `unique` field (#558)
- When importing in dbt format, add the dbt primary key information as a datacontract primaryKey field (#562)
- When exporting in dbt format, add the datacontract references field as a dbt relationships test (#569)
- When importing in dbt format, add the dbt relationships test field as a reference in the data contract (#570)
- Primary and example fields have been deprecated in Data Contract Specification v1.1.0 (#561)
- Define primaryKey and examples for model to follow the changes in datacontract-specification v1.1.0 (#559)
- SQL Server: cannot escape reserved word on model (#557)
- Support for exporting a Data Contract to an Iceberg schema definition.
- When importing in dbt format, add the dbt `not_null` information as a datacontract `required` field (#547)
- Type conversion when importing contracts into dbt and exporting contracts from dbt (#534)
- Ensure 'name' is the first column when exporting in dbt format, considering column attributes (#541)
- Rename dbt's `tests` to `data_tests` (#548)
- Modify the arguments to narrow down the import target with `--dbt-model` (#532)
- SodaCL: Prevent `KeyError: 'fail'` from happening when testing with SodaCL
- Fix: populate database and schema values for BigQuery in exported dbt sources (#543)
- Fixing the options for importing and exporting to standard output (#544)
- Fixing the data quality name for model-level and field-level quality tests
- Support for model import from parquet file metadata.
- Great Expectations export: add optional args (#496)
  - `suite_name`: the name of the expectation suite to export
  - `engine`: used to run checks
  - `sql_server_type`: to define the type of SQL Server to use when engine is `sql`
- Changelog support for `Info` and `Terms` blocks
- `datacontract import` now has an `--output` option for saving the Data Contract to a file
- Enhance JSON file validation (local and S3) to return the first error for each JSON object; the maximum number of total errors can be configured via the environment variable `DATACONTRACT_MAX_ERRORS` (see the sketch after this list). Furthermore, the primaryKey is additionally added to the error message.
- Fixes an issue where records with no fields create an invalid BigQuery schema
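  A minimal sketch of the `DATACONTRACT_MAX_ERRORS` setting mentioned above, assuming a contract whose servers point at JSON files; the limit of 5 is an arbitrary example value:

  ```bash
  # Cap the number of reported JSON validation errors per run
  export DATACONTRACT_MAX_ERRORS=5
  datacontract test datacontract.yaml
  ```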
- Changelog support for custom extension keys in `Models` and `Fields` blocks
- `datacontract catalog --files '*.yaml'` now also checks any subfolders for such files (see the sketch after this list)
- Optimize the test output table on the console if tests fail
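  A minimal sketch of the recursive catalog build mentioned above; the `--output` directory is an assumption:

  ```bash
  # Build an HTML catalog from all contracts, including those in subfolders
  datacontract catalog --files '*.yaml' --output catalog/
  ```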
- Raise a valid exception in `DataContractSpecification.from_file` if the file does not exist
- Fix importing JSON Schemas containing deeply nested objects without a `required` array
- SodaCL: Only add data quality tests for executable queries
Data Contract CLI now supports the Open Data Contract Standard (ODCS) v3.0.0.
- `datacontract test` now also supports the ODCS v3 data contract format
- `datacontract export --format odcs_v3`: Export to Open Data Contract Standard v3.0.0 (#460)
- `datacontract test` now also supports ODCS v3 and Data Contract SQL quality checks on field and model level
- Support for import from Iceberg table definitions.
- Support for decimal logical type on avro export.
- Support for custom Trino types
- `datacontract import --format odcs`: Now supports ODCS v3.0.0 files (#474)
- `datacontract export --format odcs`: Now creates v3.0.0 Open Data Contract Standard files (alias of `odcs_v3`); old versions are still available as format `odcs_v2` (#460)
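  A minimal sketch of working with the ODCS formats listed above; the file names are placeholders and the `--source` option for import is an assumption:

  ```bash
  # Import an ODCS v3.0.0 document into a data contract
  datacontract import --format odcs --source odcs.yaml

  # Export to ODCS v3.0.0 (default), or target the old standard explicitly
  datacontract export --format odcs datacontract.yaml
  datacontract export --format odcs_v2 datacontract.yaml
  ```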
- Fix timestamp serialization from Parquet to DuckDB (#472)
- `datacontract export --format data-caterer`: Export to Data Caterer YAML
- `datacontract export --format jsonschema`: handle optional and nullable fields (#409)
- `datacontract import --format unity`: handle nested and complex fields (#420)
- `datacontract import --format spark`: handle field descriptions (#420)
- `datacontract export --format bigquery`: handle bigqueryType (#422)
- Use the correct float type with BigQuery (#417)
- Support `DATACONTRACT_MANAGER_API_KEY`
- Some minor bug fixes
- Support for import of DBML Models (#379)
- `datacontract export --format sqlalchemy`: Export to SQLAlchemy ORM models (#399)
- Support of varchar max length in Glue import (#351)
- `datacontract publish` now also accepts the `DATACONTRACT_MANAGER_API_KEY` as an environment variable (see the sketch after this list)
- Support required fields for Avro schema export (#390)
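  A minimal sketch of publishing with the key taken from the environment; the key value is a placeholder:

  ```bash
  # The CLI picks up the API key from the environment
  export DATACONTRACT_MANAGER_API_KEY="<your-api-key>"
  datacontract publish datacontract.yaml
  ```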
- Support data type map in Spark import and export (#408)
- Support of enum on export to avro
- Support of enum title on avro import
- Deltalake is now using DuckDB's native deltalake support (#258). Extra deltalake removed.
- When dumping to YAML (import) the alias name is used instead of the pythonic name. (#373)
- Fix an issue where the datacontract cli fails if installed without any extras (#400)
- Fix an issue where Glue database without a location creates invalid data contract (#351)
- Fix bigint -> long data type mapping (#351)
- Fix an issue where column description for Glue partition key column is ignored (#351)
- Corrected name of table parameter for bigquery import (#377)
- Fix a failure to connect to the S3 server (#384)
- Fix a model bug mismatching with the specification (`definitions.fields`) (#375)
- Fix array type management in Spark import (#408)
- Support data type map in Glue import. (#340)
- Basic HTML export for new `keys` and `values` fields
- Support for recognition of 1 to 1 relationships when exporting to DBML
- Added support for arrays in JSON schema import (#305)
- Aligned JSON schema import and export of required properties
- Change dbt importer to be more robust and customizable
- Fix required field handling in JSON schema import
- Fix an issue where the quality and definition `$ref` are not always resolved
- Fix an issue where the JSON schema validation fails for a field with type `string` and format `uuid`
- Fix an issue where common DBML renderers may not be able to parse parts of an exported file
- Add support for dbt manifest file (#104)
- Fix import of pyspark for type-checking when pyspark isn't required as a module (#312)
- Adds support for referencing fields within a definition (#322)
- Add `map` and `enum` type for Avro schema import (#311)
- `datacontract import --format spark`: Import from Spark tables (#326)
- Fix an issue where specifying `glue_table` as parameter did not filter the tables and instead returned all tables from the `source` database (#333)
- Add support for Trino (#278)
- Spark export: add Spark StructType exporter (#277)
- Add a `--schema` option for the `catalog` and `export` commands to provide the schema also locally
- Integrate support into the pre-commit workflow. For further details, please refer to the pre-commit documentation.
- Improved HTML export, supporting links, tags, and more
- Add support for AWS SESSION_TOKEN (#309)
- Added array management on HTML export (#299)
- Fix `datacontract import --format jsonschema` when description is missing (#300)
- Fix `datacontract test` with case-sensitive Postgres table names (#310)
- `datacontract serve`: start a local web server to provide a REST API for the commands
- Provide server for SQL export for the appropriate schema (#153)
- Add struct and array management to Glue export (#271)
- Introduced optional dependencies/extras for significantly faster installation times. (#213)
- Added delta-lake as an additional optional dependency
- Support `GOOGLE_APPLICATION_CREDENTIALS` as a variable for connecting to BigQuery in `datacontract test`
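  A minimal sketch, assuming the contract defines a BigQuery server; the key file path is a placeholder:

  ```bash
  # Point the CLI at a service account key file for BigQuery
  export GOOGLE_APPLICATION_CREDENTIALS=/path/to/service-account-key.json
  datacontract test datacontract.yaml
  ```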
- Better support BigQuery's `type` attribute; don't assume all imported models are tables
- Added an initial implementation of an importer from Unity Catalog (not all data types supported, yet)
- Added the importer factory. This refactoring aims to make it easier to create new importers, supporting the growth and maintainability of the project. (#273)
- `datacontract export --format avro`: fixed array structure (#243)
- Test data contract against dataframes / temporary views (#175)
- AVRO export: Logical Types should be nested (#233)
- Fixed Docker build by removing msodbcsql18 dependency (temporary workaround)
- Added support for `sqlserver` (#196)
- `datacontract export --format dbml`: Export to Database Markup Language (DBML) (#135)
- `datacontract export --format avro`: Now supports a config map on field level for logicalTypes and default values (Custom Avro Properties)
- `datacontract import --format avro`: Now supports importing logicalType and default definitions from Avro files (Custom Avro Properties)
- Support `config.bigqueryType` for testing BigQuery types
- Added support for selecting specific tables in an AWS Glue import through the `glue-table` parameter (#122)
- Fixed jsonschema export for models with empty object-typed fields (#218)
- Fixed testing BigQuery tables with BOOL fields
- `datacontract catalog`: Show search bar also on mobile
- `datacontract catalog`: Search
- `datacontract publish`: Publish the data contract to the Data Mesh Manager
- `datacontract import --format bigquery`: Import from BigQuery format (#110)
- `datacontract export --format bigquery`: Export to BigQuery format (#111)
- `datacontract export --format avro`: Now supports Avro logical types to better model date types. `date`, `timestamp`/`timestamp-tz`, and `timestamp-ntz` are now mapped to the appropriate logical types. (#141)
- `datacontract import --format jsonschema`: Import from JSON schema (#91)
- `datacontract export --format jsonschema`: Improved export by exporting more additional information
- `datacontract export --format html`: Added support for Service Levels, Definitions, Examples, and nested Fields
- `datacontract export --format go`: Export to Go types format
- `datacontract catalog`: Add `index.html` to manifest
- Added Glue import (#166)
- Added test support for `azure` (#146)
- Added support for `delta` tables on S3 (#24)
- Added new command `datacontract catalog` that generates a data contract catalog with an `index.html` file.
- Added field format information to HTML export
- RDF Export: Fix error if owner is not a URI/URN
- Fixed docker columns
- Added a timestamp for when an HTML export was created
- Fixed export format html
- Added export format html (#15)
- Added descriptions as comments to `datacontract export --format sql` for Databricks dialects
- Added import of arrays in Avro import
- Added export format great-expectations: `datacontract export --format great-expectations`
- Added gRPC support to OpenTelemetry integration for publishing test results
- Added AVRO import support for namespace (#121)
- Added handling for optional fields in avro import (#112)
- Added Databricks SQL dialect for `datacontract export --format sql`
- Use `sql_type_converter` to build checks.
- Fixed AVRO import when doc is missing (#121)
- Added option to publish test results to OpenTelemetry: `datacontract test --publish-to-opentelemetry`
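  A minimal sketch; configuring the collector via the standard OTLP environment variables is an assumption, and the endpoint is a placeholder:

  ```bash
  # Publish test results to an OpenTelemetry collector
  export OTEL_SERVICE_NAME=datacontract-cli
  export OTEL_EXPORTER_OTLP_ENDPOINT=https://collector.example.com:4317
  datacontract test --publish-to-opentelemetry datacontract.yaml
  ```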
- Added export format protobuf: `datacontract export --format protobuf`
- Added export format terraform: `datacontract export --format terraform` (limitation: only works for AWS S3 right now)
- Added export format sql: `datacontract export --format sql`
- Added export format sql-query: `datacontract export --format sql-query`
- Added export format avro-idl: `datacontract export --format avro-idl` generates an Avro IDL file containing records for each model.
- Added new command changelog: `datacontract changelog datacontract1.yaml datacontract2.yaml` will now generate a changelog based on the changes in the data contract. This is useful for keeping track of changes in a data contract over time.
- Added extensive linting on data contracts. `datacontract lint` will now check for a variety of possible errors in the data contract, such as missing descriptions, incorrect references to models or fields, nonsensical constraints, and more.
- Added an importer for Avro schemas. `datacontract import --format avro` will now import Avro schemas into a data contract.
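  A minimal sketch of the Avro importer; the schema file name and the output redirection are assumptions:

  ```bash
  # Generate a data contract from an existing Avro schema
  datacontract import --format avro --source orders.avsc > datacontract.yaml
  ```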
- Fixed a bug where the export to YAML always escaped the unicode characters.
- Test Kafka for Avro messages
- Added export format avro: `datacontract export --format avro`

This is a huge step forward: we now support testing Kafka messages. We start with JSON messages and Avro; Protobuf will follow.
- Test Kafka for JSON messages
- Added import format sql: `datacontract import --format sql` (#51)
- Added export format dbt-sources: `datacontract export --format dbt-sources`
- Added export format dbt-staging-sql: `datacontract export --format dbt-staging-sql`
- Added export format rdf: `datacontract export --format rdf` (#52)
- Added command `datacontract breaking` to detect breaking changes between two data contracts.
- export to dbt models (#37).
- export to ODCS (#49).
- `test`: show a test summary table.
- `lint`: Support local schema (#46).
- Support for Postgres
- Support for Databricks
- Support for BigQuery data connection
- Support for multiple models with S3
- Fix Docker images. Disable builds for linux/amd64.
- Publish to Docker Hub
This is a breaking change (we are still on a 0.x.x version). The project migrated from Golang to Python. The Golang version can be found at cli-go
- `test`: Support to directly run tests and connect to data sources defined in the servers section.
- `test`: Generated schema tests from the model definition.
- `test --publish URL`: Publish test results to a server URL.
- `export`: Now exports the data contract to the formats jsonschema and sodacl.
- The `--file` option was removed in favor of a direct argument: use `datacontract test datacontract.yaml` instead of `datacontract test --file datacontract.yaml`.
- `model` is now part of `export`
- `quality` is now part of `export`
- Temporarily removed: `diff` needs to be migrated to Python.
- Temporarily removed: `breaking` needs to be migrated to Python.
- Temporarily removed: `inline` needs to be migrated to Python.
- Support local json schema in lint command.
- Update to specification 0.9.2.
- Fix format flag bug in model (print) command.
- Log to STDOUT.
- Rename the `model` command parameter: `type` -> `format`.
- Remove the `schema` command.
- Fix documentation.
- Security update of x/sys.
- Adapt Data Contract Specification in version 0.9.2.
- Use the `models` section for `diff`/`breaking`.
- Add the `model` command.
- Let `inline` print to STDOUT instead of overwriting the data contract file.
- Let `quality` write input from STDIN if present.
- Basic implementation of the `test` command for Soda Core.
- Change package structure to allow usage as library.
- Fix field parsing for dbt models; affects stability of `diff`/`breaking`.
- Fix comparing order of contracts in `diff`/`breaking`.
- Handle non-existent schema specification when using `diff`/`breaking`.
- Resolve local and remote resources such as schema specifications when using the "$ref: ..." notation.
- Implement the `schema` command: prints your schema.
- Implement the `quality` command: prints your quality definitions.
- Implement the `inline` command: resolves all references using the "$ref: ..." notation and writes them to your data contract.
- Allow remote and local locations for all data contract inputs (`--file`, `--with`).
- Add the `diff` command for the dbt schema specification.
- Add the `breaking` command for the dbt schema specification.
- Suggest a fix during `init` when the file already exists.
- Rename the `validate` command to `lint`.
- Remove the `check-compatibility` command.
- Improve usage documentation.
- Initial release.