Data engineering template #45
…hema.json Co-authored-by: Pieter Noordhuis <[email protected]>
# conftest.py is used to configure pytest.
# This file is in the root since it affects all tests through this bundle.
# It makes sure all 'assets/*' directories are added to `sys.path` so that
# tests can import them.
What happens if you have 2 entries in this directory, both with a transformations
directory?
I think this is contrary to what one would expect.
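To make the question concrete, here is a minimal sketch (not the template's actual conftest.py) of what Python does when two directories on `sys.path` each contain a `transformations` package. The names `pipeline_a`, `pipeline_b`, and the `OWNER` marker are hypothetical and exist only for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_asset(root: Path, asset: str) -> Path:
    """Create assets/<asset>/transformations/__init__.py with an OWNER marker."""
    pkg = root / "assets" / asset / "transformations"
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text(f"OWNER = {asset!r}\n")
    return pkg.parent


def import_transformations_owner(root: Path) -> str:
    """Put every assets/* directory on sys.path, then import `transformations`."""
    asset_dirs = sorted(str(p) for p in (root / "assets").iterdir() if p.is_dir())
    saved_path = list(sys.path)
    sys.modules.pop("transformations", None)
    try:
        sys.path[:0] = asset_dirs  # prepend in sorted order
        importlib.invalidate_caches()
        module = importlib.import_module("transformations")
        return module.OWNER
    finally:
        # Restore global state so the sketch has no side effects.
        sys.path[:] = saved_path
        sys.modules.pop("transformations", None)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    make_asset(root, "pipeline_a")
    make_asset(root, "pipeline_b")
    owner = import_transformations_owner(root)

print(owner)  # only the first matching sys.path entry is ever imported
```

Under these assumptions, the second pipeline's `transformations` package is silently shadowed by the first one, which is the surprising behavior raised above.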
👍 will look into an alternative approach
For the latest incarnation, the pipeline name is included in the package path. We'll need to discuss this a bit further in the LakeFlow group and solicit feedback.
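As a rough illustration of why including the pipeline name in the package path avoids name clashes, the sketch below puts only a shared parent directory on `sys.path` and imports each pipeline's code under its own prefix. The layout (`assets/<pipeline>/transformations`), the names `pipeline_a` and `pipeline_b`, and the `OWNER` marker are assumptions for illustration; the template's real layout may differ.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PIPELINES = ("pipeline_a", "pipeline_b")  # hypothetical pipeline names


def make_pipeline(assets: Path, name: str) -> None:
    """Create assets/<name>/transformations as regular packages."""
    pkg = assets / name / "transformations"
    pkg.mkdir(parents=True)
    (assets / name / "__init__.py").write_text("")
    (pkg / "__init__.py").write_text(f"OWNER = {name!r}\n")


with tempfile.TemporaryDirectory() as tmp:
    assets = Path(tmp) / "assets"
    for name in PIPELINES:
        make_pipeline(assets, name)

    saved_path = list(sys.path)
    sys.path.insert(0, str(assets))  # one entry for all pipelines
    importlib.invalidate_caches()
    try:
        # Each pipeline is addressed by its own package prefix.
        owners = [
            importlib.import_module(f"{name}.transformations").OWNER
            for name in PIPELINES
        ]
    finally:
        # Restore global state so the sketch has no side effects.
        sys.path[:] = saved_path
        for name in PIPELINES:
            sys.modules.pop(f"{name}.transformations", None)
            sys.modules.pop(name, None)

print(owners)
```

With the pipeline name as a prefix, both `transformations` packages can be imported side by side, at the cost of longer import paths in tests.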
Unblocking merge.
This reverts commit 596ece0.
Summary
This adds a data engineering template.
- `uv run add-asset` can be used to add a pipeline to a project
- `uv run test` can be used to run all tests on serverless compute
- `uv run pytest` can be used to run all tests on any type of compute, as configured in the current IDE or .databrickscfg settings
How to try this
You can give the template a try using
Note that each pipeline has a separate template. New pipelines can be added by using the shorthand `uv run add-asset` or by manually instantiating the pipeline template with