This repository contains a template DBT project that can be used for DBT development on Snowflake.
The setup script of this repo uses Python Poetry to handle package management and create your virtual environments for development.
- It is recommended that you work in a Unix environment (e.g. macOS, or WSL2 on Windows)
- This repository assumes your Snowflake environments are set up in a particular structure:
  - You work in a single instance of Snowflake (e.g. the Prod instance)
  - Your 'environments' (dev, uat, prod) are simply different databases in the Prod instance
  - The PROJECT_NAME you supply is used as a prefix to name most of your Snowflake infrastructure resources (databases, roles, warehouses)
  - You have a separate database and associated infra for each of dev, uat and prod. For example, if your project name is "JAFFLE SHOP", the dbt profile will be set up assuming the following:
    - Your dev database should be called JAFFLE_SHOP_DEV
    - Your dev warehouse should be called JAFFLE_SHOP_DEV_WH
    - Your dev service account user should be called JAFFLE_SHOP_DEV_SA
    - Your dev admin role should be called JAFFLE_SHOP_DEV_ADMIN
  - Your service accounts can use private keys for authentication
    - If private keys have not been set up, you will need to update the dbt `profiles.yml` file so that the non-development targets use passwords instead
- You must have Python installed on your machine
  - For macOS, you can install Python directly from the official Python website
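
As an illustration of the naming convention described in the prerequisites above, here is a minimal shell sketch. The function name `resource_name` is hypothetical and not part of the template. It assumes that spaces in the project name become underscores and that the result is upper-cased, as the "JAFFLE SHOP" example suggests.

```shell
#!/bin/sh
# Hypothetical helper (not part of the template): builds a Snowflake resource
# name of the form <PROJECT_NAME>_<ENV>[_<SUFFIX>].
# Assumption: spaces become underscores and the whole name is upper-cased.
resource_name() {
    name="${1}_${2}"                 # project name and environment (dev, uat, prod)
    if [ -n "${3:-}" ]; then         # optional suffix: WH, SA or ADMIN
        name="${name}_${3}"
    fi
    printf '%s\n' "$name" | tr ' ' '_' | tr '[:lower:]' '[:upper:]'
}

resource_name "JAFFLE SHOP" dev         # database:        JAFFLE_SHOP_DEV
resource_name "JAFFLE SHOP" dev WH      # warehouse:       JAFFLE_SHOP_DEV_WH
resource_name "JAFFLE SHOP" dev SA      # service account: JAFFLE_SHOP_DEV_SA
resource_name "JAFFLE SHOP" dev ADMIN   # admin role:      JAFFLE_SHOP_DEV_ADMIN
```

Comparing this output against the resources that actually exist in your Snowflake instance is a quick way to spot naming mismatches before running dbt.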
You can begin the following steps once you meet all of the above prerequisites.
1. Create a remote repository on GitHub. Do not choose any templates during creation. You will be prompted for the URL of the repo during step 4.
2. Install `cruft` using `pipx`:
pipx install cruft
3. Use `cruft` to instantiate this template on your local machine.
By default, cruft will create the template in your terminal's current directory. Use the `cd` command to navigate to the directory you want your project to be in. For example, if you want to keep the project under Documents on a Mac, use `cd ~/Documents`.
Once you are in the directory that you want to create the template in, run the following command:
cruft create https://github.com/Armalite/beautiful-dbt-template
Fill in the information that `cruft` prompts you for:
- Project Name: The full name of your project. This is used in documentation and to generate a project slug, which for a given "Project Name" looks like "project-name". It is also used to generate the database names for your dbt `profiles.yml` as well as the name of your repo folder.
- Description: A short description which is used in generated documentation.
- Team Name: The name of your team (leave blank or invent one if not available)
- Team Email: The email address of your team (leave blank or invent one if not available)
- AUTHOR_NAME: Name of the project author
- SNOWFLAKE_ACCOUNT: The Snowflake account you want your dbt project to connect to.
- USER_NAME: The Snowflake username you will connect with
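
To illustrate how the project slug relates to the Project Name, here is a rough sketch. The `slugify` function is hypothetical. It assumes the slug is simply the name lower-cased with spaces replaced by hyphens, as in the "Project Name" to "project-name" example above; the template's actual slug logic may treat other characters differently.

```shell
#!/bin/sh
# Hypothetical helper (not part of the template): derives a slug from a
# project name. Assumption: lower-case the name and turn spaces into hyphens.
slugify() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr ' ' '-'
}

slugify "Project Name"      # project-name
slugify "My Data Product"   # my-data-product
```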
4. Next, use the installer inside your new project folder to set up your repo and install basic dependencies.
First, navigate to the project directory that has just been created. For example, if your project is called `my-data-product`:
cd my-data-product
Then execute the following command:
make install
This command does the following:
- Initialises a git repository for you
- Prompts you for the URL of your remote GitHub repo (created in step 1) so that it can connect to it
- Installs all the dependencies needed to run DBT
5. Restart your terminal/shell and ensure Poetry is installed:
poetry --version
When inside your `dbt/` folder, you can run any dbt command within this virtual environment by prefixing your command with `poetry run`, e.g.
poetry run dbt run --profiles-dir .
6. (Optional) This step is only needed if you did not provide the GitHub URL in step 4. You can still connect to your remote repository manually by following the steps to "push an existing repository from the command line". These steps should look something like:
git remote add origin <repo-url>
git branch -M main
git push -u origin main
For example, if your remote repository was https://github.com/Bob/my-data-product, the commands would be:
git remote add origin https://github.com/Bob/my-data-product
git branch -M main
git push -u origin main
If `poetry --version` does not work, try these steps in the following order until one of them succeeds:
- Restart the terminal and try `poetry --version` again
- Run `make install` again
- Do a clean installation with `make clean install`
- Force a Poetry installation with `make force-install`
If `poetry --version` does not work but you are sure Poetry has been installed, Poetry may not have been added to your PATH.
- Check where Poetry has been installed (`where` is available in zsh; in bash, use `which poetry` instead):
where poetry
- Add the directory that contains the Poetry executable to your PATH by appending the following line to your zsh or bash configuration file (depending on the shell you use), replacing X with that directory as reported by `where poetry`:
export PATH="X:$PATH"
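
The PATH fix described above can be sketched as follows. The helper name `path_export_line` is hypothetical, and the example path is made up. The sketch assumes a POSIX shell: it uses `command -v` as a portable alternative to `where`/`which`, and `dirname` to strip the executable name so that only its directory is added to PATH.

```shell
#!/bin/sh
# Hypothetical helper: given the full path to an executable, print the
# export line that adds the executable's directory to PATH.
path_export_line() {
    bin_dir=$(dirname "$1")
    # Single quotes keep $PATH literal so it is expanded later by your shell.
    printf 'export PATH="%s:$PATH"\n' "$bin_dir"
}

# Example with a made-up location:
path_export_line "/home/bob/.local/bin/poetry"
# prints: export PATH="/home/bob/.local/bin:$PATH"

# With a real installation, locate Poetry first (prints nothing if not found):
if poetry_path=$(command -v poetry); then
    path_export_line "$poetry_path"
fi
```

The printed line is what you would append to your shell configuration file (for example `~/.zshrc` or `~/.bashrc`); restart the terminal afterwards so the change takes effect.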
You can reinstall the Poetry virtual environment with the following command:
make clean install
Additional dependencies are specified by editing the `pyproject.toml` file. This file will not be changed by `cruft` on updates.
After editing the file, you can run `poetry update` to align versions and dependencies.
The git repository created by the above steps will contain a README describing all the DBT targets