- Machine Learning: The High Interest Credit Card of Technical Debt
- The Care & Feeding of Data Scientists
- Designing Data-Intensive Applications
- Storytelling with Data: A Data Visualization Guide for Business Professionals
- redshift
- snowflake
- bigquery
- vertica
- databricks delta
- https://github.com/apache/iceberg
- https://github.com/apache/hudi
- https://www.dremio.com/
- kafka
- flink
- beam
- elasticsearch
- rockset
- jupyter
- https://zeppelin.apache.org/
- https://github.com/ironmussa/Optimus
- https://colab.research.google.com/
- https://www.saturncloud.io/
- https://rmarkdown.rstudio.com/
- https://www.querybook.org/
- https://deepnote.com/
- https://cnvrg.io/
- dominodatalab
- airflow
- luigi
- oozie
- https://github.com/dagster-io/dagster
- https://github.com/PrefectHQ/prefect
- https://www.getdbt.com/product/
- streamsets
- nifi
- snowplow
- https://www.prefect.io/
- https://www.ascend.io/
- https://www.datmo.com/
- TFX-OSS
- segment
- Rudderstack
- Metarouter
- snowplow
- https://github.com/airbnb/knowledge-repo
- https://github.com/lyft/amundsen
- https://eng.uber.com/databook/
- https://medium.com/netflix-techblog/metacat-making-big-data-discoverable-and-meaningful-at-netflix-56fb36a53520
- https://cloud.google.com/data-catalog/
- https://github.com/linkedin/datahub
- airbnb: democratizing data, airbnb: scaling knowledge
- superset
- looker
- periscope
- fivetran
- singer.io
- talend
- https://github.com/airbytehq/airbyte
- https://github.com/linkedin/brooklin
- query CSVs w/ SQL: cq
- jq
- https://www.datagrail.io/
- immuta
- privacera
- ethyca
- osano