ebraimcarvalho/my-way

  • Active learning!!!
  • Read the code
  • Manage time
  • Study in blocks
  • Search Google and read the documentation

Data Engineering makes data available to the end user for purposes such as analytics, model building, and app development.

MOVE/STORE: Reliable data flow, infrastructure, pipelines, ETL, structured and unstructured data storage.

EXPLORE/TRANSFORM: Cleaning, anomaly detection, data preparation.
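The two stages above can be sketched as a tiny extract-transform-load pipeline. This is only an illustration using the Python standard library: the sample CSV data, the `payments` table, and the function names are all assumptions made up for the example, not part of any particular tool.

```python
import csv
import io
import sqlite3

# Hypothetical raw input; in practice this would come from a file or an API.
RAW_CSV = """user_id,amount
1,10.50
2,not_a_number
3,7.25
"""

def extract(text):
    """MOVE: read rows from CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """EXPLORE/TRANSFORM: keep only rows whose fields parse as numbers."""
    clean = []
    for row in rows:
        try:
            clean.append((int(row["user_id"]), float(row["amount"])))
        except ValueError:
            continue  # drop malformed rows (a very simple form of cleaning)
    return clean

def load(conn, records):
    """STORE: write the cleaned records to a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS payments (user_id INTEGER, amount REAL)"
    )
    conn.executemany("INSERT INTO payments VALUES (?, ?)", records)
    conn.commit()

def run_pipeline(conn, text=RAW_CSV):
    """Run extract, transform, and load, then return (row count, total amount)."""
    load(conn, transform(extract(text)))
    return conn.execute("SELECT COUNT(*), SUM(amount) FROM payments").fetchone()
```

Calling `run_pipeline(sqlite3.connect(":memory:"))` loads the two valid rows and skips the malformed one.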

Common Activities

  • Ingest Data from a data source
  • Build and maintain a data warehouse
  • Create a data pipeline
  • Create an analytics table for a specific use case
  • Migrate data to the cloud
  • Schedule and automate pipelines
  • Backfill data
  • Debug data quality issues
  • Optimize queries
  • Design a database
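As one example of the activities above, a backfill can be sketched as re-running a daily job over a range of past dates. The `daily_metrics` table, the `load_day` job, and the placeholder metric are hypothetical; the point of the sketch is that each day is deleted and rewritten, so running the backfill twice does not duplicate rows.

```python
import sqlite3
from datetime import date, timedelta

def date_range(start, end):
    """Yield every date from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)

def load_day(conn, day):
    """Hypothetical daily job: recompute one day's row, replacing any old one."""
    conn.execute("DELETE FROM daily_metrics WHERE day = ?", (day.isoformat(),))
    # Placeholder metric; a real job would query the source system for `day`.
    conn.execute(
        "INSERT INTO daily_metrics VALUES (?, ?)", (day.isoformat(), day.day)
    )

def backfill(conn, start, end):
    """Run the daily job for each date in the range and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_metrics (day TEXT PRIMARY KEY, value INTEGER)"
    )
    for day in date_range(start, end):
        load_day(conn, day)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM daily_metrics").fetchone()[0]
```

In a scheduled setup, the same `load_day` function could be what a cron job or an orchestrator task calls once per day, with `backfill` reserved for filling historical gaps.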

Data Engineering:

  • Python
  • SQL
  • Data warehouse modeling
  • Multidimensional modeling
  • Normal forms
  • Linux and cron jobs
  • Software engineering for building orchestration routines
  • An orchestration tool such as Airflow or Luigi would also be good

https://github.com/igorbarinov/awesome-data-engineering

  1. Databases

  2. Batch Processing

  3. File System

  4. Database modeling (transactional and OLAP)

  5. Stream Processing

  6. Testing

Books:

  1. Beginning Databases With PostgreSQL - From Novice To Professional
  2. Stream Processing with Apache Spark: Mastering Structured Streaming and Spark Streaming
  3. Spark: The Definitive Guide: Big Data Processing Made Simple
  4. PostgreSQL: Up and Running: A Practical Guide to the Advanced Open Source Database
  5. Hadoop: The Definitive Guide, 4th Edition: Storage and Analysis at Internet Scale
  6. The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling
  7. Redis in Action
  8. The Data Warehouse ETL Toolkit: Practical Techniques for Extracting, Cleaning, Conforming, and Delivering Data
  9. The Definitive Guide to SQLite (Second Edition)
  10. Using SQLite
  11. Refactoring Databases: Evolutionary Database Design (Addison-Wesley Signature Series (Fowler))

Courses:

  1. Udacity Nanodegree

Certifications

  1. Cloud

In-depth projects for each technology/tool

Write articles and snippets about what has been studied

Data Hackers Slack and Blog
