SQL is used across the machine learning pipeline and is a fundamental skill for data scientists. This module focuses on the technical skills needed to work with SQL, including ingestion of flat-file datasets (JSON, CSV), query design, and relational database management. Participants will also examine common data management concerns, data access management, and adherence to data privacy requirements.
Week 1:
Describe the structure of a database. Use an export command to save and transport data in CSV and JSON file formats. Use SQL querying and data manipulation techniques to formulate queries for a range of purposes.
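The Week 1 objectives above can be illustrated with a minimal sketch. It assumes Python's built-in `sqlite3` module as a stand-in for whatever database tool and export command the module actually uses. The `customer` table and its rows are hypothetical, and the export is written to in-memory strings rather than files.

```python
import csv
import io
import json
import sqlite3

# Hypothetical in-memory database; table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read column names from each row
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer (name, city) VALUES (?, ?)",
    [("Ada", "Toronto"), ("Grace", "Ottawa"), ("Linus", "Toronto")],
)

# A basic query: filter rows with WHERE and sort them with ORDER BY.
rows = conn.execute(
    "SELECT id, name, city FROM customer WHERE city = ? ORDER BY name",
    ("Toronto",),
).fetchall()

# Export the result set as CSV, header row first.
csv_buffer = io.StringIO()
writer = csv.writer(csv_buffer, lineterminator="\n")
writer.writerow(rows[0].keys())
writer.writerows(tuple(r) for r in rows)
csv_text = csv_buffer.getvalue()

# Export the same result set as JSON, as a list of objects.
json_text = json.dumps([dict(r) for r in rows], indent=2)

print(csv_text)
print(json_text)
conn.close()
```

To save the exports to disk instead, the two strings can be written to files such as `customers.csv` and `customers.json` (hypothetical names).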
Week 2:
Examine the legal framework around sharing data. Analyze data requirements and work with diverse stakeholders such as analysts and managers. Use advanced techniques such as string manipulation and NULL management to manipulate results.
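As a rough illustration of the string manipulation and NULL management mentioned above, the sketch below runs a few common SQL functions through Python's built-in `sqlite3` module. The `product` table and its values are hypothetical, and the functions shown (`TRIM`, `UPPER`, `SUBSTR`, `LENGTH`, `COALESCE`, `NULLIF`) are SQLite's; other database engines may name or implement them differently.

```python
import sqlite3

# Hypothetical sample data: messy names, one NULL size, one empty-string size.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, size TEXT)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("  apple ", "large"), ("Banana", None), ("cherry", "")],
)

query = """
SELECT
    UPPER(TRIM(name))          AS clean_name,          -- strip spaces, then uppercase
    COALESCE(size, 'unknown')  AS size_or_default,     -- replace NULL with a default
    NULLIF(size, '')           AS size_blank_as_null,  -- turn empty strings into NULL
    SUBSTR(TRIM(name), 1, 3) || '-' || LENGTH(TRIM(name)) AS code  -- concatenate pieces
FROM product
ORDER BY clean_name
"""
results = conn.execute(query).fetchall()

for row in results:
    print(row)
conn.close()
```

`COALESCE` and `NULLIF` work in opposite directions: the first replaces a missing value with a default, while the second converts a placeholder value (here, an empty string) into NULL so that it is treated as missing.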
Folder Structure

.
├── .github
├── .gitignore
├── 01_materials
├── 02_activities
├── 03_instructional_team
├── 04_this_cohort
├── 05_src
├── LICENSE
├── README.md
└── steps_to_ask_for_help.png

.github: Contains issue templates, pull request templates and workflows for the repository.
materials: Module slides used during learning sessions.
activities: Contains graded assignments, and rubrics for evaluating assignments.
instructional_team: Resources for the instructional team.
this_cohort: Additional materials and resources for cohort three.
src: Source code, databases, logs, and required dependencies (requirements.txt) needed during the module.
.gitignore: Files to exclude from this folder, specified by the Technical Facilitator.
LICENSE: The license for this repository.
README: This file.
steps_to_ask_for_help.png: Guide on how to ask for help.