A data repository for PDX crime and real estate data, with Python code for cleansing and preparing the data so that the two can be joined by neighborhood.
Data sources:
- 2015 PDX real estate (via the Wayback Machine)
- 2016 PDX real estate
- 2017 PDX real estate
- 2018 PDX real estate
- 2019 PDX real estate
- 2020 PDX real estate
- 2021 PDX real estate
- 2015-2021 PDX crime data
The raw CSV files are checked into this repo and can be found in the data lake package. The "mixed" (cleansed and prepped) files are also checked into this repo and can be found in the data bar package. Tests that analyze data integrity can be found here and here.
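The cleanse-then-join idea described above can be sketched with the standard library alone. This is a minimal illustration, not the repo's actual code: the column names (`neighborhood`, `offense`, `median_price`), the helper names, and the inline sample rows are all hypothetical and invented for the example.

```python
import csv
import io
from collections import Counter


def normalize_neighborhood(name: str) -> str:
    """Trim, collapse internal whitespace, and title-case a neighborhood name."""
    return " ".join(name.split()).title()


def count_crimes(rows):
    """Count crime rows per normalized neighborhood, skipping blank names."""
    return Counter(
        normalize_neighborhood(row["neighborhood"])
        for row in rows
        if row["neighborhood"].strip()
    )


def join_by_neighborhood(real_estate_rows, crime_counts):
    """Attach a crime count to each real estate row (0 when no crimes match)."""
    joined = []
    for row in real_estate_rows:
        hood = normalize_neighborhood(row["neighborhood"])
        joined.append(
            {
                "neighborhood": hood,
                "median_price": int(row["median_price"]),
                "crime_count": crime_counts.get(hood, 0),
            }
        )
    return joined


# Invented sample data; the real files and their columns may differ.
CRIME_CSV = """neighborhood,offense
 hazelwood ,Larceny
Hazelwood,Assault
PEARL,Vandalism
,Larceny
"""

REAL_ESTATE_CSV = """neighborhood,median_price
Hazelwood,300000
Pearl,550000
Sellwood,600000
"""

crime_rows = list(csv.DictReader(io.StringIO(CRIME_CSV)))
real_estate_rows = list(csv.DictReader(io.StringIO(REAL_ESTATE_CSV)))

joined = join_by_neighborhood(real_estate_rows, count_crimes(crime_rows))
for record in joined:
    print(record)
```

Normalizing the join key on both sides before matching is the point of the sketch: differences in case or stray whitespace between the two sources would otherwise silently drop rows from the join.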
- Install python3. The first article in the series linked above should get you started (he recommends pyenv). For example, if using pyenv, run "pyenv local 3.9.2" (if using python v 3.9.2).
- Install poetry; see the project homepage or this article.
- Build the project. If you prefer make, you can run "make deps". This will run "poetry install" and "poetry run nox --install-only". You can run "make help" to see more make targets. Alternatively, you can just run poetry's CLI; see the Makefile's make targets for inspiration.
To clean out your install, run "make clean".
This repo has some standard config files:
- The overall project is configured via a PEP518 pyproject.toml file. If you fork this repo, you should probably change it. It contains the black settings, the project dependencies, a pytest configuration, and a
- The .gitignore contains obvious gitignores.
- The noxfile.py contains nox targets for running safety and your tests. It uses the nox-poetry project to integrate nox with poetry.
- The .flake8 has a minimal flake8 configuration.
- The mypy.ini has a minimal mypy configuration.
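
For orientation, a noxfile.py along the lines described above might look roughly like the following. This is a sketch under assumptions, not a copy of this repo's file: the session names, the Python version, and the exact safety arguments are guesses, and the `safety check` subcommand may differ depending on the safety version installed.

```python
# Hypothetical noxfile.py sketch; the repo's real noxfile may differ.
from nox_poetry import session  # nox-poetry's drop-in replacement for nox.session


@session(python="3.9")
def tests(session):
    """Install the project and pytest at poetry's locked versions, then run tests."""
    session.install("pytest", ".")
    session.run("pytest")


@session(python="3.9")
def safety(session):
    """Scan the locked dependencies for known vulnerabilities."""
    # export_requirements() writes poetry's lock data to a requirements file.
    requirements = session.poetry.export_requirements()
    session.install("safety")
    session.run("safety", "check", "--full-report", f"--file={requirements}")
```

With a file like this in place, "poetry run nox -s tests" would run only the test session, and "poetry run nox --install-only" would create the virtualenvs without running anything.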