Entry point and documentation
Benvii committed Apr 29, 2018
1 parent 1029e46 commit ba1c44b
Showing 2 changed files with 62 additions and 6 deletions.
46 changes: 44 additions & 2 deletions README.md
@@ -1,6 +1,48 @@
# TODO
# pg-dump-filtered - Postgres partial dump

# Installation
This module performs a partial (filtered) dump of a set of tables, taking into account
all their foreign keys and therefore the tables they reference.
If you have a huge database and want to extract a small set of data (with its dependencies in other tables) to work on it and then re-import the data,
this script is made for you.

Filtering is done with a SQL WHERE clause on a SELECT statement in which all table relations are handled as INNER or LEFT JOINs (depending on the nullability of the foreign key).
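
As a rough illustration of the rule above (nullable foreign key gives a LEFT JOIN, non-nullable gives an INNER JOIN), here is a minimal, self-contained sketch. The `ForeignKey` type, the `build_join_clause` function and the table names are hypothetical and are not part of this module's API; the module's own logic may differ.

```python
from typing import List, NamedTuple


class ForeignKey(NamedTuple):
    """Hypothetical description of one foreign key (not the module's own type)."""
    table: str        # table holding the foreign key column
    column: str       # foreign key column
    ref_table: str    # referenced table
    ref_column: str   # referenced column
    nullable: bool    # True when the foreign key column accepts NULL


def build_join_clause(foreign_keys: List[ForeignKey]) -> str:
    """Build one JOIN per foreign key: LEFT when nullable, INNER otherwise."""
    joins = []
    for fk in foreign_keys:
        kind = "LEFT JOIN" if fk.nullable else "INNER JOIN"
        joins.append(
            f"{kind} {fk.ref_table} "
            f"ON {fk.table}.{fk.column} = {fk.ref_table}.{fk.ref_column}"
        )
    return "\n".join(joins)


# Assumed example schema: an order always has a customer, but a coupon is optional.
fks = [
    ForeignKey("orders", "customer_id", "customers", "id", nullable=False),
    ForeignKey("orders", "coupon_id", "coupons", "id", nullable=True),
]
print(build_join_clause(fks))
```

A LEFT JOIN on nullable keys keeps rows whose optional reference is NULL, which an INNER JOIN would silently drop from the dump.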

The generated dump is a set of COPY statements with raw values, so it can handle all types of values (dates, binary, PostGIS points ...).
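
To give an idea of what such a dump looks like, the sketch below wraps rows that are already encoded in PostgreSQL's COPY text format (tab-separated fields, `\N` for NULL) in a `COPY ... FROM stdin` block terminated by `\.`. The `format_copy_block` helper and the `events` table are hypothetical; how this module actually writes its output file is not shown here.

```python
from typing import Iterable, List


def format_copy_block(table: str, columns: List[str], raw_rows: Iterable[str]) -> str:
    """Wrap already-encoded COPY text rows in a COPY ... FROM stdin block."""
    lines = [f"COPY {table} ({', '.join(columns)}) FROM stdin;"]
    lines.extend(raw_rows)   # rows are passed through untouched
    lines.append("\\.")      # end-of-data marker of the COPY text format
    return "\n".join(lines) + "\n"


# Assumed sample rows: tab-separated fields, \N stands for NULL.
rows = ["1\t2018-04-29\t\\N", "2\t2018-04-30\thello"]
print(format_copy_block("events", ["id", "day", "note"], rows))
```

Because the rows are passed through as the server encoded them, no per-type conversion is needed on the Python side.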

## Installation
```bash
pip install -r requirements.txt
python setup.py install
```

## How to use it

### Command line interface

```bash
Usage:
pg-dump-filtered [options] <db-uri> <table-list>

Arguments:
db-uri URI of the Postgres database, for instance: postgresql://pg_dump_test:pg_dump_test@localhost:5432/pg_dump_test
table-list List of the tables that need to be exported, separated by commas (related tables will automatically be exported).
Eg: 'table1,table2,table3'

Options:
-h --help Show this screen.
--filters=<SQL> SQL filters. Eg: mytable.mycol = 'value' AND myothertable.toto LIKE 'titi'
--ignored-constraints=<str> List of constraints to be ignored. Eg : "myconstraint,myotherconstraint"
--output=<str> Dump file path. [default: dump.sql]
```

```bash
pg-dump-filtered "postgres://user:pwd@host/db" "tableA,tableB" --debug --filters="mytable.id=85 AND ....." --ignored-constraints="a_circular_constraint_name"
```
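
The two positional arguments can be broken down with the standard library alone. The sketch below is only an assumption about how a `db-uri` and a `table-list` might be parsed; `parse_db_uri` and `parse_table_list` are hypothetical helper names, not functions exported by this module. The dictionary keys were chosen to match the keyword arguments accepted by `psycopg2.connect`.

```python
from typing import Dict, List, Optional, Union
from urllib.parse import urlparse


def parse_db_uri(db_uri: str) -> Dict[str, Optional[Union[str, int]]]:
    """Split a postgresql:// URI into connection parameters."""
    parsed = urlparse(db_uri)
    return {
        "dbname": parsed.path.lstrip("/"),  # path is "/<database name>"
        "user": parsed.username,
        "password": parsed.password,
        "host": parsed.hostname,
        "port": parsed.port,                # None when the URI has no port
    }


def parse_table_list(table_list: str) -> List[str]:
    """Split 'table1,table2' into table names, ignoring blanks and spaces."""
    return [name.strip() for name in table_list.split(",") if name.strip()]


params = parse_db_uri("postgresql://pg_dump_test:pg_dump_test@localhost:5432/pg_dump_test")
tables = parse_table_list("table1, table2,table3")
print(params)
print(tables)
```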

### Python Interface

For [Open Path View](https://openpathview.fr) we needed to export a small set of data depending on its geolocation and to list some rows of the exported data (file UUIDs, as files were saved in the database).

```python
TODO
```
22 changes: 18 additions & 4 deletions pg_dump_filtered/pg_dump_filtered.py
@@ -19,7 +19,7 @@
import logging
import psycopg2
from urllib.parse import urlparse
from typing import List
from typing import List, Tuple

from pg_dump_filtered.helpers import SchemaUtils, RequestBuilder, DumpBuilder

@@ -145,11 +145,12 @@ def request_builder(self, request_builder):
"""
self._request_builder = request_builder

def dump(self, tables_to_export: List[str]):
def generate_tables_to_request_and_join(self, tables_to_export: List[str]) -> Tuple[List[str], str]:
"""
Dump some tables and all related datas. Dump to the output file directly.
Generate a tuple with tables to request and join statement.
:param table_to_export: List of tables names that needs to be exported and all their related tables.
:param tables_to_export: Table names to be exported.
:return: A tuple (tables_to_request, join_statement)
"""
tables_to_request = self.schema_utils.list_all_related_tables(table_names=tables_to_export)

@@ -161,6 +162,19 @@ def dump(self, tables_to_export: List[str]):
join_req = self.request_builder.generate_join_statments(table_names=tables_to_request, exclude_from_statment=[from_table_name])
self.logger.debug("Join request : %s", join_req)

return (tables_to_request, join_req)

def dump(self, tables_to_export: List[str]):
"""
Dump some tables and all related data. Dump to the output file directly.
:param tables_to_export: List of table names that need to be exported, along with all their related tables.
"""
self.logger.debug("Dump generation from %s", tables_to_export)
tables_to_request, join_req = self.generate_tables_to_request_and_join(tables_to_export=tables_to_export)

from_table_name = tables_to_export[0] # Table that will be used in the FROM clause

# generating select statements
selects = self.request_builder.generate_all_select_statements(
table_to_be_exported=tables_to_request,