
Pipeline integration testing #83

Open
wants to merge 30 commits into base: main

Conversation

zlochina
Collaborator

Created simple integration tests based on an example .osm file.

zlochina requested a review from F-I-D-O on September 20, 2024, 10:42
@zlochina
Collaborator Author

@F-I-D-O, I've got an issue with the integration tests which I'm not really sure how to solve. The integration tests work fine if the tables are clear (or at least do not influence the test data), but I'm afraid these tests shouldn't be run if the user has already uploaded some data (the tests could modify the original data).

@F-I-D-O
Member

F-I-D-O commented Sep 20, 2024

> @F-I-D-O, I've got an issue with the integration tests which I'm not really sure how to solve. The integration tests work fine if the tables are clear (or at least do not influence the test data), but I'm afraid these tests shouldn't be run if the user has already uploaded some data (the tests could modify the original data).

The test should be run in a test schema, just like the SQL unit tests. The schema should be initialized empty before any test.
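
A minimal sketch of what such a per-test schema reset could look like as a pytest fixture; the connection parameters, database name, and the test_env schema name are illustrative assumptions, not the project's actual configuration:

```python
# Illustrative sketch only: connection settings and schema name are assumptions.
import psycopg2
import pytest

TEST_SCHEMA = "test_env"

@pytest.fixture
def empty_test_schema():
    """Provide an empty test schema and drop it again after the test."""
    conn = psycopg2.connect(dbname="roadgraphtool_test")
    with conn, conn.cursor() as cur:
        cur.execute(f'DROP SCHEMA IF EXISTS "{TEST_SCHEMA}" CASCADE')
        cur.execute(f'CREATE SCHEMA "{TEST_SCHEMA}"')
    yield TEST_SCHEMA
    with conn, conn.cursor() as cur:
        cur.execute(f'DROP SCHEMA IF EXISTS "{TEST_SCHEMA}" CASCADE')
    conn.close()
```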

@F-I-D-O
Member

F-I-D-O commented Sep 20, 2024

Additionally, the integration tests should contain tests that exercise the whole pipeline at once.

@zlochina
Collaborator Author

> @F-I-D-O, I've got an issue with the integration tests which I'm not really sure how to solve. The integration tests work fine if the tables are clear (or at least do not influence the test data), but I'm afraid these tests shouldn't be run if the user has already uploaded some data (the tests could modify the original data).
>
> The test should be run in a test schema, just like the SQL unit tests. The schema should be initialized empty before any test.

UPD: resolved

@zlochina
Collaborator Author

> Additionally, the integration tests should contain tests that exercise the whole pipeline at once.

They already do: the test covers everything from setting up the tables to contraction and computing strongly connected components.

@zlochina
Collaborator Author

zlochina commented Oct 9, 2024

@F-I-D-O, I would like your review of the export assertion. I've done it in such a way that I computed the resulting files' MD5 hashes, and they are asserted against the newly generated files at each execution of the test.

  • Also, I should explain how the database is initialised. First, the tables are added to the default schema public (or not, depending on whether the table areas is already present); then the Postgres function test_env_constructor is called to copy the tables into the test_env schema. Only after this are the test OSM data uploaded into test_env and the integration testing begins. The reason I'm mentioning this is that I ran into a problem when cleaning up after the test: if the default schema had no tables, we need to delete the tables created during setup, but if the tables were already there, we must not. I think the solution is to initialise the database directly in the test_env schema, but for that we would need to rewrite scripts/install_sql.py so that it can initialise a chosen schema.
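
For reference, the hash-based assertion described above could look roughly like the sketch below; the file names and expected hashes are placeholders, not values from the actual test:

```python
# Illustrative sketch: file names and expected hashes are placeholders.
import hashlib
from pathlib import Path

EXPECTED_MD5 = {
    "nodes.csv": "d41d8cd98f00b204e9800998ecf8427e",
    "edges.csv": "d41d8cd98f00b204e9800998ecf8427e",
}

def assert_exported_files_match(export_dir: Path) -> None:
    """Compare the MD5 hash of each freshly exported file with the stored hash."""
    for name, expected in EXPECTED_MD5.items():
        digest = hashlib.md5((export_dir / name).read_bytes()).hexdigest()
        assert digest == expected, f"{name}: MD5 mismatch"
```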

@zlochina
Collaborator Author

zlochina commented Oct 9, 2024

@F-I-D-O, also, while researching the export flow I ran into something that looks suspicious to me. In order to export the data we need to use some kind of config, something like this:

---
area_dir: ../../../../../
area_id: 1
map:
  SRID: 4326
  SRID_plane: 32618
  path: dasdsada

In the integration test I'm creating a plain dictionary variable for the export, but it does not seem right to me.
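
For context, such an ad-hoc dictionary presumably mirrors the YAML structure above, along these lines (the values are taken from the snippet; the keys expected by the export code may differ):

```python
# Illustrative sketch mirroring the YAML snippet above.
export_config = {
    "area_dir": "../../../../../",
    "area_id": 1,
    "map": {
        "SRID": 4326,
        "SRID_plane": 32618,
        "path": "dasdsada",
    },
}
```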

@F-I-D-O
Member

F-I-D-O commented Oct 11, 2024

> @F-I-D-O, I would like your review of the export assertion. I've done it in such a way that I computed the resulting files' MD5 hashes, and they are asserted against the newly generated files at each execution of the test.
>
> • Also, I should explain how the database is initialised. First, the tables are added to the default schema public (or not, depending on whether the table areas is already present); then the Postgres function test_env_constructor is called to copy the tables into the test_env schema. Only after this are the test OSM data uploaded into test_env and the integration testing begins. The reason I'm mentioning this is that I ran into a problem when cleaning up after the test: if the default schema had no tables, we need to delete the tables created during setup, but if the tables were already there, we must not. I think the solution is to initialise the database directly in the test_env schema, but for that we would need to rewrite scripts/install_sql.py so that it can initialise a chosen schema.

Hashes are not a good solution:

  • for binary files, the format could change in the future, and then the tests would need to be updated;
  • for text files, it is useful to compare the file content and see the diff.

I suggest keeping the hashes for binary files for now and comparing text files as text.
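
A sketch of that split, with text exports compared line by line (so a failure shows a readable diff) and binary exports still checked via MD5; the helper names are made up for illustration:

```python
# Illustrative helpers; not part of the actual test code.
import difflib
import hashlib
from pathlib import Path

def assert_text_file_equal(actual: Path, reference: Path) -> None:
    """Compare text exports as text so a failing test shows the diff."""
    diff = list(difflib.unified_diff(
        reference.read_text().splitlines(keepends=True),
        actual.read_text().splitlines(keepends=True),
        fromfile=str(reference), tofile=str(actual),
    ))
    assert not diff, "".join(diff)

def assert_binary_file_equal(actual: Path, expected_md5: str) -> None:
    """Keep MD5 comparison for binary exports."""
    assert hashlib.md5(actual.read_bytes()).hexdigest() == expected_md5
```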

Concerning the second issue, I think it would be best to modify install_sql.py so that the schema can be configured. There is already an issue for that.
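
One possible shape of that change, sketched purely as an illustration (the real scripts/install_sql.py interface and internals may differ): accept the target schema as an option and set search_path before running the installation SQL.

```python
# Illustrative sketch only; not the actual scripts/install_sql.py.
import argparse
import psycopg2

def install_sql(schema: str = "public") -> None:
    conn = psycopg2.connect(dbname="roadgraphtool")
    with conn, conn.cursor() as cur:
        cur.execute(f'CREATE SCHEMA IF NOT EXISTS "{schema}"')
        cur.execute(f'SET search_path TO "{schema}"')
        # ... execute the SQL installation scripts here ...
    conn.close()

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--schema", default="public",
                        help="target schema for the installed tables")
    install_sql(parser.parse_args().schema)
```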

@F-I-D-O
Member

F-I-D-O commented Oct 11, 2024

> @F-I-D-O, also, while researching the export flow I ran into something that looks suspicious to me. In order to export the data we need to use some kind of config, something like this:
>
> ---
> area_dir: ../../../../../
> area_id: 1
> map:
>   SRID: 4326
>   SRID_plane: 32618
>   path: dasdsada
>
> In the integration test I'm creating a plain dictionary variable for the export, but it does not seem right to me.

I think we should check whether we need all of these config options, and if so, we should use a config file like this (I think we will set up some config file with Marek...).
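
If a shared config file is adopted, the test could load it instead of building the dictionary by hand; a minimal sketch, assuming PyYAML and a hypothetical config path:

```python
# Illustrative sketch: "config/export.yaml" is a hypothetical path.
import yaml

with open("config/export.yaml") as f:
    export_config = yaml.safe_load(f)

# export_config is then a nested dict, e.g. export_config["map"]["SRID"]
```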

@zlochina
Collaborator Author

As per the request from our last meeting, here is a summary of the updates made by this pull request:

  • Integration test. Added the file python/tests/integrations_test.py to contain all integration tests for the project. Currently there is one test, test_integration_base_flow, which tests the following flow:

    1. Setting up.
      1. If there are no tables in the public schema (checked only by the existence of the areas table), the pre_processing function is called, which is the main function imported from the scripts.install_sql module.
      2. Once the previous step has added the needed tables, the Postgres procedure test_env_constructor() is called (defined in the SQL/testing_extension.sql file). This procedure copies every table and sequence from public to test_env.
      3. OSM data is imported into TEST_SCHEMA, which is test_env.
      4. The area corresponding to the OSM file is inserted.
      5. OSM data are imported into a Python dictionary.
    2. Assertions.
      1. Assertions of correct import of OSM data.
        1. Assertion of nodes, comparing local nodes to nodes from TEST_SCHEMA.
        2. Assertion of ways, ...
        3. Assertion of nodes_ways, ...
        4. Assertion of area, ...
      2. Test of contract_graph_in_area
        1. Assertion of contracted nodes against hardcoded expected result
        2. Assertion of edges against hardcoded expected result
      3. Test of compute_strong_components
        1. Assertion of component data against hardcoded expected result
      4. Test of export to files.
        1. All exported files are checked by MD5-hashing each of them and comparing against the expected hashes.
    3. Cleaning up.
      1. The fixture cleanup_for_base_flow() is called when the test function has finished execution. It:
        1. removes all files created by the export procedure,
        2. calls the Postgres procedure test_env_destructor() to remove the test_env schema with all testing data.
    4. Space for improvement:
      • Update the check of table existence: either check for every needed table, or come up with another way of setting up the tables in the test_env schema.
      • Both test_env_constructor and test_env_destructor could accept an optional argument with the name of the testing schema. We could pass the Python module variable TEST_SCHEMA.
      • The current implementation of the exported-file assertions is rather arguable. As discussed in this PR, it would be better to compare the content of non-binary files rather than their hashes.
  • Other files. Besides formatting the Python files, the following non-trivial changes were made:

    1. roadgraphtool/db.py.
      • Function execute_sql_and_fetch_all_rows() now works with a dynamic schema (see the sketch after this list).
    2. Python wrappers of Postgres procedures were moved from scripts/main.py to roadgraphtool/db_operations.py.
    3. roadgraphtool/insert_area.py.
      • Function insert_area() now works with a dynamic schema.
    4. roadgraphtool/map.py.
      • Both functions _get_map_from_db() and get_map() now work with a dynamic schema.
    5. scripts/process_osm.py.
      • Resolved a bug in setup_ssh_tunnel() described in the issue.
    6. SQL/main.sql.
      • Attributes public.nodes.id, public.ways.id, public.ways."from", and public.ways."to" had their data type changed from bigint to integer.
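
On the "dynamic schema" changes above: a common way to parameterise the schema safely in psycopg2 is to compose the identifier with psycopg2.sql instead of plain string formatting. The sketch below is illustrative only; the helper name and calling convention are assumptions, not the actual roadgraphtool/db.py code:

```python
# Illustrative sketch; not the actual roadgraphtool/db.py implementation.
from psycopg2 import sql

def fetch_all_rows_from_schema(conn, query_template: str, schema: str = "public"):
    """Run a query against the given schema and return all rows.

    query_template contains a {schema} placeholder, e.g.
    "SELECT id FROM {schema}.nodes"; the schema name is inserted as a
    properly quoted identifier rather than via plain string formatting.
    """
    query = sql.SQL(query_template).format(schema=sql.Identifier(schema))
    with conn.cursor() as cur:
        cur.execute(query)
        return cur.fetchall()

# Usage (illustrative):
# rows = fetch_all_rows_from_schema(conn, "SELECT id FROM {schema}.nodes", schema="test_env")
```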
