[DAR-4036][External] E2E tests for push and some coverage for import #928
Conversation
e2e_tests/cli/test_import.py
for annotation in actual_annotations
if annotation.data == expected_annotation_data
and annotation.annotation_class.annotation_type
== expected_annotation_type
This is necessary because the data field of both the tag and mask types is just {}, so we need to check that annotation_type matches too.
# Prefix the command with 'poetry run' to ensure it runs in the Poetry shell
command = f"poetry run {command}"
This makes running E2Es locally less error prone by forcing them to run in the poetry shell that's guaranteed to point to the correct darwin-py installation.
btw, python also has sys.executable 😄
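For illustration, the poetry run prefix could be swapped for the interpreter that is running the tests. A minimal sketch, assuming the CLI can be invoked as the darwin.cli module; the helper shape below is hypothetical, not the PR's actual change:

import subprocess
import sys

def run_cli_command(command: str) -> "subprocess.CompletedProcess[str]":
    # Run the darwin CLI through the same interpreter that is executing the
    # tests, so the correct darwin-py installation is used without relying on
    # an active poetry shell. (Assumes darwin.cli is runnable with -m.)
    assert command.startswith("darwin ")
    args = [sys.executable, "-m", "darwin.cli", *command.split()[1:]]
    return subprocess.run(args, capture_output=True, text=True)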
I didn't know this, it's much nicer. thank you!
Really excited for these e2e!
e2e_tests/cli/test_convert.py
@@ -0,0 +1,33 @@
# from pathlib import Path
Is this supposed to be commented out?
I see there are TODOs in other files too; let's make sure to clean them up before merge.
Removed all the TODOs. They are for future E2E PRs, so they don't have to be present here.
e2e_tests/cli/test_import.py
    tmp_dir: Path, import_dir: Path, appending: bool = False
):
    """
    Validate that the annotations downloaded from a release match the annotations in
    a particular directory, ignoring hidden files.

    If `appending` is set, then the number of actual annotations should exceed the
    number of expected annotations
    """
    annotations_dir = tmp_dir / "annotations"
    with zipfile.ZipFile(tmp_dir / "dataset.zip") as z:
        z.extractall(annotations_dir)
    expected_annotation_files = {
        file.name: str(file)
        for file in import_dir.iterdir()
        if file.is_file() and not file.name.startswith(".")
    }
    actual_annotation_files = {
        file.name: str(file)
        for file in annotations_dir.iterdir()
        if file.is_file() and not file.name.startswith(".")
    }
If I read validate_downloaded_annotations, what can I expect tmp_dir to be? Is tmp_dir the expected or the actual result? This is of course more visible to me, not knowing the internals, whilst I understand it might be obvious to you.
I'm thinking that we should either stick to export/import naming or actual/expected. It looks like we're using both naming conventions within the same function and IMO this might be confusing.
e.g.
def validate_downloaded_annotations(
    export_dir: Path, import_dir: Path, appending: bool = False
):
or
def validate_downloaded_annotations(
    actual_annotations_dir: Path, expected_annotations_dir: Path, appending: bool = False
):
We could also think about something like compare_annotations_export(export_dir, import_dir, ...).
I'm nitpicking over naming, just because it makes it easier for me to read the code, but up to you to evaluate these comments.
This is all sensible feedback; there's no reason not to improve the naming convention for others who will work on these tests in the future.
e2e_tests/cli/test_import.py
actual_annotation_files = {
    file.name: str(file)
    for file in annotations_dir.iterdir()
    if file.is_file() and not file.name.startswith(".")
}
This doesn't have to be in the with block, right?
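For example, the listing could move outside the context manager. A small sketch based on the snippet quoted earlier (variable names come from that snippet); behaviour is unchanged:

# The archive only needs to be open while extracting; the directory listing
# can run after the with block has closed.
with zipfile.ZipFile(tmp_dir / "dataset.zip") as z:
    z.extractall(annotations_dir)

actual_annotation_files = {
    file.name: str(file)
    for file in annotations_dir.iterdir()
    if file.is_file() and not file.name.startswith(".")
}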
e2e_tests/cli/test_import.py
# Delete generated UUIDs as these will break asserting equality
for annotation in expected_annotations:
    del [annotation.id]  # type: ignore
    if annotation.annotation_class.annotation_type == "raster_layer":
        del [annotation.data["mask_annotation_ids_mapping"]]  # type: ignore
for annotation in actual_annotations:
    del [annotation.id]  # type: ignore
    if annotation.annotation_class.annotation_type == "raster_layer":
        del [annotation.data["mask_annotation_ids_mapping"]]  # type: ignore
My rule of thumb: if I need to add a comment to explain a section, then that section is a method, e.g. delete_annotations_uuids.
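A rough sketch of that extraction; the helper name follows the suggestion above and the body reuses the loop from the quoted diff, so this is illustrative rather than the PR's actual change:

def delete_annotations_uuids(annotations) -> None:
    # Strip generated UUIDs (and the raster-layer ID mapping) so that equality
    # assertions are not broken by values created at import time.
    for annotation in annotations:
        del [annotation.id]  # type: ignore
        if annotation.annotation_class.annotation_type == "raster_layer":
            del [annotation.data["mask_annotation_ids_mapping"]]  # type: ignore

delete_annotations_uuids(expected_annotations)
delete_annotations_uuids(actual_annotations)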
e2e_tests/cli/test_import.py
expected_annotation_data = expected_annotation.data
expected_annotation_type = (
    expected_annotation.annotation_class.annotation_type
)
actual_annotation = next(
    (
        annotation
        for annotation in actual_annotations
        if annotation.data == expected_annotation_data
        and annotation.annotation_class.annotation_type
        == expected_annotation_type
    ),
    None,
)
assert (
    actual_annotation is not None
), "Annotation not found in actual annotations"

# Properties must be compared separately because the order of properties
# is a list with variable order. Differences in order will cause assertion failure
if expected_annotation.properties:
    assert actual_annotation.properties is not None
    expected_properties = expected_annotation.properties
    del expected_annotation.properties
    actual_properties = actual_annotation.properties
    del actual_annotation.properties
    for expected_property in expected_properties:
        assert expected_property in actual_properties
assert expected_annotation == actual_annotation
I'd split this into assert_same_annotation_data and assert_same_annotations_properties. Ideally this function then reads more smoothly, there's no need for comments, and I can check the sub-functions separately without keeping all the context in mind.
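One possible shape for that split, as a sketch only: the signatures and the returned value are assumptions layered on the quoted code, and the names follow the suggestion above.

def assert_same_annotation_data(expected_annotation, actual_annotations):
    # Find the actual annotation whose data and type both match the expected one
    # (tag and mask annotations share empty data, so the type check is required).
    actual_annotation = next(
        (
            annotation
            for annotation in actual_annotations
            if annotation.data == expected_annotation.data
            and annotation.annotation_class.annotation_type
            == expected_annotation.annotation_class.annotation_type
        ),
        None,
    )
    assert actual_annotation is not None, "Annotation not found in actual annotations"
    return actual_annotation


def assert_same_annotations_properties(expected_annotation, actual_annotation):
    # Properties are compared order-independently, since their order can vary.
    if expected_annotation.properties:
        assert actual_annotation.properties is not None
        for expected_property in expected_annotation.properties:
            assert expected_property in actual_annotation.properties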
e2e_tests/cli/test_import.py
    local_dataset: E2EDataset, config_values: ConfigValues
) -> None:
    """
    Test importing a set of basic annotations (no sub-types or properties) to a set of pre-registered files in a dataset.
Regarding "basic annotations": I think this is an arbitrary concept. I see what you are trying to do and I don't have a better way to define it anyway, so I think the doc is very useful.
Reading below, I'm wondering if this should just be test_import_annotations_without_subtypes_to_images.
e2e_tests/cli/test_import.py
with tempfile.TemporaryDirectory() as tmp_dir_str:
    tmp_dir = Path(tmp_dir_str)
    export_and_download_annotations(tmp_dir, local_dataset, config_values)
    validate_downloaded_annotations(tmp_dir, import_dir)
nice and clean 💯
e2e_tests/cli/test_import.py
    Test that appending annotations to an item with already existing annotations does not overwrite the original annotations
    """
    local_dataset.register_read_only_items(config_values)
    import_dir = (
        Path(__file__).parents[1] / "data" / "import" / "image_basic_annotations"
    )
    # 1st import to create annotations
    result = run_cli_command(
        f"darwin dataset import {local_dataset.name} darwin {import_dir}"
    )
    assert_cli(result, 0)
    # 2nd import to append more annotations
    result = run_cli_command(
        f"darwin dataset import {local_dataset.name} darwin {import_dir} --append"
    )
    assert_cli(result, 0)
    with tempfile.TemporaryDirectory() as tmp_dir_str:
        tmp_dir = Path(tmp_dir_str)
        export_and_download_annotations(tmp_dir, local_dataset, config_values)
        validate_downloaded_annotations(tmp_dir, import_dir, appending=True)
I see. Or: we could import image_basic_annotations in two steps (half and half), though I don't think this is super easy since it means truncating the file in the middle. This would allow us to validate without appending=True, because we'd know exactly that we expect the full annotations from image_basic_annotations in the export. Or: we could split image_basic_annotations into two smaller files to begin with.
The only reason I'm thinking this is that validate_downloaded_annotations only checks that the export is bigger than the import, but it's not a super-strict check on the fact that (in this case) they have to be exactly double. Am I missing something?
This makes sense. What I can do is create a new import_dir specifically for this test containing 2 files (half & half), then I can remove the appending concept from validate_downloaded_annotations.
@@ -0,0 +1,112 @@
from pathlib import Path
beautiful tests
e2e_tests/cli/test_import.py
/ "image_annotations_split_in_two_files" | ||
) | ||
result = run_cli_command( | ||
f"darwin dataset import {local_dataset.name} darwin {expected_annotations_dir}" |
shouldn't this include --append?
e2e_tests/cli/test_import.py
f"darwin dataset import {local_dataset.name} darwin {expected_annotations_dir}" | ||
) | ||
assert_cli(result, 0) | ||
assert_cli(result, 0) |
double assert?
Problem

darwin-py's E2E tests are sparse.

Solution

Add an e2e_tests/cli/test_push.py file with the following tests:

- test_push_mixed_filetypes - Test pushing a directory of files containing various filetypes and verify they finish processing
- test_push_nested_directory_of_images - Test pushing a nested directory structure of some images with the preserve_folders flag. Verify they finish processing and end up in the correct remote paths
- test_push_videos_with_non_native_fps - Test that if FPS is set, the value is respected in the resulting video items

These tests will wait a maximum of 10 minutes for all items to finish processing. If this timeout is exceeded, the test will fail; a sketch of this waiting behaviour follows.
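A minimal sketch of that wait-for-processing behaviour, assuming a polling approach; the helper name and the check callback are hypothetical, not the actual fixture code:

import time

def wait_until_processed(all_items_processed, timeout_seconds: int = 600, poll_seconds: int = 10) -> None:
    # Poll until every pushed item has finished processing; fail the test run
    # if the 10-minute budget described above is exhausted.
    deadline = time.time() + timeout_seconds
    while time.time() < deadline:
        if all_items_processed():
            return
        time.sleep(poll_seconds)
    raise TimeoutError("Items did not finish processing within the timeout")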
Add an e2e_tests/cli/test_import.py file with the following tests:

- test_import_basic_annotations_to_images - Test importing a set of basic image annotations (no sub-types or properties) to a set of pre-registered files in a dataset
- test_import_annotations_with_subtypes_to_images - Test importing a set of image annotations including subtypes & properties to a set of pre-registered files in a dataset
- test_annotation_classes_are_created_on_import - Test that importing non-existent annotation classes creates those classes in the target Darwin team
- test_annotation_classes_are_created_with_properties_on_import - Test that importing non-existent annotation classes with properties creates those classes and properties in the target Darwin team
- test_appending_annotations - Test that appending annotations to an item with already existing annotations does not overwrite the original annotations
- test_overwriting_annotations - Test that the --overwrite flag allows bypassing of the overwrite warning when importing to items with already existing annotations
- test_annotation_overwrite_warning - Test that importing annotations to an item with already existing annotations throws a warning if not using the --append or --overwrite flags

Changelog

Expanded darwin-py E2E tests