feat: run / tiltseries s3 data validation (zarr, mrc, mdoc, and metadata checks) #223
for ex in exclude:
    tentatives = [tent for tent in tentatives if ex not in tent]
This check belongs in the fixture that generates the run_names, not here. The fixture already does it.
Fixed in a later PR, see #250.
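For reference, a minimal sketch of what doing the exclusion once, in the code that produces the run names, could look like. The helper name `filter_run_names` and the sample run names are hypothetical and not taken from this PR; the filter uses the same substring test as the snippet above.

```python
from typing import Iterable, List


def filter_run_names(candidates: Iterable[str], exclude: Iterable[str]) -> List[str]:
    """Drop every candidate that contains any of the excluded substrings."""
    exclude = list(exclude)
    return [name for name in candidates if not any(ex in name for ex in exclude)]


# Hypothetical run names, for illustration only.
run_names = filter_run_names(["run_001", "run_002_bad", "run_003"], exclude=["bad"])
print(run_names)
```

If the fixture returns the already-filtered list, individual tests no longer need their own `exclude` loop.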
zarrays = header_data["zarrays"]
for i, zarray in zarrays.items():
    header = self.mrc_headers[mrc_file].header
nit: Could this header be outside the for loop?
fixed!
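A minimal sketch of the hoisted lookup, assuming the header does not depend on the loop variable. The function name, the per-level shape comparison, and the stand-in objects are hypothetical; the PR's actual loop body is not shown in this thread.

```python
from types import SimpleNamespace
from typing import Dict, List


def levels_matching_mrc(mrc_headers: Dict, mrc_file: str, header_data: Dict) -> List[str]:
    # Loop-invariant lookup, done once instead of on every iteration.
    header = mrc_headers[mrc_file].header
    mrc_shape = [header.nz, header.ny, header.nx]

    matching = []
    for level, zarray in header_data["zarrays"].items():
        # Hypothetical per-level check, standing in for the real loop body.
        if zarray["shape"] == mrc_shape:
            matching.append(level)
    return matching


# Stand-in objects for illustration only.
fake_headers = {"ts.mrc": SimpleNamespace(header=SimpleNamespace(nx=4, ny=4, nz=2))}
fake_zarr = {"zarrays": {"0": {"shape": [2, 4, 4]}, "1": {"shape": [2, 2, 2]}}}
print(levels_matching_mrc(fake_headers, "ts.mrc", fake_zarr))
```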
### BEGIN MDOC consistency tests ###
def test_tiltseries_pixel_spacing_mdoc(self, tiltseries_metadata: Dict, tiltseries_mdoc: pd.DataFrame):
    """Check that the tiltseries pixel spacing matches the MDOC data."""
    assert len(set(tiltseries_mdoc["PixelSpacing"])) == 1
    assert tiltseries_metadata["pixel_spacing"] == tiltseries_mdoc["PixelSpacing"][0]

def test_tiltseries_image_size_mdoc(self, tiltseries_metadata: Dict, tiltseries_mdoc: pd.DataFrame):
    """Check that the tiltseries image size matches the MDOC data."""
    assert len(set(tiltseries_mdoc["ImageSize"])) == 1
    assert tiltseries_metadata["size"]["x"] == tiltseries_mdoc["ImageSize"][0][0]
    assert tiltseries_metadata["size"]["y"] == tiltseries_mdoc["ImageSize"][0][1]

### END MDOC consistency tests ###
Although the mdoc files are currently saved alongside the tiltseries, they describe the frames. Because processing is needed to go from frames to tiltseries, the image size and pixel spacing are very unlikely to be the same. These two tests are therefore likely to fail, and that is the expected behaviour.
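To illustrate the point, here is a small sketch that mirrors the two assertions on made-up values. It uses a plain list of dicts rather than the `pd.DataFrame` fixture, and both the helper name `mdoc_matches_tiltseries` and the numbers (frames at 4096 x 4096, tiltseries binned by 2) are assumptions chosen only for the example.

```python
from typing import Dict, List


def mdoc_matches_tiltseries(metadata: Dict, mdoc_rows: List[Dict]) -> bool:
    """Mirror the two tests: all MDOC sections agree, and equal the tiltseries metadata."""
    spacings = {row["PixelSpacing"] for row in mdoc_rows}
    sizes = {tuple(row["ImageSize"]) for row in mdoc_rows}
    if len(spacings) != 1 or len(sizes) != 1:
        return False
    (spacing,) = spacings
    (size,) = sizes
    return (
        metadata["pixel_spacing"] == spacing
        and (metadata["size"]["x"], metadata["size"]["y"]) == size
    )


# Made-up values: the MDOC describes the frames, the metadata describes the binned tiltseries.
frames_mdoc = [{"PixelSpacing": 1.0, "ImageSize": (4096, 4096)}] * 3
tiltseries = {"pixel_spacing": 2.0, "size": {"x": 2048, "y": 2048}}
print(mdoc_matches_tiltseries(tiltseries, frames_mdoc))
```

With any rescaling between frames and tiltseries, the strict equality check returns `False` even though the data is valid.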
Merge after #207.
See the tiltseries & runs sections of https://docs.google.com/document/d/1yMKM0DW9KRhlcYiBGPcR7oW0liGtUew6NAmBhMg5U3w/edit