Add support for Imaging Data Commons #1450
Merged
psavery force-pushed the dicomweb-idc-support branch 4 times, most recently from 683443b to 218eed5 (February 1, 2024 19:28)
psavery force-pushed the dicomweb-idc-support branch from 218eed5 to f0e3617 (February 6, 2024 18:02)
Even though we still need imi-bigpicture/wsidicom#149 to support IDC, this is ready for review anyway, because it doesn't make any breaking changes.
imi-bigpicture/wsidicom#149 was merged, so this is definitely ready!
psavery force-pushed the dicomweb-idc-support branch 2 times, most recently from 8f650e5 to 44b4db6 (February 13, 2024 16:08)
Google Healthcare API, used by Imaging Data Commons, does not allow filtering by SOPClassUID. So we cannot use that in the search filter. We should think of alternatives so that we can include only WSI results.

These changes were needed alongside [this wsidicom PR](imi-bigpicture/wsidicom#149) in order to view an example dataset. The following were used for testing:

url = 'https://proxy.imaging.datacommons.cancer.gov/current/viewer-only-no-downloads-see-tinyurl-dot-com-slash-3j3d9jyp/dicomWeb'
study_uid = '2.25.25644321580420796312527343668921514374'
series_uid = '1.3.6.1.4.1.5962.99.1.3205815762.381594633.1639588388306.2.0'

Signed-off-by: Patrick Avery <[email protected]>
These work well for importing examples from the Imaging Data Commons. It takes a while to import, so a limit of 10 series seems reasonable for now. Also, it's good to have a default filter of "Modality": "SM", because otherwise, we would mostly receive non-WSI series.

Signed-off-by: Patrick Avery <[email protected]>
We were previously just performing a `search_for_series()` and applying the limit and filters to that search. However, it is probably more intuitive for users to be searching for studies, rather than series. So we are now performing a `search_for_studies()` first, applying the limit and filters to this, and then locating all series within those studies, and proceeding from there.

Signed-off-by: Patrick Avery <[email protected]>
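The study-first search described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `find_series_by_study` and `FakeClient` are hypothetical names, and the client is only assumed to expose `search_for_studies()` and `search_for_series()` methods that return DICOM JSON datasets. The stub ignores the search filters, which a real server would apply.

```python
from typing import Any, Dict, List, Optional, Tuple

STUDY_UID_TAG = "0020000D"   # StudyInstanceUID
SERIES_UID_TAG = "0020000E"  # SeriesInstanceUID


def _first_value(dataset: Dict[str, Any], tag: str) -> Optional[str]:
    """Return the first value of a DICOM JSON attribute, or None if absent."""
    values = dataset.get(tag, {}).get("Value") or []
    return values[0] if values else None


def find_series_by_study(
    client: Any,
    limit: Optional[int] = None,
    search_filters: Optional[Dict[str, str]] = None,
) -> List[Tuple[str, str]]:
    """Apply the limit and filters to a study search, then list all series
    in each matching study. Returns (study_uid, series_uid) pairs."""
    studies = client.search_for_studies(limit=limit, search_filters=search_filters)
    pairs = []
    for study in studies:
        study_uid = _first_value(study, STUDY_UID_TAG)
        if study_uid is None:
            continue
        # The limit is deliberately not applied here: every series in a
        # matching study is collected.
        for series in client.search_for_series(study_instance_uid=study_uid):
            series_uid = _first_value(series, SERIES_UID_TAG)
            if series_uid is not None:
                pairs.append((study_uid, series_uid))
    return pairs


class FakeClient:
    """Hypothetical in-memory stand-in for a DICOMweb client (illustration only)."""

    def __init__(self, series_by_study: Dict[str, List[str]]):
        self._series_by_study = series_by_study

    def search_for_studies(self, limit=None, search_filters=None):
        # A real server would apply search_filters; this stub only honors the limit.
        uids = list(self._series_by_study)[:limit]
        return [{STUDY_UID_TAG: {"vr": "UI", "Value": [uid]}} for uid in uids]

    def search_for_series(self, study_instance_uid):
        uids = self._series_by_study[study_instance_uid]
        return [{SERIES_UID_TAG: {"vr": "UI", "Value": [uid]}} for uid in uids]
```

With a limit of 1, only the first study is searched, but all of its series are still returned, which is the behavior change this commit describes.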
We are now searching by studies, not series. The tests need to be fixed to take this into account.

Signed-off-by: Patrick Avery <[email protected]>
Signed-off-by: Patrick Avery <[email protected]>
Importing the first 10 studies on IDC has been taking around 20 minutes. About 65% of that time has been spent on inferring file sizes. Even though we don't stream the file data, the request from which we get the content length must be taking some time on the server. Skip doing this for now. We can add it back in if we figure out a way to make it much faster.

Signed-off-by: Patrick Avery <[email protected]>
psavery force-pushed the dicomweb-idc-support branch from 44b4db6 to e471abe (February 13, 2024 17:48)
manthey approved these changes on Feb 15, 2024
The NCI's Imaging Data Commons is a big repository (>38k studies) for cancer research. For the DICOMweb server, it uses Google's Cloud Healthcare API behind a proxy. This DICOMweb server sometimes behaves differently than the dcm4chee server we have been testing with.
This PR fixes a couple of issues we encountered. One is that we cannot use `SOPClassUID` as a search filter (even though the DICOMweb standard indicates that it should be supported when searching for instances). We can perform manual filtering instead, however.
This PR also adds a default limit to the import page (which is required for importing from IDC), and a default search filter that specifies `SM`. Without the search filter, we end up importing a lot of non-WSI datasets. With the `SM` search filter, most of the imported datasets look correct, and all are viewable.
Also needed to support the IDC: imi-bigpicture/wsidicom#149
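For illustration, manual filtering on `SOPClassUID` could look something like the sketch below. It is not the code from this PR: `filter_wsi_instances` is a hypothetical helper, and it assumes the instance search results are DICOM JSON datasets that include the `SOPClassUID` attribute (0008,0016). The UID used is the standard one for VL Whole Slide Microscopy Image Storage.

```python
from typing import Any, Dict, Iterable, List

SOP_CLASS_UID_TAG = "00080016"  # SOPClassUID
# VL Whole Slide Microscopy Image Storage
WSI_SOP_CLASS_UID = "1.2.840.10008.5.1.4.1.1.77.1.6"


def filter_wsi_instances(instances: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only instances whose SOPClassUID marks them as whole slide images.

    The filter runs on the client side, after the search, because the
    server is assumed to reject SOPClassUID as a search filter.
    """
    kept = []
    for instance in instances:
        values = instance.get(SOP_CLASS_UID_TAG, {}).get("Value") or []
        if WSI_SOP_CLASS_UID in values:
            kept.append(instance)
    return kept
```

Instances that omit the attribute are dropped rather than kept, which errs on the side of not importing non-WSI data.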