Add support for Imaging Data Commons #1450

Google Healthcare API, used by Imaging Data Commons, does not allow filtering by SOPClassUID. So we cannot use that in the search filter. We should think of alternatives so that we can include only WSI results.

These changes were needed alongside [this wsidicom PR](imi-bigpicture/wsidicom#149) in order to view an example dataset. The following were used for testing:

    url = 'https://proxy.imaging.datacommons.cancer.gov/current/viewer-only-no-downloads-see-tinyurl-dot-com-slash-3j3d9jyp/dicomWeb'
    study_uid = '2.25.25644321580420796312527343668921514374'
    series_uid = '1.3.6.1.4.1.5962.99.1.3205815762.381594633.1639588388306.2.0'

Signed-off-by: Patrick Avery <[email protected]>
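One possible alternative to server-side SOPClassUID filtering is to filter the returned records on the client. The sketch below is illustrative only and is not taken from this PR: it assumes the server returns DICOM JSON records that include the SOPClassUID attribute (tag `00080016`), and the helper names `is_wsi_instance` and `filter_wsi_instances` are hypothetical.

```python
# SOP Class UID for "VL Whole Slide Microscopy Image Storage" (DICOM standard).
WSI_SOP_CLASS_UID = "1.2.840.10008.5.1.4.1.1.77.1.6"
# DICOM JSON key for SOPClassUID, tag (0008,0016).
SOP_CLASS_UID_TAG = "00080016"


def is_wsi_instance(instance):
    """Return True if a DICOM JSON record declares the WSI SOP class.

    Hypothetical helper: assumes the record includes SOPClassUID,
    which a given server may or may not return.
    """
    values = instance.get(SOP_CLASS_UID_TAG, {}).get("Value", [])
    return WSI_SOP_CLASS_UID in values


def filter_wsi_instances(instances):
    """Keep only the WSI records from a list of DICOM JSON records."""
    return [inst for inst in instances if is_wsi_instance(inst)]
```

Records without a SOPClassUID are dropped rather than kept, which errs on the side of excluding non-WSI data.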

These defaults work well for importing examples from the Imaging Data Commons. Importing takes a while, so a limit of 10 series seems reasonable for now. A default filter of "Modality": "SM" is also useful, because otherwise we would mostly receive non-WSI series.

Signed-off-by: Patrick Avery <[email protected]>
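A minimal sketch of how such defaults might be applied when building a search request. The function name `build_search_params` is hypothetical, and merging caller-supplied filters over the defaults is an assumption; the commit does not say whether user filters replace or extend the default.

```python
# Defaults described in the commit message above.
DEFAULT_LIMIT = 10
DEFAULT_FILTERS = {"Modality": "SM"}


def build_search_params(limit=None, filters=None):
    """Return search parameters, falling back to the defaults.

    Assumption: caller-supplied filters are merged over the defaults,
    so a caller can override "Modality" or add further keys.
    """
    merged = dict(DEFAULT_FILTERS)  # copy so the defaults are never mutated
    merged.update(filters or {})
    return {
        "limit": DEFAULT_LIMIT if limit is None else limit,
        "search_filters": merged,
    }
```

Calling `build_search_params()` with no arguments yields the limit of 10 and the `"Modality": "SM"` filter.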

Previously, we performed only a `search_for_series()` and applied the limit and filters to that search. However, it is probably more intuitive for users to search for studies rather than series. We now perform a `search_for_studies()` first, apply the limit and filters to it, then locate all series within those studies and proceed from there.

Signed-off-by: Patrick Avery <[email protected]>
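The study-first flow described above can be sketched roughly as follows. This is not the PR's actual code: `find_series_by_study` is a hypothetical name, and `client` is assumed to expose `search_for_studies()` and `search_for_series()` in the style of the `dicomweb-client` package's `DICOMwebClient`, returning DICOM JSON records.

```python
# DICOM JSON key for StudyInstanceUID, tag (0020,000D).
STUDY_INSTANCE_UID_TAG = "0020000D"


def find_series_by_study(client, limit=None, search_filters=None):
    """Search for studies first, then collect every series within them.

    The limit and filters apply to the study search only; each matching
    study then contributes all of its series.
    """
    studies = client.search_for_studies(
        limit=limit, search_filters=search_filters
    )
    series = []
    for study in studies:
        study_uid = study[STUDY_INSTANCE_UID_TAG]["Value"][0]
        series.extend(client.search_for_series(study_uid))
    return series
```

With this arrangement, a limit of 10 means 10 studies, so the number of series returned can be larger than the limit.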

We are now searching by studies, not series. The tests need to be fixed to take this into account.

Signed-off-by: Patrick Avery <[email protected]>

Signed-off-by: Patrick Avery <[email protected]>

Importing the first 10 studies on IDC has been taking around 20 minutes. About 65% of that time (roughly 13 minutes) has been spent on inferring file sizes. Even though we don't stream the file data, the request from which we get the content length must be taking some time on the server. Skip inferring file sizes for now. We can add it back in if we figure out a way to make it much faster.

Signed-off-by: Patrick Avery <[email protected]>

Commits on Feb 13, 2024