Add support for Imaging Data Commons #1450

Merged: 6 commits, Feb 15, 2024

Commits on Feb 13, 2024

  1. Do not filter by SOPClassUID

    The Google Healthcare API, which the Imaging Data Commons uses, does not
    allow filtering by SOPClassUID, so we cannot use it in the search filter.
    We should think of alternatives so that we can include only WSI results.
    
    These changes were needed alongside [this wsidicom PR](imi-bigpicture/wsidicom#149)
    in order to view an example dataset.
    
    The following were used for testing:
    
    url = 'https://proxy.imaging.datacommons.cancer.gov/current/viewer-only-no-downloads-see-tinyurl-dot-com-slash-3j3d9jyp/dicomWeb'
    study_uid = '2.25.25644321580420796312527343668921514374'
    series_uid = '1.3.6.1.4.1.5962.99.1.3205815762.381594633.1639588388306.2.0'
    
    Signed-off-by: Patrick Avery <[email protected]>
    psavery committed Feb 13, 2024
    63b7011
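
One possible alternative to server-side SOPClassUID filtering is to drop that key from the search filter and check the SOP class locally. The sketch below is an assumption, not code from this PR: `drop_unsupported_filters` and `is_wsi_instance` are hypothetical helpers, and the metadata is assumed to be an instance-level record in the DICOM JSON model that actually includes the SOPClassUID attribute (0008,0016).

```python
# Hypothetical helpers for illustration; not part of this PR or any library.

# SOP Class UID for VL Whole Slide Microscopy Image Storage
WSI_SOP_CLASS_UID = '1.2.840.10008.5.1.4.1.1.77.1.6'
# DICOM JSON key for the SOPClassUID attribute (0008,0016)
SOP_CLASS_UID_TAG = '00080016'


def drop_unsupported_filters(search_filters,
                             unsupported=('SOPClassUID', SOP_CLASS_UID_TAG)):
    """Return a copy of the filters without keys the server rejects."""
    return {k: v for k, v in search_filters.items() if k not in unsupported}


def is_wsi_instance(metadata):
    """Check the SOPClassUID of one DICOM JSON metadata record locally."""
    values = metadata.get(SOP_CLASS_UID_TAG, {}).get('Value', [])
    return WSI_SOP_CLASS_UID in values
```

With these helpers, a filter such as `{'SOPClassUID': ..., 'Modality': 'SM'}` would be reduced to `{'Modality': 'SM'}` before the search, and the WSI check would run on whatever metadata comes back.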
  2. Change defaults for importing

    These work well for importing examples from the Imaging Data Commons.
    
    It takes a while to import, so a limit of 10 series seems reasonable
    for now.
    
    A default filter of "Modality": "SM" (slide microscopy) is also useful,
    because otherwise most of the series we receive would not be WSI.
    
    Signed-off-by: Patrick Avery <[email protected]>
    psavery committed Feb 13, 2024
    2a0b85c
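
To illustrate what these defaults mean at the DICOMweb level, the sketch below turns a limit and a set of attribute filters into a QIDO-RS series search URL. The constant and function names are hypothetical and are not taken from this PR; only the values (a limit of 10 and a "Modality": "SM" filter) come from the commit message.

```python
from urllib.parse import urlencode

# Default values described in the commit message; the names are illustrative.
DEFAULT_LIMIT = 10
DEFAULT_FILTERS = {'Modality': 'SM'}


def series_query_url(base_url, limit=DEFAULT_LIMIT, filters=None):
    """Build a QIDO-RS series search URL from a limit and attribute filters."""
    # Copy so the shared defaults are never mutated.
    params = dict(DEFAULT_FILTERS if filters is None else filters)
    params['limit'] = limit
    return f"{base_url.rstrip('/')}/series?{urlencode(params)}"
```

For a placeholder base URL of `https://example.org/dicomWeb`, calling `series_query_url` with no other arguments yields `https://example.org/dicomWeb/series?Modality=SM&limit=10`.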
  3. Apply limit and filters to studies, not series

    Previously, we performed only a `search_for_series()` and applied
    the limit and filters to that search.
    
    However, it is probably more intuitive for users to be searching for
    studies, rather than series. So we are now performing a `search_for_studies()`
    first, applying the limit and filters to this, and then locating all
    series within those studies, and proceeding from there.
    
    Signed-off-by: Patrick Avery <[email protected]>
    psavery committed Feb 13, 2024
    642961b
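
The two-step search described in this commit could look roughly like the sketch below. It assumes a client object that exposes `search_for_studies()` and `search_for_series()` in the style of dicomweb-client and returns results in the DICOM JSON model; the function name `find_series_in_studies` is hypothetical, and this is not the PR's actual implementation.

```python
STUDY_UID_TAG = '0020000D'  # DICOM JSON key for StudyInstanceUID


def find_series_in_studies(client, limit=None, search_filters=None):
    """Apply limit and filters to studies, then collect every series in them."""
    # Step 1: the limit and filters restrict the study search only.
    studies = client.search_for_studies(limit=limit,
                                        search_filters=search_filters)
    series = []
    for study in studies:
        study_uid = study[STUDY_UID_TAG]['Value'][0]
        # Step 2: fetch all series of each matching study, unfiltered.
        series.extend(client.search_for_series(study_instance_uid=study_uid))
    return series
```

One consequence of this design is that the number of series returned can exceed the limit, since the limit now counts studies and each study may hold several series.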
  4. Fix tests to take into account search by studies

    We are now searching by studies, not series. The tests need to be
    fixed to take this into account.
    
    Signed-off-by: Patrick Avery <[email protected]>
    psavery committed Feb 13, 2024
    1091fef
  5. Fix indentation in assetstoreImport template

    Signed-off-by: Patrick Avery <[email protected]>
    psavery committed Feb 13, 2024
    e2c1334
  6. Don't infer file size right away

    Importing the first 10 studies on IDC has been taking around 20 minutes.
    About 65% of that time has been spent on inferring file sizes.
    
    Even though we don't stream the file data, the request that provides
    the content length appears to be slow on the server. Skip inferring
    file sizes for now. We can add it back in if we figure out a way to
    make it much faster.
    
    Signed-off-by: Patrick Avery <[email protected]>
    psavery committed Feb 13, 2024
    e471abe
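
By the figures in the last commit message, about 65% of a roughly 20-minute import, or around 13 minutes, went to inferring file sizes. The commit simply skips that step. If sizes are needed again later, one option is to defer the lookup until a size is actually requested, as in the sketch below. The class `LazySizeFile` and the `fetch_size` callable are hypothetical; `fetch_size` stands in for whatever request returns the content length.

```python
class LazySizeFile:
    """Record a file at import time; look up its size only when asked."""

    def __init__(self, name, fetch_size):
        self.name = name
        # Callable standing in for the slow content-length request.
        self._fetch_size = fetch_size
        self._size = None

    @property
    def size(self):
        # The first access triggers the request; later accesses reuse the value.
        if self._size is None:
            self._size = self._fetch_size()
        return self._size
```

With this approach the import itself issues no size requests, and each file pays the cost at most once, on first access.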