Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add DeidentificationMethod and DeidentificationMethodCodeSequence to json #813

Merged

Conversation

CGSchwarzMayo
Copy link
Contributor

This parallels my PR to bids-specification:
bids-standard/bids-specification#1772
bids-standard/bids-specification#1709

The goal is to add DeidentificationMethod and DeidentificationMethodCodeSequence to json, so that users downstream have a chance to be informed of whether some tags were scrubbed and/or if any de-facing or other image scrubbing was applied.

From our email discussion, I understand that you will probably not merge this until bids-specification merges theirs. I thought we could iterate on this, if needed, in the meantime.

Thanks for your consideration,
Chris Schwarz
Mayo Clinic

@neurolabusc
Copy link
Collaborator

@CGSchwarzMayo thanks for your work on this. I do like the basic idea, but it is worth noting that a lot of the speed of dcm2niix derives from a one-pass conversion where we need to be mindful of the memory demands. Already, we compile dcm2niix with extra stack size for the Windows operating system. It looks to me that we need to pre-allocate 10 fields (MAX_DEID_CS) each with three kDICOMStr (66 bytes) and one kDICOMStrLarge (256 bytes), so the footprint is an additional 4560 bytes for each and every DICOM file. The concern here is if a user wants to process a top level folder that contains tens of thousands of individual DICOM files. If we really need this level of verbosity, I think we will have to read this once per output NIfTI image (e.g. once for an entire 4D fMRI dataset), not once per DICOM file (e.g. once per 2D slice for non-enhanced DICOM).

Let me see if I work with your PR and the validation dataset you provided to retain the functionality you propose without the overhead.

@CGSchwarzMayo
Copy link
Contributor Author

Thank you for your help and consideration! I agree that it would be reasonable to assume that this tag is consistent across an image series.

if ((slen < 10) || (strstr(anonTxt, "DICOMANON") == NULL))
dcmStr(lLength, &buffer[lPos], d.deidentificationMethod);
int slen = (int)strlen(d.deidentificationMethod);
if ((slen < 10) || (strstr(d.deidentificationMethod, "DICOMANON") == NULL))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In example output within bids-standard/bids-specification#1772 (comment)
i see that original list of strings

(0012, 0063) De-identification Method            LO: ['mri_reface 0.3.4', 'Per DICOM PS 3.15 AnnexE. Details in 0012,0064']

becomes a string

        "DeidentificationMethod": "mri_reface 0.3.4\\Per DICOM PS 3.15 AnnexE. Details in 0012,0064",

in JSON.

  • Would it be preferable to retain list here?
  • Is \\ (I guess just an escaped \?) a general convention within dcm2niix to join multiple entries of the list while serializing into a string? is it up for dcm2niix specific place to choose either to retain list or serialize into a string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the time when I wrote that, I actually hadn't noticed yet that VM on (0012,0063) is n-1, not 1, so I was working on a principle that strings after the first would be concatenated into the existing single string rather than appended as additional strings. If BIDS and dcm2niix-json can both handle lists within their format specs, retaining them as lists would make more sense, and I have no objection to that change.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BIDS can definitely handle that. Added a suggestion over on the spec PR.

@effigies
Copy link

@neurolabusc Just wanted to check in on the status of this. I'm preparing to release BIDS 1.10.0, and if dcm2niix is happy with the state of the associated PR, that can be merged. Otherwise it can be postponed to a later release.

@neurolabusc neurolabusc merged commit fda03a5 into rordenlab:development Aug 27, 2024
1 check passed
@neurolabusc
Copy link
Collaborator

@yarikoptic I have merged the PR as provided, and it does create valid JSONs. If you have a strong prefernce for changing this, can you generate a new PR with your proposed modifications. I will create a new dcm_qa_* validation repository based on the fruit sample provided in the BIDS discussion linked above.

@CGSchwarzMayo thanks for seeing this submission through, it is nice to have code, specification and sample data. This is a model of how we can implement, document, and regression test new features.

@CGSchwarzMayo
Copy link
Contributor Author

Thanks Chris! I'm glad the sample data was helpful. You initially had concerns about the memory demands of my implementation, and you were going to see if you could re-write some aspects first, so I was surprised to see the merge as-is. Do you still plan to make additional changes before this leaves the Development branch?

@yarikoptic
Copy link
Contributor

@CGSchwarzMayo would you be so kind to do that refactoring into collecting a list in a PR? (I have not touched C++ for a while)

@CGSchwarzMayo
Copy link
Contributor Author

I attempted to do what @yarikoptic requested, in a new PR #857
Thanks,
Chris

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants