Handbook discrepancy: metadata/<batch> vs metadata/<batch>/platemaps #70
Comments
I didn't quite follow (I bet it is me, not you; your note seems super clear, so I'm probably just a bit out of the loop wrt the handbook). That said, perhaps this would help: Our cellpainting-gallery tree structure is consistent with the profiling-recipe. Our instructions for data upload say this (i.e. sync
So to keep things consistent, I think it is better to change the sync command (profiling-handbook/05-create-profiles.md, line 322 in 2c4dc1b)
to aws s3 sync s3://${BUCKET}/projects/${PROJECT_NAME}/workspace/metadata/ metadata/ (i.e. just download the whole thing) and then modify the handbook as needed so that the platemaps are expected to be found at
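For reference, the proposed whole-folder download could be sketched as below. The bucket and project values are hypothetical placeholders, not names from this project. The command is only printed here, not executed, so it can be reviewed first.

```shell
# Hypothetical placeholder values; substitute your own bucket and project.
BUCKET="my-bucket"
PROJECT_NAME="my-project"

# Proposed command from the comment above: download the whole metadata
# folder as-is, without remapping any subfolder on the way down.
SYNC_CMD="aws s3 sync s3://${BUCKET}/projects/${PROJECT_NAME}/workspace/metadata/ metadata/"

# Print the command for review instead of running it.
echo "${SYNC_CMD}"
```

When actually running the command, appending `--dryrun` to `aws s3 sync` previews the transfers without copying anything.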
Is it fair to say that your vote then is for the image analysts to stop storing the metadata as
Right, that's what you want the structure to be on the gallery, but I'm saying that unless someone was careful in doing the uploads (and notably did NOT do the sync you suggest there), they may not be, because historically, on AWS, we've always stored metadata in
Evidence: here is the suggested tree structure at commit 07e7691 (June 2021)
Tracing this historically: as of 2835588 (June 19 2021), we were still on the old structure. Sometime during the giant batch of commits on June 22-23, the metadata instructions were dropped; by the end of it (8ffd6c9), there are no metadata instructions at all. They get added back in as part of #63 (August 2021), but with
@shntnu and I chatted and here are the decisions we want to make going forward
Did I miss anything?
Will the profiling recipe be modified to look in metadata/platemaps/batch/platemaps? (I'm assuming this is the only place where that metadata is used).
That is already where the recipe looks!
🤯 Oh wait, there's not actually anywhere where it 'looks' for platemaps on s3 right? We just download them to the backends machine and then from there is what's used in generating profiles?
Yup
Having the metadata broken out into platemaps and external is still a relatively new thing, and to date the image analysis team has typically only dealt with the platemap-level metadata. Thus, on AWS, metadata is stored as
s3://${BUCKET}/projects/${PROJECT_NAME}/workspace/metadata/${BATCH_ID}
, as evidenced in this sync command. BUT, for the recipe, we eventually want it in
s3://${BUCKET}/projects/${PROJECT_NAME}/workspace/metadata/platemaps/${BATCH_ID}
, as depicted in this tree structure. We currently handle that by just adding that extra platemaps in during the sync command above. BUT, that's not what the handbook implies: it implies that it should be uploaded in the tree structure. So we basically have two choices: 1) explain everything I just wrote here in the handbook and show both tree structures there, or 2) from now on, change how the analysts structure things and update the sync command, because presumably in places like the gallery in the future (but I don't know that anyone has checked for current bits of metadata over there), we want this new structure.
I'm guessing the preference here is 2, but we should make a decision and update the handbook accordingly.
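To make the two layouts concrete, here is a minimal sketch using the two prefixes quoted above. The bucket, project, and batch values are hypothetical placeholders, and the local destination `metadata/platemaps/${BATCH_ID}` is an assumption about where the download lands, based on the description of adding the extra `platemaps` during the sync. The commands are printed rather than executed.

```shell
# Hypothetical placeholder values; substitute your own.
BUCKET="my-bucket"
PROJECT_NAME="my-project"
BATCH_ID="2021_01_01_Batch1"

METADATA_ROOT="s3://${BUCKET}/projects/${PROJECT_NAME}/workspace/metadata"

# Layout the image analysis team has used on AWS to date.
CURRENT_PREFIX="${METADATA_ROOT}/${BATCH_ID}"
# Layout the recipe eventually wants.
RECIPE_PREFIX="${METADATA_ROOT}/platemaps/${BATCH_ID}"

# Assumed local destination for platemap metadata.
LOCAL_DEST="metadata/platemaps/${BATCH_ID}"

# Current workaround: the extra "platemaps" level is added on download.
echo "aws s3 sync ${CURRENT_PREFIX} ${LOCAL_DEST}"

# Choice 2: S3 already uses the recipe layout, so no remapping is needed.
echo "aws s3 sync ${RECIPE_PREFIX} ${LOCAL_DEST}"
```

Under choice 2 the S3 path and the local path mirror each other, which is what would let a single whole-folder sync replace the per-batch remapping.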