Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updating profile & consensus docs #75

Merged
merged 29 commits into from
Aug 4, 2021

Conversation

michaelbornholdt
Copy link
Contributor

Related to issue #73

@michaelbornholdt
Copy link
Contributor Author

@shntnu @gwaygenomics @FloHu let's continue the discussion here

@michaelbornholdt
Copy link
Contributor Author

cytominer_workflow_new

@michaelbornholdt
Copy link
Contributor Author

I think I still need to add a part where I explain where the profiles and consensus can be found in the repo. That should go in the top level README

@michaelbornholdt
Copy link
Contributor Author

@gwaygenomics

DMSO vs. Compound Profiles

Note we generated per-well DMSO consensus signatures and per compound-dose pair consensus signatures for compounds.
The per-well DMSO profiles can help to assess plate-associated batch effects.

I have read this part of the consensus/README.md several times and I don't understand what it is trying to tell me. Can we rewrite that?

@gwaybio
Copy link
Member

gwaybio commented Jul 26, 2021

I have read this part of the consensus/README.md several times and I don't understand what it is trying to tell me. Can we rewrite that?

Yes! We should clarify anything that isn't clear.

It's just saying that the consensus signatures for DMSO create profiles per well instead of per treatment. Does that make sense? One might assume that we treat DMSO like any other perturbation when creating consensus profiles - that is, combine ALL replicates into a single signature. We don't do that. Instead, we collapse per well, thus creating ~24 or so (I don't remember the exact number) DMSO consensus profiles.

@michaelbornholdt michaelbornholdt marked this pull request as ready for review July 27, 2021 17:37
@michaelbornholdt
Copy link
Contributor Author

@niranjchandrasekaran Can you review and merge this?
I don't have any rights here :)

@FloHu any last comments?

Copy link
Member

@niranjchandrasekaran niranjchandrasekaran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelbornholdt I made a few inline comments. Also there is typo in the feature_select box of the figure in the word "operations".

I am not approving or merging this PR yet as I want @gwaygenomics to take a look at your changes since he should be the one who decides what goes where in this repo :)

README.md Outdated
@@ -19,10 +19,11 @@ The [Morphology Connectivity Hub](https://clue.io/morphology) is the primary sou
We apply a unified, image-based profiling pipeline to all 136 384-well plates from `LINCS Pilot 1`, and all 135 384-well plates from `LKCP`.
We use [pycytominer](https://github.com/cytomining/pycytominer) as the primary tool for image-based profiling.

We process and store profiles in the [profiles/](profiles/) directory.
See [`profiles/README.md`](profiles/README.md) for more details and for instructions on how to reproduce the pipeline.
We process and store profiles in the [profiles/](profiles/) directory. By 'profiles' we refer to the level 3 data, which contains the aggregate of each well.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We call everything from level 3-5 profiles. Also doesn't this folder contain level 4a and 4b profiles?

profiles/README.md Outdated Show resolved Hide resolved

Note here that we do not include the intermediate step of generating `.sqlite` files per plate using a tool called [cytominer-database](https://github.com/cytomining/cytominer-database).
This repository and workflow begins after we applied cytominer-database.


### Aggregation
The [aggregation method](https://github.com/cytomining/pycytominer/blob/master/pycytominer/aggregate.py) is used twice in the workflow. Firstly, the median of all cells within a well is aggregated to one profiler per well. The aggregation method doesn't persist the metadata which is why this step is followed by an annotation step to add the MOA data and others.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of thinking about creating the consensus profiles as another aggregation step. But given that you have a separate section for consensus below perhaps you could talk about aggregating 4b to consensus profiles in that section and not here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelbornholdt - can you address this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have, there are two different sections now that explain aggregation and consensus.

[Pycytominer](https://github.com/cytomining/pycytominer) is a code base built by @gwaygenomics and @niranjchandrasekaran.
It allows easy processing CellProfiler data and contains all functions that were used to create the data in this repository. Below, we describe the different steps of the pipeline. Please check the pycytominer repo for more details.

Part of the pipeline, from Level 3 to Level 4b, can be found in the [profile_cells](https://github.com/broadinstitute/lincs-cell-painting/blob/master/profiles/profile_cells.py) script and the final aggregation to the consensus data is found in this [notebook](https://github.com/broadinstitute/lincs-cell-painting/blob/master/consensus/build-consensus-signatures.ipynb).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think including the notebook that contains the code for sphering should also be mentioned here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@michaelbornholdt - can you address this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have

profiles/README.md Outdated Show resolved Hide resolved
profiles/README.md Outdated Show resolved Hide resolved
@FloHu
Copy link

FloHu commented Jul 28, 2021

@niranjchandrasekaran Can you review and merge this?
I don't have any rights here :)

@FloHu any last comments?

I would still clarify what is the difference in the plate normalization and the normalization by spherization, i.e. which artifacts are corrected in the former and which ones in the latter and where they occur. For people from outside the imaging field, e.g. in some screen types people just normalize on a plate by plate basis and that takes care of the batch effects.

Overall, great job, things are a lot clearer now. Thanks for the effort!

@gwaybio
Copy link
Member

gwaybio commented Jul 28, 2021

Thanks @michaelbornholdt - I will provide my review once you address @niranjchandrasekaran's comments

@michaelbornholdt
Copy link
Contributor Author

@gwaygenomics
Ready for you. I addressed Flos and Niranj's comments.

Copy link
Member

@gwaybio gwaybio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great @michaelbornholdt - I made several suggested changes (note that you can commit these directly from the github GUI, so there is no need to change again on your end)

The three primary comments are:

  • Make sure you make formatting consistent. This means proper spacing after headings and converting to one-sentence-per-line. I tried to adjust every instance, but it's likely i missed some.
  • Try your best to edit this document to active (instead of passive) voice. For example, write like this: "Unlike the other normalizations, we spherize the data using the full batch (all plates)." and not like this: "Unlike the other normalizations, spherizing is done on the full batch (all plates)" It makes reading such a more pleasant experience :)
  • Make your png overwrite the old png - no need for a new file name

Thanks again!

README.md Outdated Show resolved Hide resolved
consensus/README.md Outdated Show resolved Hide resolved
consensus/README.md Outdated Show resolved Hide resolved
consensus/README.md Outdated Show resolved Hide resolved
profiles/README.md Outdated Show resolved Hide resolved
profiles/README.md Outdated Show resolved Hide resolved
profiles/README.md Show resolved Hide resolved
profiles/README.md Show resolved Hide resolved
profiles/README.md Show resolved Hide resolved
profiles/README.md Show resolved Hide resolved
@michaelbornholdt
Copy link
Contributor Author

@niranjchandrasekaran @gwaygenomics
I tried to rewrite in active language and accepted all the suggestions above.

@michaelbornholdt
Copy link
Contributor Author

Why do the suggestions of Grag still appear in the other tab?

batch commit to clear my suggestions
@gwaybio
Copy link
Member

gwaybio commented Aug 4, 2021

@michaelbornholdt - it also looks like you need to add back your new-and-improved workflow figure: profiles/media/cytominer_workflow.png

@michaelbornholdt
Copy link
Contributor Author

@michaelbornholdt - it also looks like you need to add back your new-and-improved workflow figure: profiles/media/cytominer_workflow.png

The newest version is there!? Or what am I missing?

Copy link
Member

@gwaybio gwaybio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha! Yeah, I misread the commit

image

LGTM! I'll go ahead and merge. Thanks for this contribution!

profiles/README.md Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
profiles/README.md Show resolved Hide resolved
profiles/README.md Show resolved Hide resolved
profiles/README.md Show resolved Hide resolved
profiles/README.md Show resolved Hide resolved
@gwaybio gwaybio merged commit 1769b32 into broadinstitute:master Aug 4, 2021
@shntnu
Copy link
Collaborator

shntnu commented Aug 22, 2021

Just dropping in to say that this figure was really helpful to me – I had forgotten the order of operations leading up to spherizing and this cleared it up. Thanks again @michaelbornholdt ! 🥇

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants