isb-cgc/SL-Cloud-F1000

Synthetic Lethality Cloud (SL-Cloud)

This project provides a cloud-based data access platform, coupled with software and well-documented computational notebooks that re-implement published synthetic lethality (SL) inference algorithms, to facilitate novel investigation into synthetic lethality. In addition, we provide general-purpose functions that support these prediction workflows, e.g., saving data in BigQuery tables. We anticipate that computationally savvy users can leverage these resources to conduct highly customizable analyses based on their cancer type of interest and particular context.

Open the framework in MyBinder.

Citation: Tercan B, Qin G, Kim TK et al. SL-Cloud: A Cloud-based resource to support synthetic lethal interaction discovery [version 2; peer review: 2 approved]. F1000Research 2022, 11:493 (https://doi.org/10.12688/f1000research.110903.2)

If you have any questions, please reach out to Bahar Tercan at [email protected].

Getting Started

Get a Google Identity

To use our platform, researchers first need a Google identity. If you don't have one, please click here to get one. You can also link a non-Gmail account (like sluser@isbscience.org) as a Google identity by this method.

Request Google Cloud Credits

Take advantage of a one-time $300 Google Credit. If you have already used this one-time offer (or there is some other reason you cannot use it), see this information about how to request ISB-CGC Cloud Credits.

Set up a Google Cloud Project

See Google’s documentation about how to create a Google Cloud Project.

Enable Required Google Cloud APIs

First Notebook

Please run the first notebook to start using our BigQuery tables from your computer.
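As a rough idea of what querying the tables from a local machine can look like, here is a minimal sketch using the google-cloud-bigquery client library. It is not taken from the first notebook. The dataset path `isb-cgc-bq.syntheticlethality` is an assumption based on the project and dataset names mentioned in this README, `example_table` is a placeholder rather than a real table name, and `build_preview_query` and `run_query` are hypothetical helpers.

```python
import re

# Assumed location of the SL-Cloud tables; adjust if your setup differs.
SL_DATASET = "isb-cgc-bq.syntheticlethality"


def build_preview_query(table_name: str, limit: int = 5) -> str:
    """Return a SQL string that previews the first rows of one table."""
    # Accept only simple identifiers so the name is safe to interpolate.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table_name):
        raise ValueError(f"unexpected table name: {table_name!r}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return f"SELECT * FROM `{SL_DATASET}.{table_name}` LIMIT {int(limit)}"


def run_query(sql: str, billing_project: str):
    """Run a query billed to your own Google Cloud project; return a DataFrame."""
    # Imported lazily so the query builder works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client(project=billing_project)
    return client.query(sql).to_dataframe()


# "example_table" is a placeholder, not a real table name.
preview_sql = build_preview_query("example_table")
print(preview_sql)
```

Calling `run_query(preview_sql, "your-project-id")` requires the google-cloud-bigquery and pandas packages, valid Google credentials, and a real table name in place of the placeholder.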

What Is in the Project?

Scripts

  • Scripts folder: includes the functions used by the DAISY and Mutation Dependent SL inference workflows explained below. This folder also contains scripts for data wrangling procedures such as creating BigQuery datasets and tables and saving DEPMAP data in BigQuery tables, as well as helper functions for writing dataframes to Excel files and converting genes among gene symbol, Entrez ID, and alias.
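To illustrate the kind of gene identifier conversion described above, here is a minimal sketch that maps between gene symbols, aliases, and Entrez IDs using a small in-memory table. It is not the code in the Scripts folder. The function names are hypothetical, and the three records are a toy sample; a real run would presumably load the mapping from the gene information tables.

```python
from typing import Dict, List, Optional

# Toy annotation records for illustration only.
GENE_RECORDS = [
    {"symbol": "TP53", "entrez_id": 7157, "aliases": ["P53", "LFS1"]},
    {"symbol": "BRCA1", "entrez_id": 672, "aliases": ["RNF53"]},
    {"symbol": "KRAS", "entrez_id": 3845, "aliases": ["KRAS2"]},
]


def build_name_index(records: List[dict]) -> Dict[str, dict]:
    """Index records by upper-cased symbol and by every alias."""
    index: Dict[str, dict] = {}
    for record in records:
        for name in [record["symbol"], *record["aliases"]]:
            index[name.upper()] = record
    return index


NAME_INDEX = build_name_index(GENE_RECORDS)
ENTREZ_INDEX = {record["entrez_id"]: record for record in GENE_RECORDS}


def name_to_entrez(name: str) -> Optional[int]:
    """Resolve a symbol or alias to an Entrez ID, or None if unknown."""
    record = NAME_INDEX.get(name.strip().upper())
    return record["entrez_id"] if record else None


def entrez_to_symbol(entrez_id: int) -> Optional[str]:
    """Resolve an Entrez ID to its gene symbol, or None if unknown."""
    record = ENTREZ_INDEX.get(entrez_id)
    return record["symbol"] if record else None


def normalize_symbols(names: List[str]) -> List[Optional[str]]:
    """Map a list of symbols or aliases to gene symbols (None if unknown)."""
    results = []
    for name in names:
        entrez_id = name_to_entrez(name)
        results.append(entrez_to_symbol(entrez_id) if entrez_id is not None else None)
    return results


print(normalize_symbols(["p53", "KRAS2", "not_a_gene"]))
```

In real annotation data an alias can point to more than one gene, so a production version would need to decide how to handle ambiguous aliases; this sketch simply keeps the last record seen.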

Synthetic Lethality Inference Workflows

Example notebooks can be found in the Example_pipelines directory, which includes the following notebooks:

  • DAISY Pipeline: We reimplemented the published workflow DAISY (Jerby-Arnon et al., 2014) using up-to-date large-scale data resources.
  • Mutation Dependent SL pipeline: We implemented a mutation-dependent synthetic lethality prediction (MDSLP) workflow based on the rationale that for tumors with mutations that have an impact on protein expression or protein structure (functional mutation), the knockout effects or inhibition of a partner target gene show conditional dependence for the mutated molecular entities.
  • Conservation-based Inference from Yeast Genetic Interactions: We present a workflow that leverages cross-species conservation, using experimentally derived synthetic lethal interactions in yeast to predict relevant SL pairs in humans. We implemented the Conserved Genetic Interaction (CGI) workflow based, in part, on methods described in (Srivas et al., 2016).
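The core idea of the conservation-based workflow, carrying yeast interaction pairs over to human genes through orthologs, can be sketched as follows. This is a simplified illustration and not the CGI notebook's implementation, which may apply score thresholds and other filters. The function name is hypothetical and the gene names (`yA`, `hA1`, and so on) are made-up placeholders.

```python
from itertools import product
from typing import Dict, Iterable, List, Set, Tuple


def map_yeast_pairs_to_human(
    yeast_pairs: Iterable[Tuple[str, str]],
    orthologs: Dict[str, List[str]],
) -> Set[Tuple[str, str]]:
    """Translate yeast gene pairs into candidate human gene pairs.

    Each yeast gene may have zero, one, or several human orthologs, so one
    yeast pair can yield several candidate human pairs or none at all.
    """
    human_pairs: Set[Tuple[str, str]] = set()
    for yeast_a, yeast_b in yeast_pairs:
        humans_a = orthologs.get(yeast_a, [])
        humans_b = orthologs.get(yeast_b, [])
        for human_a, human_b in product(humans_a, humans_b):
            # Skip self-pairs that arise when both yeast genes share an ortholog.
            if human_a != human_b:
                # Sort so (a, b) and (b, a) count as the same pair.
                human_pairs.add(tuple(sorted((human_a, human_b))))
    return human_pairs


# Placeholder data: yC has no human ortholog and yD is missing from the map.
yeast_sl_pairs = [("yA", "yB"), ("yA", "yC"), ("yD", "yB")]
ortholog_map = {"yA": ["hA1", "hA2"], "yB": ["hB"], "yC": []}

candidates = map_yeast_pairs_to_human(yeast_sl_pairs, ortholog_map)
print(sorted(candidates))
```

With the placeholder data, only the first yeast pair maps to human candidates, because the other two pairs each contain a gene without an ortholog.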

Synthetic-Lethality Inference Data Resources

This resource provides access to publicly available cancer genomics datasets relevant for SL inference. These data have been pre-processed, cleaned, and stored in cloud-based queryable tables using Google BigQuery. In addition, we leverage relevant datasets available through the Institute for Systems Biology Cancer Genomics Cloud (ISB-CGC) to make inferences of potential synthetic lethal interactions. The following are project-specific datasets with relevance for SL inference:

  • DEPMAP: DEPMAP shRNA (DEMETER2 V6) and CRISPR (DepMap Public 20Q3) data, including gene expression, sample information, mutation and copy number alterations for CRISPR experiments, gene dependency scores for shRNA, and gene effect scores.

  • CellMap: Yeast interaction dataset based on fitness scores after single and double knockouts from SGA experiments.

  • Gene Information: Tables with relevant gene annotation information such as yeast and human ortholog information, gene-alias-Entrez ID mapping, gene-Ensembl ID mapping, and gene-RefSeq mapping.
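For readers unfamiliar with how single and double knockout fitness values relate to a genetic interaction, the sketch below uses the common multiplicative model: the expected double-mutant fitness is the product of the two single-mutant fitness values, and the interaction score is the observed double-mutant fitness minus that expectation. This is a textbook illustration, not a description of how the CellMap tables were scored, and the fitness values are invented.

```python
def interaction_score(fitness_a: float, fitness_b: float, fitness_ab: float) -> float:
    """Genetic interaction score under a multiplicative model.

    fitness_a, fitness_b: fitness of each single knockout
    fitness_ab: observed fitness of the double knockout
    A strongly negative score means the double knockout is much sicker
    than expected, which is the pattern associated with synthetic lethality.
    """
    expected_ab = fitness_a * fitness_b
    return fitness_ab - expected_ab


# Invented values: each single knockout is mildly impaired,
# but the double knockout is far less fit than expected.
score = interaction_score(0.9, 0.8, 0.3)
print(f"interaction score: {score:.2f}")
```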

Accessing ISB-CGC Resources

To see the data in the ISB-CGC project, open https://console.cloud.google.com/bigquery and add the syntheticlethality dataset. To do so, pin the project by first clicking "ADD DATA", then selecting "Pin a project" and "Enter project name"; you will see the window shown in the figure below. After entering isb-cgc-bq in the project name box, click PIN.
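The same dataset can also be inspected programmatically. The sketch below lists table names with the google-cloud-bigquery client library. It assumes the tables live in a dataset named syntheticlethality inside the isb-cgc-bq project, as the names in this README suggest; `dataset_id` and `list_table_names` are hypothetical helpers, and `billing_project` stands for your own Google Cloud project ID.

```python
from typing import List, Optional


def dataset_id(project: str = "isb-cgc-bq", dataset: str = "syntheticlethality") -> str:
    """Build the fully qualified dataset ID (assumed defaults, see above)."""
    return f"{project}.{dataset}"


def list_table_names(billing_project: str, dataset: Optional[str] = None) -> List[str]:
    """Return the sorted table names in the dataset.

    Requires the google-cloud-bigquery package and valid Google credentials.
    """
    # Imported lazily so dataset_id() works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client(project=billing_project)
    tables = client.list_tables(dataset or dataset_id())
    return sorted(table.table_id for table in tables)


print(dataset_id())
```

Calling `list_table_names("your-project-id")` returns the table names visible to your account, which can then be used in queries.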