Skip to content

Rotation in Celina Juliano's lab studying gene expression timeline during hydra regeneration using scRNAseq after bisection

Notifications You must be signed in to change notification settings

marionhardy/hydra-regen

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

59 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Interstitial cells transcription factors in Hydra Vulgaris regeneration over 96 hours

Rotation in Celina Juliano's lab studying the timeline of hydra regeneration after bisection using scRNAseq obtained from Panagiotis Papasaikas from the Tsiairis lab.

Introduction

Data from our collaborator Panagiotis rds file for a SingleCellExperiment object containing the single cell data for the interstitial cells of Hydra Vulgaris during multiples stages of regeneration after bisection:

https://doi.org/10.1101/2024.02.08.579449

BUT they mapped it to Hydra Magnipapillata (102 version of 105) "Quantification of the generated single cell libraries was performed using the Salmon-Alevin software suite (Salmon version 1.6.0) against the ncbi Hydra 102 transcriptome."

The sce object contains only the interstitial cells that were selected by Panagiotis using the ..... markers -> Not indicated in the preprint either.

The coldata of the object contain cell annotation including

  • quality metrics: nFeature nCount (not MT percentage interestingly)

  • batch info: either 2869 (3162 barcodes), 3113 (10352 barcodes), 3271 (13279 barcodes), 3357 (3875 barcodes)

  • originating experiments (head or foot regeneration)

  • experimental time points

  • pseudo-axis assignment (vals.axis ranging from -1 to 1, increasing in the foot-tentacle direction)

  • mitotic and apoptotic signatures indices from 0 to 1

The rowdata contains gene annotation, using Entrez-gene identifiers. I have also noticed that in the sce objects there's

  • PCA, tSNE and UMAP coordinates for reduced dimensions + corrected for batch values

  • assay metafeatures hold gene_id, product, gene, is.rib.prot.gene (T/F), HypoMarkers (T/F), ccyle (T/F), apopt (T/F) etc

I converted the sce objects into a seurat object and did data processing + analysis.

Reports summary

  • interstitial_report1
  • interstitial_report2_cluster_attribution
  • interstitial_report3
  • interstitial_report4_cluster_attribution
  • interstitial_report5_transcription_timeline
  • Interstitial_report6_population_evolution
  • Interstitial_report7_population_evolution
  • PowerPoint presentation I gave to the lab in the last week of my rotation
  • seurat_all_modifications_reports_1to7.R

Interstitial_report1

  • Contains the exploration of the data (batches, features, metadata etc)
  • Quality check based on nFeatures and nCounts (no mitochondrial genes in the 102 genome)
  • Scaling using SCTransform() and regression (or not) of the batch variable
  • Selection of the first 28 dimensions for UMAP projection
  • PCA, UMAP, Clustering at different resolutions (0.025, 0.1, 0.3)

Interstitial_report2_cluster_attribution

Subsetting by head/foot and timepoint for both regressed and unregressed Not batch regressed Batch regressed Cluster attribution UMAP at 0.025 resolution Finding markers per clusters

  • DotPlot of top 5 differentially expressed markers per clusters
  • DotPlot of theoretical markers for neurons and other interstitial cells
  • UMAPs of names markers (excludes uncharacterized transcripts)

Interstitial_report3

Contains the same thing as the first report but explores the n_neighbors parameter

Does not contain the regressed data as the batch and timepoints variables overlapped weirdly in the experimental design

Interstitial_report4_cluster_attribution

Contains the same things as report2 BUT

  • more markers
  • plots using the regressed data were discarded
  • clusters were attributed
  • we now have a first glimpse at transcription factor expression

Interstitial_report5_transcription_timeline

Subclustered to only neurons in order to study neuron subtypes and appearances throughout regenaration. Allows for FindMarkers between clusters and timepoints.

Filtered to neurons only and re-clustered:

  • PCA, UMAP at 0.025 resolution
  • New cluster attribution (9 neuronal cell types)
  • Subgrouped per timepoints and per regenerating organ (head vs foot)
  • Plotting of cell population number over time (head vs foot)
  • Transcription factor of interest expression: Feature plot and clustered dotplot

Interstitial_report6_population_evolution

Contains:

  • Notes on a meeting with Celina and Hannah and what to change in report 5
  • UMAP and cluster labelling at resolution = 0.1 instead of 0.025
  • Subsetting by cluster and over time
  • Evolution of cell population over time: Raw counts and percentage composition of the Hydra + divided by head of foot
  • Transcription factor expression: Feature plot all timepoints + Dotplot over time
  • Transcription factor expression: Over time

This report describes a changing cell population and ties it back to specific transcription factor expression.

image

Interstitial_report7_population_evolution

  • Notes on a meeting with Celina and Hannah and what to change in report 6
  • Relabelling the clusters (although I believe the en cluster should be changed back to doublet/triplet because of the apoptosis score)
  • Plotting cell cycle scores, axis scores and apoptosis scores

This is the report that was used to make the powerpoint I added here.

image

seurat_all_modifications_reports_1to7.R

Summary of all operations done to the initial Seurat assay. This should make it easier for anyone to pick up the project/data analysis in the future.

Nota Bene

The 102 genome is very fractionated and is poorly annotated compared to current available constructs. I had to manually cross 105 v3 genome for the previously established celltype markers with the 105 v102 genome. I did that by taking the 105 v3 fasta file Hannah gave me, blasting every established cell type marker transcript and finding if there's a >80% match for a transcript in 102. Then I went back to the annotation file and changed the name of the transcript in a duplicated Symbol column called Symbol_updated. I also manually checked annotated genes that had a transcript in 102 but the ID didn't reflect it. This is stored in an excel document called "mcbi_dataset_MH_annotated.xlsx"

About

Rotation in Celina Juliano's lab studying gene expression timeline during hydra regeneration using scRNAseq after bisection

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages