minor changes
timonschlegel committed Jul 2, 2024
1 parent 4a03ca7 commit ad514c7
Showing 1 changed file with 7 additions and 6 deletions.
@@ -17,7 +17,7 @@ objectives:
- Create a count-matrix from a 10X fragment file
- Perform filtering, dimension reduction and clustering on AnnData matrices
- Generate and filter a cell-by-gene matrix
- Identify marker genes for the clusters and annotate clusters
- Identify marker genes for the clusters and annotate the cell types
time_estimation: 2H
key_points:
- Single-cell ATAC-seq can identify open chromatin sites
@@ -176,7 +176,7 @@ SnapATAC2 requires 3 input files for the standard pathway of processing:
> >
> > 3. Rename the generated file to `Fragments 500 PBMC`
> > 4. Now you can continue with either the `fragments_file` from earlier or the new file `Fragments 500 PBMC`.
> > - {% icon galaxy-info %} Please note that `Fragments 500 PBMC` only contains 500 {PBMC} and thus the clustering will produce different outputs compared to the outputs generated by `fragments_file` (with 5k PBMC).
> > - {% icon galaxy-info %} Please note that `Fragments 500 PBMC` only contains 500 {PBMC's} and thus the clustering will produce different outputs compared to the outputs generated by `fragments_file` (with 5k PBMC).
> {: .hands_on}
{: .details}
@@ -451,7 +451,7 @@ Doublets are removed by calling a customized [**scrublet**](https://github.com/A
> - The observed features of the "cells" are then compared to the simulated doublets and scored on their doublet probability.
> - SnapATAC2's *pp.filter_doublets* then removes all cells with a doublet probability >50%.
>
> ![Doublet removal with scrublet]({% link topics/single-cell/images/scatac-standard-snapatac2/doublets-and-scrublet.png %} "Scrublet simulates expected doublets and produces doublet scores for each cell.")
> ![Doublet removal with scrublet]({% link topics/single-cell/images/scatac-standard-snapatac2/doublets-and-scrublet.png %} "Scrublet simulates expected doublets and produces doublet scores for each cell. ({% cite Wolock2019 %})")
>
{: .details}
@@ -502,13 +502,14 @@ Dimension reduction is a very important step during the analysis of single cell
>
> - Dimension reduction algorithms can be either linear or non-linear.
> - Linear methods are generally computationally efficient and well scalable.
>
> A popular linear dimension reduction algorithm is:
> - **PCA** (Principle Component Analysis), implemented in **Scanpy** (please check out our [Scanpy]({% link topics/single-cell/tutorials/scrna-scanpy-pbmc3k/tutorial.md %}) tutorial for an explanation).
> - Nonlinear methods however are well suited for multimodal and complex datasets.
> - in contrast to linear methods, which often preserve global structures, non-linear methods have a locality-preserving character.
> - This makes non-linear methods relatively insensitive to outliers and noise, while emphasizing natural clusters in the data ({% cite Belkin2003%})
> - As such, they are implemented in many algorithms to visualize the data in 2 dimensions (f.ex. **UMAP** embedding).
> - The nonlinear dimension reduction algorithm, through *spectral embedding*, used in **SnapATAC2** is a very fast and memory efficient non-linear algorithm ({% cite Zhang2024%}).
> - The nonlinear dimension reduction algorithm, through *matrix-free spectral embedding*, used in **SnapATAC2** is a very fast and memory efficient non-linear algorithm ({% cite Zhang2024%}).
> - **Spectral embedding** utilizes an iterative algorithm to calculate the **spectrum** (*eigenvalues* and *eigenvectors*) of a matrix without computing the matrix itself.
{: .details}
@@ -524,7 +525,7 @@ The dimension reduction, produced by the algorithm *tl.spectral*, is required fo
>
> > <comment-title> Distance metric </comment-title>
> >
> > - The fast and well scalable *matrix-free spectral embedding* algorithm depends on the distance metric: `cosine`
> > - The fast and well scalable *"matrix-free spectral embedding"* algorithm depends on the distance metric: `cosine`
> {: .comment}
>
> 2. Rename the generated file to `Anndata 5k PBMC spectral` or add the tag {% icon galaxy-tags %} `spectral` to the dataset
@@ -968,7 +969,7 @@ To manually annotate the *Leiden* clusters, we will need to perform multiple ste
# Conclusion
{% icon congratulations %} Well done, you’ve made it to the end! You might want to consult your results with this [control history](https://usegalaxy.eu/u/timonschlegel/w/workflow---standard-processing-of-10x-single-cell-atac-seq-data-with-snapatac2), or check out the [full workflow](https://singlecell.usegalaxy.eu/u/timonschlegel/w/2combined-snapatac2) for this tutorial.
{% icon congratulations %} Well done, you’ve made it to the end! You might want to consult your results with this [control history](https://singlecell.usegalaxy.eu/u/timonschlegel/h/test-of-5k-pbmc-tutorial-workflow), or check out the [full workflow](https://usegalaxy.eu/u/timonschlegel/w/workflow---standard-processing-of-10x-single-cell-atac-seq-data-with-snapatac2) for this tutorial.
In this tutorial, we produced a count matrix of {scATAC-seq} reads in the `AnnData` format and performed:
1. Preprocessing: