fix2-make_fragment_file

timonschlegel · Jun 13, 2024 · fabc4d4
1 parent 4bebee4
Showing 1 changed file with 33 additions and 27 deletions.
diff --git a/topics/single-cell/tutorials/scatac-standard-processing-snapatac2/tutorial.md b/topics/single-cell/tutorials/scatac-standard-processing-snapatac2/tutorial.md
@@ -101,7 +101,7 @@ SnapATAC2 requires 3 input files for the standard pathway of processing:
 > - This tutorial starts with a `fragment` file. 
 > - SnapATAC2 also accepts mapped reads in a `BAM` file.
 > - To learn how to get a `fragment` file or `BAM` file from raw `.FASTQ`-reads, please check out the tutorial ["Pre-processing of 10X Single-Cell ATAC-seq Datasets"]( {% link topics/single-cell/tutorials/scatac-preprocessing-tenx/tutorial.md %} )
-> - If you would like to start the analysis with a `BAM` file, you can expand the details section [**Creating a fragment file**]( {% link topics/single-cell/tutorials/scatac-standard-processing-snapatac2/tutorial.md %}#details-creating-a-fragment-file). 
+> - If you would like to start the analysis with a `BAM` file, you can expand the details section ["Details: Creating a fragment file"]( {% link topics/single-cell/tutorials/scatac-standard-processing-snapatac2/tutorial.md %}#creating-a-fragment-file). 
 {: .comment}
 
 
@@ -126,7 +126,8 @@ SnapATAC2 requires 3 input files for the standard pathway of processing:
 > 3. Rename the datasets
 >   - {% icon galaxy-pencil %} **Rename** the file `atac_pbmc_5k_nextgem_fragments.tsv` to `fragments_file.tsv`
 >   - {% icon galaxy-pencil %} **Rename** the file `gencode.v46.annotation.gtf.gz` to `gene_annotation.gtf.gz`
->     {% snippet faqs/galaxy/datasets_rename.md %}
+> 
+>    {% snippet faqs/galaxy/datasets_rename.md %}
 >
 > 4. Inspect `chrom_sizes` and `fragments_file` 
 {: .hands_on}
@@ -145,27 +146,30 @@ SnapATAC2 requires 3 input files for the standard pathway of processing:
 >
 {: .question}
 
+## Creating a fragment file
 > <details-title>Creating a fragment file</details-title>
-> > <hands-on-title>Data upload</hands-on-title>
-> > 1. Import the file  `BAM_500-PBMC` from [Zenodo]({{ page.zenodo_link }}) or from the shared data library
-> > ```
-> > {{ page.zenodo_link }}/files/atac_pbmc_5k_nextgem_fragments.tsv
-> > ```
-> >   - This dataset contains mapped reads in the `BAM` format. 
-> >   - It was generated by following the tutorial ["Pre-processing of 10X Single-Cell ATAC-seq Datasets"]( {% link topics/single-cell/tutorials/scatac-preprocessing-tenx/tutorial.md %} ) until the output of {% tool [Map with BWA-MEM](toolshed.g2.bx.psu.edu/repos/devteam/bwa/bwa_mem/0.7.18) %}
-> > 2. {% tool [SnapATAC2 Preprocessing](toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_preprocessing/snapatac2_preprocessing/2.5.3+galaxy1) %} with the following parameters:
->    - *"Method used for preprocessing"*: `Convert a BAM file to a fragment file, using 'pp.make_fragment_file'`
->        - {% icon param-file %} *"File name of the BAM file"*: `BAM_500-PBMC` (Input dataset)
->        - {% icon param-toggle %} *"Indicate whether the BAM file contain paired-end reads"*: `Yes`
->        - *"How to extract barcodes from BAM records?"*: `From read names using regular expressions`
->          - *"Extract barcodes from read names of BAM records using regular expressions"*: `(................):`
->        > <comment-title></comment-title>
->        > - Not every regular expression type is supported. 
->        > - This expression selects 16 characters if they are followed by a colon `:`. Only the cell barcodes of the `BAM` file will match. 
->        {: .comment}
-> > 3. Rename the generated file to `Fragments 500 PBMC`
-> > 4. Now you can continue with either the `fragments_file` from earlier, or the new file `Fragments 500 PBMC`. 
-> >    - {% icon galaxy-info %} Please note that `Fragments 500 PBMC` only contains 500 {PBMC} and thus the clustering will produce different outputs compared to the outputs generated by `fragments_file` (with 5k PBMC). 
+>  > <hands-on-title>fragment file</hands-on-title>
+>  > 1. Import the file  `BAM_500-PBMC` from [Zenodo]({{ page.zenodo_link }}) or from the shared data library
+>  > ```
+>  > {{ page.zenodo_link }}/files/atac_pbmc_5k_nextgem_fragments.tsv
+>  > ```
+>  >   - This dataset contains mapped reads in the `BAM` format. 
+>  >   - It was generated by following the tutorial ["Pre-processing of 10X Single-Cell ATAC-seq Datasets"]( {% link topics/single-cell/tutorials/scatac-preprocessing-tenx/tutorial.md %} ) until the output of {% tool [Map with BWA-MEM](toolshed.g2.bx.psu.edu/repos/devteam/bwa/bwa_mem/0.7.18) %}
+>  > 2. {% tool [SnapATAC2 Preprocessing](toolshed.g2.bx.psu.edu/repos/iuc/snapatac2_preprocessing/snapatac2_preprocessing/2.5.3+galaxy1) %} with the following parameters:
+>  >    - *"Method used for preprocessing"*: `Convert a BAM file to a fragment file, using 'pp.make_fragment_file'`
+>  >        - {% icon param-file %} *"File name of the BAM file"*: `BAM_500-PBMC` (Input dataset)
+>  >        - {% icon param-toggle %} *"Indicate whether the BAM file contain paired-end reads"*: `Yes`
+>  >        - *"How to extract barcodes from BAM records?"*: `From read names using regular expressions`
+>  >          - *"Extract barcodes from read names of BAM records using regular expressions"*: `(................):` 
+>  > 
+>  >    > <comment-title></comment-title>
+>  >    > - Not every regular expression type is supported. 
+>  >    > - This expression selects 16 characters if they are followed by a colon. Only the cell barcodes of the `BAM` file will match. 
+>  >    {: .comment}
+>  > 
+>  > 3. Rename the generated file to `Fragments 500 PBMC`
+>  > 4. Now you can continue with either the `fragments_file` from earlier, or the new file `Fragments 500 PBMC`. 
+>  >    - {% icon galaxy-info %} Please note that `Fragments 500 PBMC` only contains 500 {PBMC} and thus the clustering will produce different outputs compared to the outputs generated by `fragments_file` (with 5k PBMC). 
 >  {: .hands_on}
 {: .details}
 
@@ -190,11 +194,13 @@ The [`AnnData`](https://anndata.readthedocs.io/en/latest/) format was initially
 >        - {% icon param-file %} *"Fragment file, optionally compressed with gzip or zstd"*: `fragments_file.tsv` (Input dataset)
 >        - {% icon param-file %} *"A tabular file containing chromosome names and sizes"*: `chrom_sizes.txt` (Input dataset)
 >        - {% icon param-toggle %} *"Whether the fragment file has been sorted by cell barcodes"*: `No` 
->          > <details-title>Sorted by barcodes</details-title>
->          > - This tool requires the fragment file to be sorted according to cell barcodes. 
->          > - If **pp.make_fragment_file** {% icon tool %} was used to generate the fragment file, this has automatically been done. 
->          >   - Otherwise, the setting *"sorted by cell barcodes"* should remain `No`. 
->          {: .details}
+>
+>    > <details-title>Sorted by barcodes</details-title>
+>    > - This tool requires the fragment file to be sorted according to cell barcodes. 
+>    > - If **pp.make_fragment_file** {% icon tool %} was used to generate the fragment file, this has automatically been done. 
+>    >   - Otherwise, the setting *"sorted by cell barcodes"* should remain `No`. 
+>    {: .details}
+> 
 > 2. Rename the generated file to `Anndata 5k PBMC`
 >
 > 3. Check that the format is `h5ad`
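
Note on the barcode regular expression used in the "Creating a fragment file" hunk: the pattern `(................):` captures 16 arbitrary characters that are immediately followed by a colon. The sketch below illustrates that behavior with Python's `re` module, which is only a stand-in and an assumption here; the engine behind `pp.make_fragment_file` may differ (the tutorial itself notes that not every regular expression type is supported). The helper `extract_barcode` and the sample read name are hypothetical and not taken from the tutorial data.

```python
import re

# Same pattern as in the tutorial: one capturing group of 16 characters,
# followed by a literal colon.
BARCODE_PATTERN = re.compile(r"(................):")


def extract_barcode(read_name):
    """Return the first 16-character group followed by ':' or None.

    Hypothetical helper for illustration only; it is not part of SnapATAC2.
    """
    match = BARCODE_PATTERN.search(read_name)
    return match.group(1) if match else None


# Hypothetical read name with a 16-base cell barcode prepended.
read_name = "AAACGAAAGACGTCAG:A00836:472"
print(extract_barcode(read_name))  # AAACGAAAGACGTCAG
```

Because `.` matches any character, the pattern relies on the barcode sitting directly in front of a colon in the read name; a name without any colon yields no match in this sketch.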