adding HalfDeep #6592

Merged on Dec 5, 2024 (13 commits)
10 changes: 10 additions & 0 deletions tools/halfdeep/.shed.yml
@@ -0,0 +1,10 @@
categories:
- Sequence Analysis
description: "HalfDeep: Automated detection of intervals covered at half depth by sequenced reads."
homepage_url: https://github.com/makovalab-psu/HalfDeep
long_description: |
Automated detection of intervals covered at half depth by sequenced reads.
name: halfdeep
owner: iuc
remote_repository_url: https://github.com/galaxyproject/tools-iuc/tree/main/tools/halfdeep
type: unrestricted
90 changes: 90 additions & 0 deletions tools/halfdeep/halfdeep.xml
@@ -0,0 +1,90 @@
<tool id="halfdeep" name="HalfDeep" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@">
<description>identifies genomic regions with half-depth coverage based on sequencing read mappings</description>
<macros>
<import>macros.xml</import>
</macros>
<expand macro="requirements"/>
<command detect_errors="exit_code"><![CDATA[
##
## Set up the directory structure expected by bam_depth.sh and halfdeep.sh
## See: https://github.com/makovalab-psu/HalfDeep?tab=readme-ov-file#expected-directory-layout
##
mkdir -p reads halfdeep/ref/mapped_reads &&
##
## reference
##
ln -s '$ref' 'ref.$ref.ext' &&
#if not $mapped_reads
minimap2 -x map-pb -d ref.idx 'ref.$ref.ext' &&
#else
touch ref.idx &&
#end if
##
## reads
##
#import re
#set $reads_base = re.sub('[^\w\-\s]', '_', str($reads.element_identifier))
ln -s '$reads' 'reads/${reads_base}.$reads.ext' &&
echo 'reads/${reads_base}.$reads.ext' >> input.fofn &&
##
## mapped reads
##
#if $mapped_reads
ln -s '$mapped_reads' 'halfdeep/ref/mapped_reads/${reads_base}.bam' &&
ln -s '${reads_base}.bam' 'halfdeep/ref/mapped_reads/${reads_base}.sort.bam' &&
ln -s '$mapped_reads.metadata.bam_index' 'halfdeep/ref/mapped_reads/${reads_base}.sort.bam.bai' &&
#end if
##
## run bam_depth.sh
##
bam_depth.sh 'ref.$ref.ext' 1 &&
##
## run halfdeep.sh
##
halfdeep.sh 'ref.$ref.ext'
]]></command>
<inputs>
<param name="ref" type="data" format="fasta,fasta.gz" label="Genome Assembly" help="A Genome Assembly in FASTA format."/>
<param name="reads" type="data" format="fastqsanger,fastqsanger.gz" label="Sequencing Reads" help="Sequencing Reads for the Genome Assembly in FASTQ format."/>
<param name="mapped_reads" type="data" format="bam" optional="true" label="Aligned Reads" help="Optional alignments of the Sequencing Reads to the Genome Assembly in BAM format. If omitted, the reads are mapped with minimap2."/>
</inputs>
<outputs>
<data name="halfdeep_dat" format="bed" from_work_dir="halfdeep/ref/halfdeep.dat" label="HalfDeep on ${on_string}"/>
</outputs>
<tests>
<test expect_num_outputs="1">
<param name="ref" value="ref.fasta.gz" ftype="fasta.gz"/>
<param name="reads" value="reads.fasta.gz" ftype="fasta.gz"/>
<param name="mapped_reads" value="mapped_reads.bam" ftype="bam"/>
<output name="halfdeep_dat" file="halfdeep.bed" ftype="bed"/>
</test>
<test expect_num_outputs="1">
<param name="ref" value="ref.fasta.gz" ftype="fasta.gz"/>
<param name="reads" value="reads.fasta.gz" ftype="fasta.gz"/>
<output name="halfdeep_dat" file="halfdeep.bed" ftype="bed"/>
</test>
</tests>
<help><![CDATA[

HalfDeep identifies genomic regions with half-depth coverage based on sequencing read mappings. These regions may reveal insights into heterogametic sex chromosomes, haplotype-specific variation, or potential assembly errors such as heterotypic duplications.

Given the following inputs:

1. A genome assembly in FASTA format.
2. Reads in FASTQ format.
3. Mapped reads in BAM format (optional).

HalfDeep automates the following tasks:

1. Mapping reads and merging individual mapping files.
2. Calculating per-base read depth.
3. Smoothing read coverage using a defined window with genodsp.
4. Determining the percentile of read coverage.
5. Identifying genomic regions with half-depth coverage based on a specified percentile threshold (e.g., 40–60%) and exporting them in BED format.

HalfDeep produces the following output:

1. HalfDeep: a BED file containing the regions of the genome assembly that are "covered at half depth".
]]></help>
<expand macro="citations"/>
</tool>
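Review note on the `command` block in halfdeep.xml: the Cheetah line `re.sub('[^\w\-\s]', '_', str($reads.element_identifier))` derives the file names that are symlinked into `reads/` and `halfdeep/ref/mapped_reads/`. The sketch below reproduces that substitution in plain Python so its effect on awkward dataset names is easy to check. The function names `sanitize_identifier` and `staged_paths` are hypothetical and exist only for this illustration; the regular expression and the path patterns are copied from the diff.

```python
import re


def sanitize_identifier(identifier: str) -> str:
    """Replace every character that is not a word character, a hyphen,
    or whitespace with an underscore (same pattern as the Cheetah code)."""
    return re.sub(r"[^\w\-\s]", "_", identifier)


def staged_paths(identifier: str, reads_ext: str, has_bam: bool) -> list[str]:
    """List the symlink paths the command block would create.

    Hypothetical helper: it mirrors the path patterns in the diff,
    it is not part of the tool itself.
    """
    base = sanitize_identifier(identifier)
    paths = [f"reads/{base}.{reads_ext}"]
    if has_bam:
        prefix = f"halfdeep/ref/mapped_reads/{base}"
        paths += [f"{prefix}.bam", f"{prefix}.sort.bam", f"{prefix}.sort.bam.bai"]
    return paths


if __name__ == "__main__":
    # Quotes, slashes, dots and parentheses are replaced; spaces are kept.
    print(sanitize_identifier("sample 1 (run/2).fastq"))
    for path in staged_paths("a'b", "fastqsanger.gz", has_bam=True):
        print(path)
```

Because the pattern keeps whitespace, a sanitized name can still contain spaces, which is why every path in the command block stays inside single quotes. Single quotes themselves are replaced, so they cannot break that quoting.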
23 changes: 23 additions & 0 deletions tools/halfdeep/macros.xml
@@ -0,0 +1,23 @@
<macros>
<xml name="requirements">
<requirements>
<requirement type="package" version="@TOOL_VERSION@">halfdeep</requirement>
</requirements>
</xml>
<token name="@TOOL_VERSION@">0.1.0</token>
<token name="@VERSION_SUFFIX@">0</token>
<token name="@PROFILE@">21.05</token>
<xml name="citations">
<citations>
<citation type="bibtex">
@misc{github_halfdeep,
author = {Makova Lab PSU},
year = "2019",
title = {HalfDeep},
publisher = {GitHub},
journal = {GitHub repository},
url = {https://github.com/makovalab-psu/HalfDeep}
}
</citation>
</citations>
</xml>
</macros>
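Review note on macros.xml: the three tokens defined here are substituted textually into halfdeep.xml, so `version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@"` should resolve to `0.1.0+galaxy0`. The sketch below shows that substitution with plain string replacement. It is only an illustration of the result, not Galaxy's macro loader, and `expand_tokens` is a hypothetical helper name.

```python
# Token values copied from macros.xml in this pull request.
TOKENS = {
    "@TOOL_VERSION@": "0.1.0",
    "@VERSION_SUFFIX@": "0",
    "@PROFILE@": "21.05",
}


def expand_tokens(text: str, tokens: dict[str, str]) -> str:
    """Replace each token name with its value (simple textual substitution)."""
    for name, value in tokens.items():
        text = text.replace(name, value)
    return text


if __name__ == "__main__":
    print(expand_tokens("@TOOL_VERSION@+galaxy@VERSION_SUFFIX@", TOKENS))  # 0.1.0+galaxy0
    print(expand_tokens('profile="@PROFILE@"', TOKENS))  # profile="21.05"
```

Keeping the upstream version and the wrapper suffix in separate tokens means a wrapper-only fix bumps `@VERSION_SUFFIX@` without touching the package requirement.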
Empty file.
Binary file added tools/halfdeep/test-data/mapped_reads.bam
Binary file not shown.
Binary file added tools/halfdeep/test-data/reads.fasta.gz
Binary file not shown.
Binary file added tools/halfdeep/test-data/ref.fasta.gz
Binary file not shown.
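Review note on the tool help in halfdeep.xml: steps 2 to 5 (per-base depth, window smoothing, percentile computation, interval calling) can be pictured with the toy sketch below. It is not HalfDeep's implementation, which runs `bam_depth.sh`, `halfdeep.sh` and genodsp. In particular, it assumes that "half depth" means a smoothed depth between half of the 40th and half of the 60th percentile of depth; that reading, the moving-average smoothing, the nearest-rank percentile, and all function names are assumptions made for illustration.

```python
import math
from statistics import mean


def smooth(depth: list[float], window: int) -> list[float]:
    """Centered moving average; the window shrinks at the sequence ends."""
    half = window // 2
    return [mean(depth[max(0, i - half): i + half + 1]) for i in range(len(depth))]


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile, with pct given on a 0-100 scale."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def half_depth_intervals(depth, window=3, low_pct=40, high_pct=60):
    """Return 0-based, half-open (start, end) intervals, as in BED, whose
    smoothed depth lies between half the low and half the high percentile.
    ASSUMPTION: this threshold rule is a guess at the idea, not HalfDeep's code."""
    smoothed = smooth(depth, window)
    low = percentile(smoothed, low_pct) / 2
    high = percentile(smoothed, high_pct) / 2
    intervals, start = [], None
    for pos, value in enumerate(smoothed):
        inside = low <= value <= high
        if inside and start is None:
            start = pos  # an interval opens here
        elif not inside and start is not None:
            intervals.append((start, pos))  # the interval closed before pos
            start = None
    if start is not None:
        intervals.append((start, len(smoothed)))
    return intervals


if __name__ == "__main__":
    # Toy contig: depth 30 everywhere except a 6-base stretch at depth 15.
    toy_depth = [30] * 10 + [15] * 6 + [30] * 10
    print(half_depth_intervals(toy_depth, window=1))
```

On real data the depth is noisy, so the percentile band is what lets positions near, but not exactly at, half the typical depth be reported.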