Allow non-coordinate-sorted bam input to eXpress #581
Thanks for the contribution. The checks are currently not running because GitHub Actions changed a bit. They should run again after #582 is merged. Q: Should the sorted bam types be removed? Also, a bump of the tool version is necessary.
Can you rebase the PR branch? Then the tests will run.
Thanks for the fixes to allow the checks to pass. I tried removing the sorted bam input from the wrapper, but in my local tests it seems that sorted bam is still accepted as input, causing eXpress to fail. I've made the change in the wrapper anyway, since this should be an invalid input type, and will file an issue with Galaxy main regarding disallowing sorted bam in such cases. I see that there is a tool version in the wrapper, and an eXpress version in tool_dependencies.xml. These are both set to 1.1.1 currently. Is this a coincidence? I assume I should only bump the tool wrapper version number.
You can remove the file
I've removed that file, and updated eXpress to the most recent version I'm seeing in bioconda.
I should have asked this earlier: what exactly is the problem with sorted bam files? The manual suggests that sorting is necessary: "If you aligned your reads with Bowtie, your alignments will be properly ordered already. If you used another tool, you should ensure that they are properly sorted"
Sorry, I should have been explicit when referring to "sorted": eXpress does require sorted input, but it needs to be sorted by read name rather than by coordinate.
Galaxy seems to define the "bam" type as being coordinate sorted, and when I used a query-name sorted bam as input to the current eXpress wrapper, Galaxy converted it to a coordinate-sorted bam before running eXpress, causing eXpress to fail. Adding the qname_sorted.bam type allows this bam to be used as is, sorted by read name.
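To illustrate the difference between the two orderings, here is a minimal sketch of a check on SAM text records. It assumes that what matters to eXpress is that all alignments sharing a read name sit next to each other, which read-name sorting guarantees and coordinate sorting generally breaks. The function name `qnames_contiguous` is made up for this example and is not part of the wrapper or of eXpress.

```shell
#!/bin/sh
# Hypothetical helper (not part of the wrapper): read SAM records on stdin and
# succeed only if every read name (column 1) forms one contiguous run of lines.
qnames_contiguous() {
    awk -F '\t' '
        /^@/ { next }                       # skip SAM header lines
        $1 != prev {
            # A name that was already seen in an earlier run means its
            # alignments are split apart, as after a coordinate sort.
            if ($1 in seen) { bad = 1; exit }
            seen[$1] = 1
            prev = $1
        }
        END { exit (bad ? 1 : 0) }
    '
}

# Possible usage, assuming samtools is installed and input.bam exists:
#   samtools view -h input.bam | qnames_contiguous && echo "names are grouped"
```

The sketch keeps every read name in memory, so it is only meant for small test files.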
 <description>Quantify the abundances of a set of target sequences from sampled subsequences</description>
 <requirements>
-<requirement type="package" version="1.1.1">eXpress</requirement>
+<requirement type="package" version="1.5.1">eXpress</requirement>
 </requirements>
 <command>
 express --no-update-check
-express --no-update-check
+#if $bamOrSamFile.ext == "qname_sorted.bam"
+ln -s '$bamOrSamFile' hits.sorted
+#else
+samtools sort -n '$bamOrSamFile' hits.sorted
+#end if
+express --no-update-check
Plus: replace $bamOrSamFile on the last line of the command block with hits.sorted (btw. all input files need to be single quoted).
Maybe in the else branch one could check with samtools view -H ... | grep -q "@HD\t.*\tSO:unsorted" whether the sam / bam file is already sorted before calling sort again.
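A rough sketch of such a header check, in case it helps. It reads the header text on stdin and reports the SO value of the @HD line. The function names are invented for this example, and awk is used to split on tabs because `\t` in a plain grep pattern is not portably treated as a tab. Per the SAM specification, a header without an SO tag is treated as "unknown".

```shell
#!/bin/sh
# Hypothetical helper: print the SO (sort order) value from the @HD line of a
# SAM header read on stdin, or "unknown" if there is no @HD line or no SO tag.
sam_sort_order() {
    awk -F '\t' '
        $1 == "@HD" {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^SO:/) { print substr($i, 4); found = 1; exit }
            }
        }
        END { if (!found) print "unknown" }
    '
}

# Hypothetical helper: succeed when the header already declares read-name
# order, so that a second name sort could be skipped.
is_queryname_sorted() {
    [ "$(sam_sort_order)" = "queryname" ]
}

# Possible usage in the else branch, assuming samtools is on the PATH and
# input.bam stands in for the real input file:
#   samtools view -H input.bam | is_queryname_sorted || echo "needs a name sort"
```

Note that this only trusts what the header declares. A file whose header says SO:unsorted or has no SO tag would still go through the sort.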
eXpress requires input bams that are not coordinate sorted, but having the input type set to "bam" causes the bam to be coordinate sorted before running. This PR adds "qname_sorted.bam" and "unsorted.bam" input types to avoid this sorting.