Error during quantification of FASTQ files #118
Comments
Hi @dBenedek, yes, Whippet only accepts the standard four-line FASTQ format as input (described here: https://support.illumina.com/bulletins/2016/04/fastq-files-explained.html). The extra read ID after the '+' is the problem. This is a standard expectation, but I'll update the documentation to reflect this. |
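For anyone who wants to check a file before running whippet-quant, here is a minimal sketch that counts records whose third line is anything other than a bare "+". It assumes strict four-line records (no wrapped sequences); the function name `count_nonbare_plus_lines` and the file name in the usage comment are hypothetical and not part of Whippet.

```shell
# Count FASTQ records whose separator line (line 3 of each 4-line record)
# is not exactly "+". Reads uncompressed FASTQ on standard input.
# Assumption: every record spans exactly four lines.
count_nonbare_plus_lines() {
    awk 'NR % 4 == 3 && $0 != "+" { n++ } END { print n + 0 }'
}

# Hypothetical usage:
#   zcat sample_1.fastq.gz | count_nonbare_plus_lines
```

A result of 0 suggests the separator lines are already in the bare "+" form described above.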
Thanks for the quick reply! |
Thanks for posting the issue, @dBenedek. I got quite lost when I hit this error. This unusual third line is quite common in my datasets, so in case someone is interested: I converted it to Illumina's standard third line and piped the result into whippet-quant with a small awk command (it replaces the third line of every four-line record with "+"). In your example, this should work:
R1=/home/bd1/mds-datasets-no-backup/dataset2/fastq/SRR6781235_1.fastq.gz
R2=/home/bd1/mds-datasets-no-backup/dataset2/fastq/SRR6781235_2.fastq.gz
julia /home/bd1/tools/Whippet.jl/bin/whippet-quant.jl \
<(zcat $R1 | awk -v count="0" '++count==3{{$0="+";count=-1}} 1') \
<(zcat $R2 | awk -v count="0" '++count==3{{$0="+";count=-1}} 1') \
-x /home/bd1/research_mds/data/whippet_index/whippet.jls \
-o test \
--biascorrect |
Hi MiqG, I tried your solution here, but Whippet still failed to run. Have you resolved it? Could you post your detailed solution here? Thanks. Best regards |
Hi Benedek, Have you resolved this issue? Thanks. Best regards, Mingming |
Hi, do you have any suggestions on how to remove the redundant information in the third line of the FASTQ files? Best regards, ML |
Hello, Yes, sorry, I forgot to answer. Best wishes, |
Hi Benedek, would you please share how you corrected the FASTQ files to remove the redundant information in the third line? I have the same problem, but could not find a way to modify the FASTQ files. Thanks so much in advance. Best regards, Mingming |
Use sed. For example, with one sample ID per line on standard input:
xargs -P14 -I % sh -c 'zcat fastq/%_1.fastq.gz | sed "s/^+SRR.*/+/g" >fastq_whippet/%_1.fastq && gzip fastq_whippet/%_1.fastq;'
Then do the same for the _2 files. |
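A position-based alternative to the pattern match, sketched under the assumption of strict four-line records: it rewrites line 3 of every record regardless of its content, so it does not depend on the read IDs starting with "SRR" and cannot touch a quality line that happens to begin with "+SRR". The function name `normalize_plus_line` and the paths in the usage comment are hypothetical.

```shell
# Replace the separator line (line 3 of each 4-line FASTQ record) with a
# bare "+". Reads uncompressed FASTQ on stdin, writes FASTQ to stdout.
# Assumption: every record spans exactly four lines.
normalize_plus_line() {
    awk 'NR % 4 == 3 { print "+"; next } { print }'
}

# Hypothetical usage:
#   zcat fastq/sample_1.fastq.gz | normalize_plus_line | gzip > fastq_whippet/sample_1.fastq.gz
```

Selecting by line position rather than by content keeps header, sequence, and quality lines untouched, but it would corrupt a file whose sequences are wrapped across several lines.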
Hi, sorry, I forgot to add the "zcat" command that reads the FASTQ files and pipes them on. I have edited my previous answer to fix this. Now the code looks like this:
R1=/home/bd1/mds-datasets-no-backup/dataset2/fastq/SRR6781235_1.fastq.gz
R2=/home/bd1/mds-datasets-no-backup/dataset2/fastq/SRR6781235_2.fastq.gz
julia /home/bd1/tools/Whippet.jl/bin/whippet-quant.jl \
<(zcat $R1 | awk -v count="0" '++count==3{{$0="+";count=-1}} 1') \
<(zcat $R2 | awk -v count="0" '++count==3{{$0="+";count=-1}} 1') \
-x /home/bd1/research_mds/data/whippet_index/whippet.jls \
-o test \
--biascorrect
I hope it works for you now. Cheers! |
Thanks @MiqG for your help. It works successfully. Best regards |
Thanks @dBenedek for your help. |
Hello,
I generated the Whippet index file:
And then tried to run the quantification:
The quantification step reports the following error message:
Is this error related to the different format of my FASTQ files?
My FASTQ files look like this:
Thanks,
Benedek