Hi Felix,
I'm hoping you can help with an issue I'm having with paired-end sequencing. I'm sequencing an amplicon that is 350 bp long. The first ~300 bases on the forward read match what I expect. Biological and technical replicates match closely. When I run PE alignment, the data follows no pattern I can recognize. The PE alignment rate is low and seemingly random. SE alignment of R2 using --pbat gives a fine alignment rate, but the data doesn't make much sense to me. Would you mind taking a look? This is a repetitive region in hg38. I've also attached the unconverted amplicon.
Alright, I took a quick look but didn't go into a lot of detail.
I found that R1 and R2 can each be aligned to the amplicon separately, but the mapping efficiency is fairly low (~15%) in default mode. It increases to ~30% and ~50% when the parameters are relaxed to --score_min L,0,-0.4 or L,0,-0.6, respectively. Using --local, it goes up to >80%. So there appear to be mismatches to the reference that cause the low mapping in default mode, and I believe these come from the first 9-10 bp of each read:
Read1
Read2
Indeed, almost 100% of all reads start with these bases; not exactly sure how the experiment was designed, but these residues don't seem to align to your reference.
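A quick way to confirm this kind of 5' bias yourself is to count how often the most common leading 10-mer occurs. The following is only a minimal sketch: the function names are made up for illustration, it assumes standard 4-line FASTQ records, and it only samples the first reads of the file.

```python
import gzip
from collections import Counter
from itertools import islice


def read_sequences(path):
    """Yield the sequence line of each record in a 4-line FASTQ file (plain or .gz)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of every 4-line record is the sequence
                yield line.strip()


def top_prefix_fraction(sequences, k=10, max_reads=100000):
    """Return the most common k-base prefix and the fraction of sampled reads starting with it."""
    counts = Counter(seq[:k] for seq in islice(sequences, max_reads))
    total = sum(counts.values())
    if total == 0:
        return None, 0.0
    prefix, n = counts.most_common(1)[0]
    return prefix, n / total
```

For example, `top_prefix_fraction(read_sequences("R1.fastq.gz"))` (the file name is a placeholder) returns the dominant prefix and its share of reads. A share close to 1.0 would match what I am seeing here.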
Running trim_galore --clip_r1 10 --clip_r2 10 --paired *fastq.gz followed by a relaxed Bismark run produces a good number of PE alignments. PE alignments in --local mode (>80%) are also 10 bp shorter on either end when compared to the SE alignments.
Shown are Read 1 and Read 2 as single-end alignments, with reads on top and a wiggle plot below showing the abrupt start and end of the fragments. The bottom track shows local alignments. Of note, their start and end are 10 bp shorter, as a consequence of soft-clipping the 100% biased positions at the 5' end of R1 and R2.
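To give a rough feel for why relaxing --score_min helps, here is a small sketch of the arithmetic behind the linear function L,a,b, which sets the minimum alignment score to a + b * read_length in end-to-end mode. The assumptions are mine: a read length of 300 bp (not stated above), Bismark's default of L,0,-0.2, and Bowtie 2's default maximum mismatch penalty of 6 for high-quality bases. Gaps, Ns and lower-quality bases (which are penalised less) are ignored, so treat the numbers as a ballpark only.

```python
def min_score(read_length, intercept=0.0, slope=-0.2):
    """Minimum alignment score for a linear --score_min function L,intercept,slope."""
    return intercept + slope * read_length


def max_mismatches(read_length, slope, mismatch_penalty=6):
    """Approximate number of full-penalty mismatches that still fit the score budget."""
    budget = abs(min_score(read_length, 0.0, slope))
    # Small epsilon guards against floating-point error at exact multiples of the penalty.
    return int((budget + 1e-9) // mismatch_penalty)


if __name__ == "__main__":
    for slope in (-0.2, -0.4, -0.6):
        print(slope, min_score(300, 0.0, slope), max_mismatches(300, slope))
```

Under these assumptions the budget is about 10, 20 and 30 full-penalty mismatches for slopes of -0.2, -0.4 and -0.6. So a non-matching stretch of up to 10 bp at the start of each read could by itself use up most of the default budget, which would fit the low default mapping efficiency.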
Thank you very much for your help
BISMID-NTG-200-1_S0_L001_R1_001.fastq (2).gz
BISMID-NTG-200-1_S0_L001_R2_001.fastq (2).gz
BISMID-NTG-200-2_S1_L001_R1_001.fastq (2).gz
BISMID-NTG-200-2_S1_L001_R2_001.fastq (2).gz
BIS-MID.txt
BISMID-NTG-200-3_S2_L001_R1_001.fastq (1).gz
BISMID-NTG-200-3_S2_L001_R2_001.fastq (1).gz