Hi, we are trying to add custom barcode sequences right next to the RT primer for our dRNA-Seq library with the RNA004 kit. The barcode sequence is directly ligated to the RT primer sequence.
Here's the structure of our sequence:
-------reads-------AAAAAA--barcodes--primer--adapter
We are using Dorado for basecalling. Adapters, primers, and the barcodes are all trimmed, and the Dorado documentation says this trimming is mandatory for cDNA and RNA libraries. The command I used for basecalling is:
dorado basecaller -x auto --no-trim --barcode-arrangement bc-arrangement.toml --barcode-sequences bc.fa [email protected] ./pod5/ > test.bam
This doesn't work because the barcodes are all trimmed and there is no barcode information in the output BAM file.
Below is my arrangement file:
name = "custom_barcode"
kit = "BC"
The mask sequences are the four flanking bases of my barcode sequences, since my barcodes are long enough.
Below is my barcode sequence file:
>BC01
CACAxxxxxxxxxxxTCTT
>BC02
ACAGxxxxxxxxxxxTCGA
I was expecting to see demultiplexed reads separated by the two barcode sequences at the 3' end of the reads.
But after basecalling, the barcodes are trimmed and the reads are not classified by the barcode information I provided.
Does Dorado support demultiplexing on 3' end barcodes only? Why does Dorado trim the barcode sequences as well, instead of trimming only the primer and sequencing adapter? Does Dorado detect the poly(A) signal and trim everything after the poly(A) tail?
Dorado version: 0.7.0+71cc744
Thanks!
For custom barcodes only at the 3' end, you need to include the setting:
[arrangement]
rear_only_barcodes = true
in the arrangement toml file.
Are your barcode/primer bases RNA or DNA? For RNA basecalling, dorado automatically trims any DNA signal at the 3' end since the RNA basecall model is highly unlikely to give accurate basecalls on DNA.
Thanks @malton-ont! Yes, we did use DNA oligos for barcoding. Is it possible to basecall with an RNA model and a DNA model separately to get the RNA sequence and the DNA barcode sequence? If yes, do you have a recommended DNA model for this?
Hi @YOUZhen93 , this is unfortunately a bit more involved than simply basecalling it with different basecallers (which doesn't work as there's no DNA model for RNA pores) but you can check out ADAPTed (made by my amazing colleague Wiep van der Toorn) to extract the DNA portion of your signal (including the variable signal of your barcodes), and then you can explore some sort of clustering/classification based on the raw signal. This is also how we pre-process the data for our dRNA multiplexing method WarpDemuX.
Full arrangement file from the original post:
name = "custom_barcode"
kit = "BC"
mask1_front = "CACA"
mask1_rear = "TCTT"
mask2_front = "ACAG"
mask2_rear = "TCGA"
# Barcode sequences
barcode1_pattern = "BC%02i"
barcode2_pattern = "BC%02i"
first_index = 1
last_index = 96
# Scoring options
[scoring]
min_soft_barcode_threshold = 0.2
min_hard_barcode_threshold = 0.2
min_soft_flank_threshold = 0.3
min_hard_flank_threshold = 0.3
min_barcode_score_dist = 0.1