Binning question, how to use vamb? #54
Comments
Hi Chao, I see; this will not work. It looks like there is an issue in the step where you run VAMB: your contig names must follow a strict format that lets you identify the sample of origin of each contig. Maybe @enryH can help you identify what went wrong.
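To illustrate the naming requirement described above, here is a minimal Python sketch. It assumes a scheme in which each FASTA header becomes sample ID + separator + original contig name (for example `S1Ck141_84347` with separator `C`). The function name `prefix_headers` and the sample ID `S1` are hypothetical and not part of VAMB; the VAMB README describes its own way to concatenate and rename contigs, which should be preferred in practice.

```python
def prefix_headers(fasta_lines, sample_id, separator="C"):
    """Yield FASTA lines with each header rewritten as
    <sample_id><separator><original contig name>.

    Hypothetical helper for illustration only (not a VAMB function).
    Anything after the first whitespace in a header is dropped.
    """
    for line in fasta_lines:
        if line.startswith(">"):
            # Keep only the contig name, e.g. "k141_84347"
            name = line[1:].split()[0]
            yield f">{sample_id}{separator}{name}\n"
        else:
            # Sequence lines pass through unchanged
            yield line


# Example with an in-memory FASTA record
records = [">k141_84347 flag=1\n", "ACGTACGT\n"]
renamed = list(prefix_headers(records, "S1"))
print("".join(renamed), end="")
```

With headers of this shape, splitting on the separator `C` recovers the sample ID (`S1`) for every contig. The BAM files would also need to be produced by mapping against the renamed contigs so that the names match.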
So what is your separator? I guess you know VAMB better than me :)
The file R63.contigs.fa.gz is formatted like this:
Hi Chao, did you manage to run VAMB as described in the example in the VAMB GitHub repo? I think that will solve many of the issues you posted here. I hope that you are trying to process more than just one sample, of course.
It will be hard to help if the naming of the contigs in the fasta file and in the |
Dear @joacjo, thanks a lot!
Hi Chao, the most straightforward way to get vOTUs is to follow the approach described by the CheckV people. Copied from https://bitbucket.org/berkeleylab/checkv/src/master/
The relevant ANI scripts can be found in the CheckV repo.
Dear @joacjo, I have another problem: this step runs very slowly. Is there anything I can do to speed it up? I provided 16 CPUs and 240 GB of memory (#SBATCH -n 16, #SBATCH --mem=240G).
Hi!
Before running phamb, I used VAMB for binning.
VAMB run command: vamb --outdir output63
--fasta R63.contigs.fa.gz
--bamfiles R63_sort.bam
-o C
err.log reports:
Traceback (most recent call last):
  File "/public/home/bioinfo_wang/00_software/miniconda3/envs/avamb/bin/vamb", line 33, in <module>
    sys.exit(load_entry_point('vamb', 'console_scripts', 'vamb')())
  File "/public/home/bioinfo_wang/00_software/vamb/vamb/main.py", line 1395, in main
    run(
  File "/public/home/bioinfo_wang/00_software/vamb/vamb/main.py", line 834, in run
    cluster(
  File "/public/home/bioinfo_wang/00_software/vamb/vamb/main.py", line 665, in cluster
    clusternumber, ncontigs = vamb.vambtools.write_clusters(
  File "/public/home/bioinfo_wang/00_software/vamb/vamb/vambtools.py", line 440, in write_clusters
    for clustername, contigs in clusters:
  File "/public/home/bioinfo_wang/00_software/vamb/vamb/vambtools.py", line 701, in binsplit
    for newbinname, splitheaders in _split_bin(binname, headers, separator):
  File "/public/home/bioinfo_wang/00_software/vamb/vamb/vambtools.py", line 676, in _split_bin
    raise KeyError(f"Separator '{separator}' not in sequence label: '{header}'")
KeyError: "Separator 'C' not in sequence label: 'k141_84347'"
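The traceback suggests that the error is about the separator being absent from the contig name, not about the contig being missing from the data. The following is a simplified Python sketch of that kind of check, written only from the error message above; it is not VAMB's actual implementation, and the function name `split_bin_by_sample` and the output bin-name format are assumptions.

```python
from collections import defaultdict


def split_bin_by_sample(binname, headers, separator):
    """Group the contig headers of one bin by the sample prefix that
    precedes the first occurrence of `separator`.

    Simplified illustration, not VAMB's code. The returned bin names
    (<sample><separator><binname>) are an assumed format.
    """
    by_sample = defaultdict(list)
    for header in headers:
        sample, found, _rest = header.partition(separator)
        if not found:
            # Same message shape as in the traceback above
            raise KeyError(
                f"Separator '{separator}' not in sequence label: '{header}'"
            )
        by_sample[sample].append(header)
    return {
        f"{sample}{separator}{binname}": members
        for sample, members in by_sample.items()
    }


# Headers that carry a sample prefix split cleanly
print(split_bin_by_sample("1", ["S1Ck141_1", "S2Ck141_2", "S1Ck141_3"], "C"))

# A bare assembler name such as 'k141_84347' contains no 'C' and fails
try:
    split_bin_by_sample("1", ["k141_84347"], "C")
except KeyError as err:
    print("KeyError:", err)
```

Under this reading, a name like `k141_84347` has no `C` in it, so splitting bins with `-o C` cannot work until the contigs are renamed to include a sample prefix and the separator.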
However, the contignames file does contain 'k141_84347':
less contignames | grep "k141_84347" -A2 -B2
k141_512747
k141_170723
k141_84347
k141_170724
k141_512748
The VAMB output directory contains:
0 Oct 9 23:52 vae_clusters.tsv # why is this file empty?
7.7M Oct 9 23:52 contignames
2.6M Oct 9 23:52 lengths.npz
41K Oct 9 23:52 log.txt
77M Oct 9 23:52 latent.npz
815K Oct 9 23:51 model.pt
894 Oct 9 14:40 mask.npz
2.3M Oct 9 14:40 abundance.npz
252M Oct 9 14:38 composition.npz
Thanks!