Novoplasty Completed Successfully but Did Not Produce Contigs in Output File #231

Open
meeranhussain opened this issue Jul 29, 2024 · 16 comments

@meeranhussain

I recently used NOVOPlasty to assemble the mitogenome from short-read data of Microctonus aethiopoides ecotypes. The run completed successfully but did not produce any contigs. I had initially assembled the mitogenomes from Oxford Nanopore Technologies (ONT) long-read data for eight samples using Flye and obtained circularized genomes of 29-32 kb, which is unusually large for insect mitogenomes. To validate these results, I ran NOVOPlasty on the short-read data from the same samples, expecting contigs that I could compare with the Flye assemblies, but none were generated. I also asked on Biostars (https://www.biostars.org/p/9599074/) about the large mitogenomes but did not get suggestions that helped with validation. I would appreciate any suggestions!


@ndierckx
Owner

ndierckx commented Aug 5, 2024

Insects can have a long repetitive control region, so those lengths are possible.
Can you send me the extended log file? It seems the assembly had already reached 25 kbp, so I'm not sure what went wrong.
But it seems you have a long repetitive region, so for an accurate length of that region it is best to rely on the Nanopore reads.

@meeranhussain
Author

Hi, thanks for your reply. This genome appears to have long repetitive regions, but I am also concerned about potential misassemblies. I say this because I tested the same long-read mitogenome assembly method on Calliphora sp. ONT data (whose mitogenome is typically 15-16 kb), and Flye produced a 32 kb circular contig, which raises concerns about misassembly. Any suggestions you have would be helpful. I also tried NOVOPlasty with a smaller k-mer value, but it still didn't produce a circular contig. I’ve attached the log file for your reference.
log_extended_Maethio_13 (1).txt

@ndierckx
Owner

ndierckx commented Aug 7, 2024

At least this assembly outputted the assembled sequence, but it is probably not possible to accurately assemble the complete genome with just short reads. Do you also have long reads for this sample?

@meeranhussain
Author

Yes, I did try assembling with the ONT reads, but it gave a long 32 kb contig with a lot of repeats in the control region, which I think is due to misassembly.

@ndierckx
Owner

ndierckx commented Aug 7, 2024

If you have short and long reads from the species (preferably same sample), it should be easy to assemble. I do have an unpublished hybrid assembler that I used for another user before: https://www.mdpi.com/1422-0067/24/4/3976
Can't share the code yet but maybe I could run it for you

I have a new long read assembler I just put online, which works much better than Flye. I will create a Docker in the future because Perl modules can be annoying to install on a cluster, but it doesn't need any memory so maybe you can run it on your desktop or laptop: https://github.com/ndierckx/NOVOLoci

@meeranhussain
Author

Thanks, that's so nice of you, but I would first like to try your long-read assembler (NOVOLoci). If it still doesn't work, I will come back to you for the hybrid method.

@meeranhussain
Author

Hi,
Back after a long gap. I tried NOVOLoci, but it is taking a long time to run (still running); I will update once it's done.

As an alternative method, I used SPAdes in meta mode and obtained a contig of 15,324 bp with high coverage (~700x). After annotating this contig with MITOS2, all genes were successfully identified. Interestingly, this result aligns well with my NOVOPlasty assembly. However, I am unsure how to confirm whether the contig is circular, identify the circular break points, or how to make it circular.
Your guidance on these aspects would be extremely helpful.
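
On the circularity question above: one common heuristic (not described in this thread, and not NOVOPlasty's own method) is to check whether the end of a linear contig repeats its start, which is what an assembler often leaves behind after walking once around a circular molecule. A minimal Python sketch, assuming exact matching and using hypothetical names (`find_terminal_overlap`, `circularize`, `min_overlap`) chosen only for illustration:

```python
def find_terminal_overlap(seq, min_overlap=20):
    """Return the length of the longest suffix of seq that exactly equals
    a prefix of seq (at least min_overlap bases), or 0 if there is none."""
    seq = seq.upper()
    # An overlap longer than half the contig is not considered here.
    for k in range(len(seq) // 2, min_overlap - 1, -1):
        if seq[:k] == seq[-k:]:
            return k
    return 0


def circularize(seq, min_overlap=20):
    """Trim the duplicated end if a terminal overlap is found.

    Returns (sequence, is_circular_candidate)."""
    k = find_terminal_overlap(seq, min_overlap)
    if k:
        return seq[:-k], True
    return seq, False


# Toy example: a 50 bp "genome" whose first 12 bases are repeated at the end.
genome = "ACGTACGGTCAGTTCAAGCTAGGCTTACGATCCGATAGCTTAGGCATCGA"
contig = genome + genome[:12]
trimmed, is_candidate = circularize(contig, min_overlap=10)
```

A terminal overlap is only supporting evidence. In a genome with a repetitive control region the overlap can be spurious or absent, so reads that map across the putative junction would be a stronger check.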

@ndierckx
Owner

ndierckx commented Dec 2, 2024

Hi, if the long-read assemblies were around 32 kbp, I think the 15k one is incomplete.
The NOVOPlasty one was already 25 kbp, so...
The control region can be full of tandem duplications, so SPAdes will probably truncate them; it usually does that.

If you need help in trying NOVOLoci, let me know, the version online was just a beta version, will soon upload a new one with manual.

Are the ONT and Illumina reads from the same sample?
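
To see whether a long assembly really contains tandem duplications like those mentioned above, a quick scan for periodic stretches can help. The following is only a rough sketch under strong assumptions: it requires the repeat period to be known or guessed, matches bases exactly (so sequencing errors in ONT-based assemblies will break runs), and `tandem_runs`, `period`, and `min_copies` are hypothetical names, not part of any tool in this thread.

```python
def tandem_runs(seq, period, min_copies=2):
    """Return (start, end) spans where seq repeats itself exactly with the
    given period over a span at least min_copies * period bases long."""
    seq = seq.upper()
    n = len(seq)
    runs = []
    run_start = None
    for i in range(n - period):
        if seq[i] == seq[i + period]:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            # Matching positions run_start..i-1 imply a periodic span
            # from run_start to i - 1 + period (inclusive).
            end = i + period
            if end - run_start >= period * min_copies:
                runs.append((run_start, end))
            run_start = None
    if run_start is not None and n - run_start >= period * min_copies:
        runs.append((run_start, n))
    return runs


# Toy example: three copies of a 5 bp unit between non-repetitive flanks.
example = "GGATT" + "ACGTC" * 3 + "TGGAA"
spans = tandem_runs(example, period=5)
```

Dedicated tools handle inexact repeats and unknown periods far better; this only illustrates the idea of looking for repeated units before deciding that a 32 kbp contig is a misassembly.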

@meeranhussain
Author

meeranhussain commented Dec 2, 2024

Yes, the ONT and Illumina reads are from the same sample. I definitely need your help to get it assembled. Would it be possible to share the code of the new tool? Or how do you suggest we proceed from here?

@ndierckx
Owner

ndierckx commented Dec 2, 2024

It would probably work with the hybrid assembler, but I never made it ready for public use (I need to make changes in the code to run it).
If I find the time in the future, I will add a long-read module to NOVOPlasty, but it won't be any time soon.

The easiest would be that you send me the reads and I will run it. If the assembly works from the first try, it won't be much work for me

@meeranhussain
Author

Sounds good. Can you share your email address?

@ndierckx
Owner

ndierckx commented Dec 2, 2024

nicolasdierckxsensathotmaildotcom

@meeranhussain
Author

Thanks, I have emailed you.

@meeranhussain
Author

Hi,

Can you confirm if you were able to access the files?

@ndierckx
Owner

Hi, sorry, I just downloaded them today. The mail ended up in my spam, so I didn't notice it before. I will run it tomorrow and let you know how it went.

@meeranhussain
Author

Perfect! Thanks for the update
