PacBio sequence data also permits the analysis of methylated bases within the sequence, which can be extremely helpful to the scientific community. For example, the precise positions of those modified bases can be used to determine the specificity of the DNA methyltransferases that produced them. The PacBio analysis suite contains an analysis workflow (RS_Modification_and_Motif_Analysis) that detects these modified bases and produces several files:
- motif_summary.csv
- modifications.csv
- modifications.gff
- motifs.gff
For prokaryotic genomes, it would benefit the scientific community if you performed this analysis and submitted at least the motif_summary.csv file via one of these routes:
- You can upload the files when you submit the WGS genome to GenBank at https://submit.ncbi.nlm.nih.gov/subs/wgs
- After you've submitted the genome to GenBank, you can email the motif_summary.csv file to us. Be sure to include the genome's accession number or submission ID.
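Before sending the file by either route, it may help to confirm that motif_summary.csv is a well-formed CSV with at least one motif row. The following is a minimal sketch, not part of the PacBio suite or of any NCBI tool: the function name `summarize_motif_csv` is hypothetical, and the check deliberately makes no assumption about column names, since they are not listed here.

```python
import csv
import io


def summarize_motif_csv(text):
    """Return (header, number_of_data_rows) for CSV text.

    Raises ValueError if the text is empty, has no data rows, or has rows
    whose column count differs from the header. Hypothetical helper for a
    pre-submission sanity check; no column names are assumed.
    """
    # Parse the text and drop rows that are entirely blank.
    rows = [
        row
        for row in csv.reader(io.StringIO(text))
        if any(cell.strip() for cell in row)
    ]
    if not rows:
        raise ValueError("file is empty")

    header, data = rows[0], rows[1:]
    if not data:
        raise ValueError("header found but no motif rows")

    # Report rows (counted among non-blank rows, header = 1) with a
    # column count that does not match the header.
    bad = [i for i, row in enumerate(data, start=2) if len(row) != len(header)]
    if bad:
        raise ValueError(f"rows with unexpected column count: {bad}")

    return header, len(data)
```

To check a real file, read it with `open(path, newline="").read()` and pass the resulting text to the function; a `ValueError` suggests the analysis output should be inspected before submission.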
Links to the public base modification files are listed in this file on the FTP site: ftp://ftp.ncbi.nlm.nih.gov/pub/supplementary_data/basemodification.csv