Releases: gbouras13/phold

v0.2.0

13 Jul 06:20
14e916c

You will need to re-install the updated phold database for v0.2.0 using phold install
You will also need to upgrade Foldseek to v9.427df8a

v0.2.0 is a very large update adding:

  • Improved sensitivity and faster runtime for the foldseek search. The Phold database is clustered at --min-seq-id 0.3 -c 0.8 and a cluster db is created before running foldseek, which significantly improves runtime
    • Overall, just over 1.1M structures are clustered into around 372k clusters
  • Adds the --cluster-search 1 parameter to foldseek search, so that the search runs against the cluster representatives first and then within each cluster. This increases sensitivity and reduces resource usage compared to phold v0.1.4
  • Changed default --max_seqs from 1000 to 10000 to improve sensitivity at little resource usage cost
  • Phold database is expanded adding:
    • Extremely conservative high confidence efam proteins with hits to PHROGs.
    • 95% dereplicated diversity-generating retroelements (DGRs) from Roux et al.
    • 7153 netflax toxin-antitoxin system proteins from Ernits et al.
  • Adds --ultra_sensitive flag which turns off Foldseek prefiltering for maximum sensitivity. Recommended for small datasets/single phages only.
    • This passes the --exhaustive-search parameter to foldseek search
  • Adds the ability to save ProstT5 embeddings with --save_per_residue_embeddings and --save_per_protein_embeddings
  • Adds .cif support for structures (e.g. from the AlphaFold3 server) in addition to the .pdb file format, and changes the CLI to reflect this
  • Removes some experimental parameters from v0.1.4 (--split etc)
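
The search-related changes above can be pictured as a change in how the foldseek search command line is assembled. The following is a minimal Python sketch, not phold's actual code: the function name build_foldseek_search_args and the database paths are hypothetical, and it assumes that phold's --max_seqs is forwarded to Foldseek's --max-seqs option. The release notes only say that --exhaustive-search is passed, so the sketch adds the bare parameter without a value.

```python
from typing import List


def build_foldseek_search_args(
    query_db: str,
    target_db: str,
    result_db: str,
    tmp_dir: str,
    max_seqs: int = 10000,          # new default in v0.2.0 (was 1000)
    ultra_sensitive: bool = False,  # mirrors phold's --ultra_sensitive flag
) -> List[str]:
    """Assemble an argument list for `foldseek search` (illustrative only)."""
    args = [
        "foldseek", "search", query_db, target_db, result_db, tmp_dir,
        # Search cluster representatives first, then within each cluster.
        "--cluster-search", "1",
        # Assumption: phold's --max_seqs maps onto Foldseek's --max-seqs.
        "--max-seqs", str(max_seqs),
    ]
    if ultra_sensitive:
        # Turns off Foldseek prefiltering for maximum sensitivity.
        args.append("--exhaustive-search")
    return args
```

The resulting list could be handed to subprocess.run; keeping the arguments as a list rather than a single string avoids shell quoting problems with unusual paths.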

Breaking CLI parameter changes

  • --pdb has changed to --structures
  • --pdb_dir has changed to --structure_dir
  • --filter_pdbs has changed to --filter_structures
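
For scripts or pipelines written against v0.1.x, the renames above amount to a simple lookup. The sketch below is a hypothetical helper (migrate_args is not part of phold) that rewrites an argument list using only the three renames listed in these notes; everything else is passed through untouched.

```python
from typing import List

# Old flag -> new flag, taken from the list of breaking changes above.
RENAMED_FLAGS = {
    "--pdb": "--structures",
    "--pdb_dir": "--structure_dir",
    "--filter_pdbs": "--filter_structures",
}


def migrate_args(argv: List[str]) -> List[str]:
    """Return a copy of argv with v0.1.x flag names replaced by v0.2.0 names."""
    migrated = []
    for token in argv:
        # Handle both "--flag value" and "--flag=value" spellings.
        name, sep, value = token.partition("=")
        migrated.append(RENAMED_FLAGS.get(name, name) + sep + value)
    return migrated
```

For instance, a token list containing --pdb_dir comes back with --structure_dir in its place, while its value and all unrelated tokens are unchanged. Flags are matched exactly, so --pdb_dir is never mistaken for --pdb.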

v0.1.4

26 Mar 00:29
ac2716e
  • Fixes #31, an issue with older Pharokka genbank input (prior to v1.5.0) that lacked the 'transl_table' field, thanks @btemperton
    • All Pharokka genbank input prior to v1.5.0 will be treated as transl_table 11 (these versions predate the addition of pyrodigal-gv)
  • Fixes genbank parsing bug that would occur if the ID/locus tag of the CDS features in the input genbank were longer than 54 characters
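
The transl_table fix above is essentially a default applied when the qualifier is missing. A minimal sketch of that idea follows; it is not phold's actual code. The function name get_transl_table is hypothetical, and the qualifiers are assumed to be a dictionary mapping qualifier names to lists of strings, as in Biopython's SeqFeature.qualifiers.

```python
from typing import Dict, List

# Fallback value stated in the release note above.
DEFAULT_TRANSL_TABLE = 11


def get_transl_table(qualifiers: Dict[str, List[str]]) -> int:
    """Return the CDS translation table, falling back to 11 if it is absent."""
    values = qualifiers.get("transl_table")
    if not values:
        # Older Pharokka output (prior to v1.5.0) has no 'transl_table' field.
        return DEFAULT_TRANSL_TABLE
    return int(values[0])
```

Using a lookup with a fallback rather than direct indexing means a missing qualifier no longer raises a KeyError.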

v0.1.3

19 Mar 07:05
1766024
  • Adds compatibility with Apple Silicon (M1/M2/M3) GPUs
  • Fixes memory issue for phold plot with many contigs

v0.1.2

06 Mar 04:00
d017ef9
  • Fixes a cds_id issue in phold compare when the input file was FASTA
  • Fixes issues with phold remote when the input file was FASTA
  • Improved documentation with conda/mamba install

v0.1.1

05 Mar 06:51
235eef9
  • Restructuring for pip and conda installation

v0.1.0 Initial Release

05 Mar 00:01
e93dafb
  • Initial release of Phold