Problem with using custom phold database #45

Open
rozwalak opened this issue Jun 27, 2024 · 0 comments
Labels
enhancement (New feature or request), question (Further information is requested)

Comments

@rozwalak

  • phold version: 0.1.4
  • Python version: 3.11
  • Operating System: Linux

Description

Hi,
I'm trying to build my own database with phold createdb and then run some protein sequences against this newly created database.

What I Did

I started by running "phold proteins-predict" on my custom protein set. Next, I ran "phold createdb" on the output of that step. Finally, I tried to compare my protein sequences (the proteins-predict output) against the newly created custom database.

However, I encountered a warning:
"Phold Database file my_foldseek_db/phold_foldseek_db/all_phold_prostt5 is missing."

And an error:
"Phold database not found. Please run phold install -d my_foldseek_db/ to download and install the phold database."

So I ran "phold install -d my_foldseek_db/", but that installed the full phold database. I had expected it to make my custom database usable instead.
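
For anyone trying to reproduce this, here is a minimal sketch of how the mismatch can be checked on disk. It only compares file names that appear in this issue: the three outputs that the createdb log further down reports writing, and the one file named in the warning. The function name report_db_files is my own, and nothing here calls phold or assumes anything about how phold looks up its database internally.

```python
from pathlib import Path

# File names copied from this issue; none of this queries phold itself.
# The first three are the outputs named in the createdb log,
# the last one is the file named in the warning message.
CREATEDB_OUTPUTS = [
    "phold_foldseek_db",
    "phold_foldseek_db_ss",
    "phold_foldseek_db_h",
]
MISSING_IN_WARNING = "phold_foldseek_db/all_phold_prostt5"


def report_db_files(db_dir):
    """Return {relative_path: exists} for the files named in the issue."""
    root = Path(db_dir)
    names = CREATEDB_OUTPUTS + [MISSING_IN_WARNING]
    return {name: (root / name).exists() for name in names}


if __name__ == "__main__":
    for name, present in report_db_files("my_foldseek_db").items():
        status = "found  " if present else "missing"
        print(f"{status}  {name}")
```

If the first three entries are found and the last one is missing, createdb did write its files and the comparison step is looking for a file that createdb does not produce under that name.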

Also, is it possible to create my own database based only on ProstT5 predictions, without PDB structures?

Below is the log from "phold createdb". It suggests that the database was created correctly.

2024-06-26 06:34:10.652 | INFO     | phold.utils.util:begin_phold:72 - phold: annotating phage genomes with protein structures
2024-06-26 06:34:10.653 | INFO     | phold.utils.util:begin_phold:74 - You are using phold version 0.1.4
2024-06-26 06:34:10.653 | INFO     | phold.utils.util:begin_phold:75 - Repository homepage is https://github.com/gbouras13/phold
2024-06-26 06:34:10.654 | INFO     | phold.utils.util:begin_phold:76 - You are running phold createdb
2024-06-26 06:34:10.654 | INFO     | phold.utils.util:begin_phold:77 - Listing parameters
2024-06-26 06:34:10.654 | INFO     | phold.utils.util:begin_phold:79 - Parameter: --fasta_aa phold_08_phrogs_train/phold_aa.fasta
2024-06-26 06:34:10.654 | INFO     | phold.utils.util:begin_phold:79 - Parameter: --fasta_3di phold_08_phrogs_train/phold_3di.fasta
2024-06-26 06:34:10.654 | INFO     | phold.utils.util:begin_phold:79 - Parameter: --output my_foldseek_db
2024-06-26 06:34:10.655 | INFO     | phold.utils.util:begin_phold:79 - Parameter: --threads 1
2024-06-26 06:34:10.655 | INFO     | phold.utils.util:begin_phold:79 - Parameter: --force False
2024-06-26 06:34:10.655 | INFO     | phold.utils.util:begin_phold:79 - Parameter: --prefix phold_foldseek_db
2024-06-26 06:34:10.853 | INFO     | phold.utils.validation:check_dependencies:117 - Foldseek version found is v8.ef4e960
2024-06-26 06:34:10.853 | INFO     | phold.utils.validation:check_dependencies:126 - Foldseek version is ok
2024-06-26 06:34:10.854 | INFO     | phold:createdb:1048 - Creating the Foldseek database using phold_08_phrogs_train/phold_aa.fasta and phold_08_phrogs_train/phold_3di.fasta.
2024-06-26 06:34:10.854 | INFO     | phold:createdb:1049 - The database will be saved in the my_foldseek_db directory and be called phold_foldseek_db.
2024-06-26 06:34:17.875 | INFO     | phold.utils.external_tools:run:44 - Started running foldseek tsv2db my_foldseek_db/aa.tsv my_foldseek_db/phold_foldseek_db --output-dbtype 0 ...
2024-06-26 06:34:18.378 | INFO     | phold.utils.external_tools:run:46 - Done running foldseek tsv2db my_foldseek_db/aa.tsv my_foldseek_db/phold_foldseek_db --output-dbtype 0
2024-06-26 06:34:18.382 | INFO     | phold.utils.external_tools:run:44 - Started running foldseek tsv2db my_foldseek_db/3di.tsv my_foldseek_db/phold_foldseek_db_ss --output-dbtype 0 ...
2024-06-26 06:34:18.905 | INFO     | phold.utils.external_tools:run:46 - Done running foldseek tsv2db my_foldseek_db/3di.tsv my_foldseek_db/phold_foldseek_db_ss --output-dbtype 0
2024-06-26 06:34:18.909 | INFO     | phold.utils.external_tools:run:44 - Started running foldseek tsv2db my_foldseek_db/header.tsv my_foldseek_db/phold_foldseek_db_h --output-dbtype 12 ...
2024-06-26 06:34:19.195 | INFO     | phold.utils.external_tools:run:46 - Done running foldseek tsv2db my_foldseek_db/header.tsv my_foldseek_db/phold_foldseek_db_h --output-dbtype 12
2024-06-26 06:34:19.347 | INFO     | phold.utils.util:end_phold:101 - phold createdb has finished
2024-06-26 06:34:19.348 | INFO     | phold.utils.util:end_phold:102 - Elapsed time: 8.71 seconds
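
To make the createdb step easier to rerun, the sketch below rebuilds the command line from the "Parameter:" lines in the log above. The flag names and values are copied from the log. Leaving out "--force False" is an assumption on my part that it is a boolean switch that was not set. The helper name build_createdb_command is hypothetical, and the proteins-predict and compare commands are not reconstructed because their arguments are not listed in this issue.

```python
import shlex

# Values copied from the "Parameter:" lines in the createdb log.
# "--force False" is omitted on the assumption that it is an unset
# boolean switch rather than an option that takes a value.
logged_parameters = {
    "--fasta_aa": "phold_08_phrogs_train/phold_aa.fasta",
    "--fasta_3di": "phold_08_phrogs_train/phold_3di.fasta",
    "--output": "my_foldseek_db",
    "--threads": "1",
    "--prefix": "phold_foldseek_db",
}


def build_createdb_command(params):
    """Build the argument list for the logged phold createdb run."""
    command = ["phold", "createdb"]
    for flag, value in params.items():
        command += [flag, value]
    return command


if __name__ == "__main__":
    # Print a shell-safe, copy-pasteable form of the command.
    print(shlex.join(build_createdb_command(logged_parameters)))
```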
@gbouras13 added the enhancement and question labels on Jul 11, 2024

2 participants