I put one chain from a PDB entry into my library, then run either HHblits or HHsearch against another homologous chain with indels, and the indels do not align between query and target.
Expected Behavior - indels should align
Current Behavior - indels do not align, and the sequence identity is lower than it "obviously" would be if the indels aligned. NCBI BLAST gives 97.37% sequence identity (the indels are in the right place); HHblits reports 88%.
Steps to Reproduce (for bugs)
Put sequence of chain C from 5vol into the library, run query of chain A from 5vol against it. Chain C has a leading PW at the N-terminus, and an indel from 184-190 of QGAVPAD. Chain A has a G at the C-terminus. Otherwise in all respects the two chains have 100% sequence identity.
command to run:
/bmm/soft/linux64/src/hh-suite-bin/bin/hhblits -n 1 -i /bmm/www/servers/phyre2/test/hmm/test_c7xrt//c5volA_.hhblits.hhm -d /bmm/www/servers/phyre2/test/hmm/full -o /bmm/www/servers/phyre2/test/hmm/test_c7xrt//c5volA_.hhblits.hhr -b 100 -norealign -z 500 -alt 1 -aliw 60
HH-suite Output (for bugs)
See the attached file; the interesting part is below. Note that the indel for c5volC_ (target) appears around residues 168-174, but in the query (c5volA_) it appears around 196-202.
Q ss_dssp CCSGGGEEEEEETHHHHHHHHHHHHTTTTCSEEEEESCCSSCCCCTTSHHHHHHHHHHHT
Q ss_pred ccchhheeecccchhHHHHHHHHhhcccccceeeeeccccCccCccccccccccccCCCC
Q c5volA_ 121 IGDRQHRAIAGLSMGGGGATNYGQRHSDMFCAVYAMSALMSIPEDPNSKIAILTRSVIEN 180 (260)
Q Consensus 121 ~~~~~~~~~~gsg~~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 180 (260)
..+..++.+.|.|.|+..+...+...+..+..++..++......................
T Consensus 123 ~~~~~~~~~~GSGg~~a~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 182 (268)
T c5volC_ 123 IGDRQHRAIAGLSMGGGGATNYGQRHSDMFCAVYAMSALMSIPEQGAVPADDPNSKIAIL 182 (268)
T ss_dssp CCSGGGEEEEEETHHHHHHHHHHHHCTTTCSEEEEESCCSSCCSSC---CCCTTSHHHHH
T ss_pred CCCCcccEEEEEccchHHHHHHHHhChHHhHHHhhccccccccccccccccccccCccch
Q ss_dssp CHHHHHHTCCHHHHH-------HHTTSEEEEECCTTCTTHHHHHHHHHHHHHTTCCCEEE
Q ss_pred chHHHHhhcchhhhh-------ccccccccccccccCccchHHHHHHHHHHHCCCcEEEE
Q c5volA_ 181 SCVKYVMEADEDRKA-------DLRSVAWFVDCGDDDFLLDRNIEFYQAMRNAGVPCQFR 233 (260)
Q Consensus 181 ~~~~~~~~~~~~~~~-------~~~~~~~~~~~~~~~~~~~~~~~~~~~L~~~g~~~~~~ 233 (260)
............... ....+++++.+++.|....++++++++|++.|+++++.
T Consensus 183 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~g~~D~~~~~~~~~~~~l~~~g~~~~~~ 242 (268)
T c5volC_ 183 TRSVIENSCVKYVMEADEDRKADLRSVAWFVDCGDDDFLLDRNIEFYQAMRNAGVPCQFR 242 (268)
T ss_dssp HHHHHHTCHHHHHHTCCHHHHHHHTTSEEEEECCTTCTTHHHHHHHHHHHHHTTCCCEEE
T ss_pred hHHHHhcCHHHHHHhcChhhhhhccCceEEEEecCchHhHHHHHHHHHHHHHCCCCcEEE
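To make the effect of the misplaced gap concrete, here is a minimal sketch that counts identical columns in the two alignment blocks quoted above. The function name `percent_identity` is hypothetical, and dividing by the number of columns where neither sequence has a gap is an assumption; HH-suite and BLAST may each define identity differently, so the number this prints is only a rough sanity check and covers only the excerpt, not the full alignment.

```python
def percent_identity(query: str, target: str) -> float:
    """Percent of identical residues over columns where neither row has a gap.

    Assumption: gap columns ('-') are excluded from the denominator.
    """
    if len(query) != len(target):
        raise ValueError("aligned rows must have the same length")
    # Keep only columns where both rows contain a residue.
    columns = [(q, t) for q, t in zip(query, target) if q != "-" and t != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for q, t in columns if q == t)
    return 100.0 * matches / len(columns)


# Aligned rows copied from the two blocks of the HHblits excerpt above.
query_rows = [
    "IGDRQHRAIAGLSMGGGGATNYGQRHSDMFCAVYAMSALMSIPEDPNSKIAILTRSVIEN",
    "SCVKYVMEADEDRKA-------DLRSVAWFVDCGDDDFLLDRNIEFYQAMRNAGVPCQFR",
]
target_rows = [
    "IGDRQHRAIAGLSMGGGGATNYGQRHSDMFCAVYAMSALMSIPEQGAVPADDPNSKIAIL",
    "TRSVIENSCVKYVMEADEDRKADLRSVAWFVDCGDDDFLLDRNIEFYQAMRNAGVPCQFR",
]

if __name__ == "__main__":
    excerpt_identity = percent_identity("".join(query_rows), "".join(target_rows))
    print(f"Identity over the quoted excerpt: {excerpt_identity:.2f}%")
```

Because the gap in the query is opened downstream of where the extra QGAVPAD residues sit in the target, every column between the two positions is paired with the wrong residue, which is what drags the identity down.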
Context
The context is that if a straightforward comparison between two homologous chains appears to give an erroneous alignment, how can I trust it for more complicated alignments with lower sequence identity?
Your Environment
Version/Git commit used: last publicly released version
Server specifications (especially CPU support for AVX2/SSE and amount of system memory): Intel(R) Xeon(R) Gold 6326 CPU @ 2.90GHz (happy to upload the output of 'more /proc/cpuinfo' if that would help), 264GB physical RAM
Operating system and version: Red Hat Enterprise Linux Workstation release 6.6 (Santiago)
c5volA_.hhblits.txt