add threeLetter SEQRES #1950
ok, now the CIF version is working too:
This now works with chain getSequence with allres=True too:
Works fine. Check.
In [1]: from prody import *
In [2]: ag = parsePDB('1bkv', compressed=False)
@> Connecting wwPDB FTP server RCSB PDB (USA).
@> 1bkv downloaded (1bkv.pdb)
@> PDB download via FTP completed (1 downloaded, 0 failed).
@> 692 atoms and 1 coordinate set(s) were parsed in 0.02s.
In [3]: for chain in ag.iterChains():
   ...:     print(chain.getSequence(threeLetter=True, allres=True))
   ...:
HYP GLY PRO HYP GLY PRO HYP GLY ILE THR GLY ALA ARG GLY LEU ALA GLY PRO HYP GLY PRO HYP GLY PRO HYP GLY PRO HYP GLY HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH
PRO HYP GLY PRO HYP GLY PRO HYP GLY ILE THR GLY ALA ARG GLY LEU ALA GLY PRO HYP GLY PRO HYP GLY PRO HYP GLY PRO HYP GLY ACY ACY ACY ACY HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH
PRO HYP GLY PRO HYP GLY PRO HYP GLY ILE THR GLY ALA ARG GLY LEU ALA GLY PRO HYP GLY PRO HYP GLY PRO HYP GLY PRO HYP GLY ACY HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH HOH
In [4]: for chain in ag.iterChains():
   ...:     print(chain.getSequence(threeLetter=True))
   ...:
GLY PRO GLY PRO GLY ILE THR GLY ALA ARG GLY LEU ALA GLY PRO GLY PRO GLY PRO GLY PRO GLY
PRO GLY PRO GLY PRO GLY ILE THR GLY ALA ARG GLY LEU ALA GLY PRO GLY PRO GLY PRO GLY PRO GLY
PRO GLY PRO GLY PRO GLY ILE THR GLY ALA ARG GLY LEU ALA GLY PRO GLY PRO GLY PRO GLY PRO GLY
In [5]: polys = parseCIFHeader('1bkv', 'polymers', threeLetter=True)
@> Connecting wwPDB FTP server RCSB PDB (USA).
@> 1bkv downloaded (1bkv.cif)
@> PDB download via FTP completed (1 downloaded, 0 failed).
In [6]: for prot in polys:
   ...:     SEQRES = prot.sequence
   ...:     print(SEQRES)
   ...:
PRO HYP GLY PRO HYP GLY PRO HYP GLY ILE THR GLY ALA ARG GLY LEU ALA GLY PRO HYP GLY PRO HYP GLY PRO HYP GLY PRO HYP GLY
PRO HYP GLY PRO HYP GLY PRO HYP GLY ILE THR GLY ALA ARG GLY LEU ALA GLY PRO HYP GLY PRO HYP GLY PRO HYP GLY PRO HYP GLY
PRO HYP GLY PRO HYP GLY PRO HYP GLY ILE THR GLY ALA ARG GLY LEU ALA GLY PRO HYP GLY PRO HYP GLY PRO HYP GLY PRO HYP GLY
In [7]: ag.ca.getSequence(threeLetter=True)
Out[7]: 'GLY PRO GLY PRO GLY ILE THR GLY ALA ARG GLY LEU ALA GLY PRO GLY PRO GLY PRO GLY PRO GLY PRO GLY PRO GLY PRO GLY ILE THR GLY ALA ARG GLY LEU ALA GLY PRO GLY PRO GLY PRO GLY PRO GLY PRO GLY PRO GLY PRO GLY ILE THR GLY ALA ARG GLY LEU ALA GLY PRO GLY PRO GLY PRO GLY PRO GLY'
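In the session above, the default call drops HYP from the chain sequences, while allres=True keeps HYP along with ACY and HOH. A minimal pure-Python sketch of that kind of filtering follows. It is an illustration only, not ProDy's implementation; the `RECOGNIZED` set and the `three_letter_sequence` helper are hypothetical names introduced here.

```python
# Illustration only: a toy model of filtering residue names before joining
# them into a three-letter sequence. This is NOT ProDy's actual code.

# Hypothetical set of residue names treated as amino acids by the filter.
RECOGNIZED = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def three_letter_sequence(resnames, allres=False, recognized=RECOGNIZED):
    """Join residue names with spaces.

    When allres is False, names missing from `recognized` are dropped,
    which is why a modified residue such as HYP can disappear.
    """
    kept = list(resnames) if allres else [r for r in resnames if r in recognized]
    return " ".join(kept)


# A shortened, made-up chain containing a modified residue, a ligand and water.
chain = ["PRO", "HYP", "GLY", "ILE", "THR", "ACY", "HOH"]

print(three_letter_sequence(chain))               # PRO GLY ILE THR
print(three_letter_sequence(chain, allres=True))  # PRO HYP GLY ILE THR ACY HOH
```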
Actually, I should probably call it something different, and I think I still need to add it to the docs and write some tests. It should hopefully work for 4-letter resnames like TIP3 or HISE if we parse them like that, but I need to check.
ok, this doesn't completely work for 4-character resnames like HISE, because they don't get selected in chain.calpha
now it works after adding HISE and various others from the InSty list to NONSTANDARD in atomic.flags
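The effect described above can be sketched with a toy registry: a residue name is only picked up by the selection once it appears in the set of known names. This is an assumption-laden illustration, not the contents or API of ProDy's atomic.flags; `STANDARD`, `NONSTANDARD` and `selected_resnames` here are stand-ins with made-up contents.

```python
# Toy stand-ins for residue-name registries; contents are illustrative and
# do not reflect what ProDy's atomic.flags actually holds.
STANDARD = {"GLY", "ALA", "HIS"}
NONSTANDARD = set()


def selected_resnames(resnames):
    """Return only the residue names known to either registry."""
    known = STANDARD | NONSTANDARD
    return [r for r in resnames if r in known]


residues = ["GLY", "HISE", "ALA"]

before = selected_resnames(residues)  # HISE is unknown, so it is skipped
NONSTANDARD.add("HISE")               # register the 4-character name
after = selected_resnames(residues)   # HISE is now selected

print(" ".join(before))  # GLY ALA
print(" ".join(after))   # GLY HISE ALA
```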
fixes #1949
So far this works for PDB headers and atomic objects only. I'm still working on the CIF header part.
What we can do so far is the following: