How to get original SEQRES? #1949
Comments
Hello,
In [19]: p = parsePDB('1BKV')
In [20]: checkNonstandardResidues(p.select('name CA'))
Actually, I did it already. It is useful to have resid as well.
In [1]: from prody import *
In [2]: p = parsePDB('1BKV')
In [3]: checkNonstandardResidues(p.select('name CA'))
I think we should be able to parse the SEQRES records and return the three-letter version, but .getSequence is probably not the answer. In the one-letter version, we alias non-standard residues to the nearest standard amino acid. There is also a small annoyance: the sequence is repeated for every atom unless you use a Chain object or select the CA atoms. We'll have a look, but I think this shouldn't be too hard.
For what it's worth, we do have this:
OK, I have a solution in a new pull request that I'm about to make. It will give you a new keyword argument, threeLetter, that you can use to get your desired behaviour:
OK, I think this is now essentially ready. Please feel free to look over the pull request, check out the branch jamesmkrieger:seqres_threeLetter and try it, and let us know if we can add anything else.
@lokapal please let us know if these comments and changes meet your requirements.
Everything seems to be fixed now. Thank you very much!
You’re welcome.
I need the SEQRES in its original form, not translated to single-letter code, because I need to find the non-standard amino acids that are present in some proteins (most frequently hydroxyproline and hydroxylysine). I could detect them by scanning .resname from chemical and searching for those names in the original SEQRES. Right now that obviously doesn't work.
I'd like to get the SEQRES as a list of strings, or at least as one big string of three-letter codes separated by spaces. Is there a way to do that?
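As a stopgap that does not depend on ProDy at all, the request above can be sketched by reading the SEQRES records directly from the PDB text. This is only a minimal sketch: it assumes the standard fixed-column PDB layout (chain ID in column 12, residue names from column 20), the function names `seqres_three_letter` and `find_nonstandard` are hypothetical helpers, and the sample records are made up for illustration rather than copied from 1BKV.

```python
from collections import defaultdict

# Made-up SEQRES records in PDB fixed-column format (illustration only).
SAMPLE_PDB_TEXT = (
    "SEQRES   1 A    6  PRO HYP GLY PRO HYP GLY\n"
    "SEQRES   1 B    4  GLY LYS ALA PRO\n"
)

# The 20 standard amino acids; anything else is reported as non-standard.
STANDARD = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


def seqres_three_letter(pdb_text):
    """Return {chain_id: [three-letter residue names]} from SEQRES records."""
    chains = defaultdict(list)
    for line in pdb_text.splitlines():
        if not line.startswith("SEQRES"):
            continue
        chain_id = line[11]  # column 12 holds the chain identifier
        # Residue names occupy columns 20 onward, separated by spaces.
        chains[chain_id].extend(line[19:].split())
    return dict(chains)


def find_nonstandard(residues):
    """Return (1-based SEQRES position, name) for each non-standard residue."""
    return [(i, name) for i, name in enumerate(residues, start=1)
            if name not in STANDARD]


seq = seqres_three_letter(SAMPLE_PDB_TEXT)
print(" ".join(seq["A"]))        # one big string of three-letter codes
print(find_nonstandard(seq["A"]))
```

Note that the positions returned here index into the SEQRES list, which need not match the residue numbers used in the ATOM records.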
Moreover, your forced conversion introduces errors:
1BKV:
prot.sequence:
We know from MODRES that the hydroxyprolines are at the following positions:
ProDy replaces ALL of them with normal proline, which is obviously an error.
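For reference, the MODRES information mentioned above can also be read straight from the PDB text. The following is a rough sketch, assuming the standard fixed-column MODRES layout (modified residue name in columns 13-15, chain in column 17, residue number in columns 19-22, parent standard residue in columns 25-27). The helper `parse_modres` is hypothetical, and the sample records use a placeholder ID and made-up positions, not the real 1BKV data.

```python
# Made-up MODRES records in PDB fixed-column format (illustration only).
SAMPLE_MODRES_TEXT = (
    "MODRES XXXX HYP A    2  PRO  4-HYDROXYPROLINE\n"
    "MODRES XXXX HYP A    5  PRO  4-HYDROXYPROLINE\n"
)


def parse_modres(pdb_text):
    """Return a list of (chain, resnum, modified_name, standard_name) tuples."""
    records = []
    for line in pdb_text.splitlines():
        if not line.startswith("MODRES"):
            continue
        modified = line[12:15].strip()   # columns 13-15: residue name as deposited
        chain = line[16]                 # column 17: chain identifier
        resnum = int(line[18:22])        # columns 19-22: residue sequence number
        standard = line[24:27].strip()   # columns 25-27: parent standard residue
        records.append((chain, resnum, modified, standard))
    return records


for chain, resnum, modified, standard in parse_modres(SAMPLE_MODRES_TEXT):
    # A one-letter conversion that uses `standard` loses the modification.
    print(f"chain {chain} residue {resnum}: {modified} (parent {standard})")
```

Comparing the `modified` column with a one-letter sequence makes the information loss explicit: every position listed here would appear as a plain proline after conversion.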