Hello lucidrain, first of all, thank you very much for the PyTorch reproduction of AlphaFold 3.
The problem is in the file inputs.py, in the function extract_canonical_molecules_from_biomolecule_chains.
In some CIF files, the coordinates of certain key ions are set to the origin. As a result, the line res_atom_positions = atom_positions[res_ligand_atom_mask]
does not yield the expected result, and res_atom_elements comes back empty.
When these values are later passed to create_mol_from_atom_positions_and_types, it raises the exception ValueError: The length of atom_elements and xyz_coordinates must be the same.
You can reproduce this problem using the 1qyl_assembly1.cif file as input, which has two vanadium ions that are each at the origin with 25% probability.
Hi, @wanggaa. Thanks for your kind words on this project!
My intuition tells me that this issue is caused by said ions having a zero vector for their coordinates, as one can see is possible in the construction of Biomolecule objects here (which are subsequently used to build PDBInputs -> MoleculeInputs -> AtomInputs):
For such ions, the single atom_mask value is 1 even though the coordinates may be all zeros. Subsequently, as you noticed, extract_canonical_molecules_from_biomolecule_chains keeps only the atom elements of a given molecule that belong to atoms with non-null (non-zero) coordinates. This is what usually produces the mismatch between the element count and the coordinate count. The mismatch is caught later on, which causes the PDB structure to be "rejected" by our dataloader and replaced with another example for training/validation.
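To make the mismatch concrete, here is a minimal pure-Python sketch of the mechanism described above. It is not the code from inputs.py: the toy data, the two filtering expressions, and the check_lengths helper are assumptions made for illustration, and which of the two sequences ends up shorter in the real code may differ. Only the variable names and the error message are taken from this thread.

```python
# Toy residue: a vanadium ion at the origin plus one ordinary atom.
# All values here are made up for illustration.
atom_positions = [(0.0, 0.0, 0.0), (1.2, 0.5, -0.3)]
atom_mask = [True, True]  # both atoms are flagged as present
atom_elements = ["V", "O"]

# Positions are selected by the atom mask alone, so the ion is kept.
res_atom_positions = [
    pos for pos, present in zip(atom_positions, atom_mask) if present
]

# Elements are additionally required to have non-zero coordinates
# (assumed filter), so the ion at the origin is dropped.
res_atom_elements = [
    elem
    for elem, pos, present in zip(atom_elements, atom_positions, atom_mask)
    if present and any(coord != 0.0 for coord in pos)
]


def check_lengths(atom_elements, xyz_coordinates):
    """Hypothetical stand-in for the length check that raises downstream."""
    if len(atom_elements) != len(xyz_coordinates):
        raise ValueError(
            "The length of atom_elements and xyz_coordinates must be the same."
        )


try:
    check_lengths(res_atom_elements, res_atom_positions)
except ValueError as err:
    print(f"{len(res_atom_elements)} element(s) vs "
          f"{len(res_atom_positions)} position(s): {err}")
```

The point of the sketch is only that two filters with different criteria, applied to arrays that must stay aligned, go out of sync as soon as a present atom sits exactly at the origin.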
I've noticed this occurrence for other PDB IDs, and in such cases, it's an open question how best to handle such PDB structures. Ideas or pull requests for better ways to handle such edge cases are very much welcome.