MDBENCHGNN/preprocessing at main · M3RG-IITD/MDBENCHGNN

README.md
check_keys_n_shape.py
npz_to_xyz.py
xyz_to_npz.py

README.md

Data formats used in this package:

I) The .npz format is NumPy's archive format. A single .npz file can store multiple arrays, which makes it convenient for bundling related data together. Each array in the file is stored under a unique key, and that key is used to access the array when the data is loaded.
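As a minimal sketch of the key-based access described above, the snippet below bundles two toy arrays into an .npz archive and reads them back. The array contents are placeholders, and an in-memory buffer stands in for a file on disk; neither comes from this repository.

```python
import io

import numpy as np

# Two small arrays bundled under separate keys (toy values, not real data).
positions = np.zeros((2, 3, 3))          # (frames, atoms, xyz)
atomic_numbers = np.array([3, 15, 16])   # one entry per atom

buffer = io.BytesIO()                    # stands in for a file on disk
np.savez(buffer, positions=positions, atomic_numbers=atomic_numbers)
buffer.seek(0)

data = np.load(buffer)
keys = sorted(data.files)                # names of the stored arrays
loaded_shape = data["positions"].shape   # access one array by its key
print(keys, loaded_shape)
```

With a real file, the same calls take a path instead of a buffer, for example `np.load("nequip_npz.npz")`.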

II) The Extended XYZ (.xyz) format is a file format commonly used for representing molecular structures and atomic coordinates. It provides a simple and flexible way to store atomic positions, element types, and additional properties associated with each atom. Each frame can represent a different time step, molecular conformation, or related structure.
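To make the frame layout concrete, here is a rough sketch that formats a single frame by hand. It follows the common Extended XYZ convention (atom count, then a comment line carrying `Lattice`, `Properties`, and other key=value pairs, then one line per atom). The helper name `format_extxyz_frame` is hypothetical, and the exact fields written by this repository's conversion scripts may differ.

```python
def format_extxyz_frame(symbols, positions, cell, energy, pbc=(True, True, True)):
    """Return one Extended XYZ frame as a string (hypothetical helper)."""
    # The 3x3 cell is flattened row by row into the Lattice attribute.
    lattice = " ".join(f"{value:.6f}" for row in cell for value in row)
    pbc_flags = " ".join("T" if flag else "F" for flag in pbc)
    comment = (
        f'Lattice="{lattice}" Properties=species:S:1:pos:R:3 '
        f'energy={energy:.6f} pbc="{pbc_flags}"'
    )
    lines = [str(len(symbols)), comment]
    for symbol, (x, y, z) in zip(symbols, positions):
        lines.append(f"{symbol} {x:.6f} {y:.6f} {z:.6f}")
    return "\n".join(lines) + "\n"


# Toy three-atom frame in a cubic box (values are placeholders).
frame = format_extxyz_frame(
    symbols=["Li", "P", "S"],
    positions=[(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    cell=[(10.0, 0.0, 0.0), (0.0, 10.0, 0.0), (0.0, 0.0, 10.0)],
    energy=-1.5,
)
print(frame)
```

A multi-frame file is simply several such frames written one after another.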

NPZ and XYZ files can be converted into each other.

Information needed:

'atomic_numbers'
'cell'
'forces'
'energy'
'positions'
'pbc' # periodic boundary condition in all 3 directions

Example of accepted data: present at example/lips/data

=============Lips data info============
Keys for data:

- key: pbc with shape (3,)
- key: pos with shape (1000, 83, 3)
- key: energy with shape (1000, 1)
- key: forces with shape (1000, 83, 3)
- key: cell with shape (3, 3)
- key: atomic_numbers with shape (83,)
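A listing like the one above can be produced with a few lines of NumPy. The sketch below is not the repository's check_keys_n_shape.py; `describe_arrays` is a hypothetical helper, the default `required` tuple is an assumption, and the toy dictionary only mimics the LiPS key names with far fewer frames.

```python
import numpy as np


def describe_arrays(arrays, required=("atomic_numbers", "energy", "forces")):
    """Print each key with its shape; return the required keys that are missing."""
    for key in arrays:
        print(f"key: {key} with shape {np.shape(arrays[key])}")
    return [key for key in required if key not in arrays]


# Toy stand-in with the same keys as the LiPS example (placeholder values).
n_frames, n_atoms = 4, 83
toy = {
    "pbc": np.array([True, True, True]),
    "pos": np.zeros((n_frames, n_atoms, 3)),
    "energy": np.zeros((n_frames, 1)),
    "forces": np.zeros((n_frames, n_atoms, 3)),
    "cell": np.eye(3),
    "atomic_numbers": np.ones(n_atoms, dtype=int),
}

missing = describe_arrays(toy)
# Positions and forces should have identical (frames, atoms, 3) shapes.
shapes_agree = toy["pos"].shape == toy["forces"].shape
```

For a real file, the arrays can be passed as `dict(np.load(path))`.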

NOTE:

Nequip, Allegro, Equiformer, and TorchmdNET can work with just atomic numbers, forces, energy, and positions.
By default, the models expect the xyz file to be named botnet.xyz and the npz file to be named nequip_npz.npz.

All the models (Nequip, Allegro, Equiformer, TorchmdNET, Mace, and Botnet) support both the npz and xyz formats for training.

Converting XYZ to NPZ: run the xyz_to_npz.py script.

Converting NPZ to XYZ: for example, `python preprocessing/npz_to_xyz.py --input-filepath example/lips/data/val/nequip_npz.npz` will create an xyz file at example/lips/data/val/botnet.xyz

IMPORTANT NOTE: If you run into errors during data conversion:

Get the data into npz format and cross-check its keys and shapes against example/lips/data/train/nequip_npz.npz.
Then create train, val, and test splits laid out like the ones in example/lips/data.
Then convert each npz file to xyz.
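The splitting step above could be sketched as follows. This is an illustration under stated assumptions, not the procedure used to build the example data: `split_frames` is a hypothetical helper, the 80/10/10 fractions and the seed are arbitrary, and the per-frame keys are taken from the LiPS key names.

```python
import numpy as np


def split_frames(arrays, per_frame_keys=("pos", "energy", "forces"),
                 fractions=(0.8, 0.1), seed=0):
    """Shuffle frames and split per-frame arrays into train/val/test dicts."""
    n_frames = len(arrays[per_frame_keys[0]])
    order = np.random.default_rng(seed).permutation(n_frames)
    n_train = int(fractions[0] * n_frames)
    n_val = int(fractions[1] * n_frames)
    index_sets = {
        "train": order[:n_train],
        "val": order[n_train:n_train + n_val],
        "test": order[n_train + n_val:],   # whatever remains
    }
    splits = {}
    for name, idx in index_sets.items():
        # Per-frame arrays are sliced along the frame axis; shared arrays
        # (cell, pbc, atomic_numbers) are copied into every split unchanged.
        splits[name] = {
            key: value[idx] if key in per_frame_keys else value
            for key, value in arrays.items()
        }
    return splits


# Toy data: 10 frames of 5 atoms, with energies 0..9 so frames are traceable.
toy = {
    "pos": np.zeros((10, 5, 3)),
    "forces": np.zeros((10, 5, 3)),
    "energy": np.arange(10, dtype=float).reshape(10, 1),
    "cell": np.eye(3),
    "pbc": np.array([True, True, True]),
    "atomic_numbers": np.ones(5, dtype=int),
}
splits = split_frames(toy)
```

Each split can then be written with `np.savez(path, **splits["train"])` into its own train, val, or test directory before running the npz to xyz conversion.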
