
Error while reading old trained models and dimer descriptors issue #627

Open
OlgaChalykh opened this issue Dec 16, 2023 · 5 comments

@OlgaChalykh

Dear QUIP developers,

I am working through Max Veit's paper https://pubs.acs.org/doi/abs/10.1021/acs.jctc.8b01242 and exploring the archive of trained models, datasets, and training scripts published at https://www.repository.cam.ac.uk/items/f8cfd6c4-4323-4d29-b05d-177928150a45.

It looks like the QUIP commands have changed since the paper was published, and I have some questions about this:

  1. When I try to use any of the trained models, I get the error:
    SYSTEM ABORT: Potential_read_params_xml: could not initialize potential from xml_label
    Can I modify the trained model files so that they can be used?

  2. I tried to modify the training script for the 6-D dimer GAP. I removed core_param_file and replaced the general_dimer descriptor with A2_dimer, keeping the other parameters and the training set the same, but got the error:

SYSTEM ABORT: Traceback (most recent call last)
File "/project/src/libAtoms/Topology.f95", line 2536 kind unspecified
Cannot find pair for atom index 2           
STOP 1

This looks like a problem with the monomer cutoffs, but increasing those parameters did not solve it. Unfortunately, I could not find detailed instructions on how to work with dimer descriptors. How can I solve this problem?

Training script I use:

!gap_fit at_file=./bulk-methane-fit-dimer/repo-fit-dimer/me-rigid-shortaug3-gscc.xyz \
gap={A2_dimer cutoff=6.0 \
     cutoff_transition_width=1.0 \
     signature_one={6 1 1 1 1} \
     signature_two={6 1 1 1 1} \
     monomer_one_cutoff=1.5 \
     monomer_two_cutoff=1.5 \
     atom_ordercheck=F \
     strict=F\
     mpifind=T \
     theta_uniform=1.0 \
     covariance_type=ARD_SE \
     n_sparse=2000\
     delta=0.02 \
     sparse_method=CUR_COVARIANCE } \
default_sigma={0.0002 0.002 0.0 0.0}\
sparse_jitter=1e-10 \
energy_parameter_name=energy \
force_parameter_name=force \
e0=0.0 \
gp_file=./models/gp.xml \
do_copy_at_file=F

The original training script from the paper's supplementary material:

teach_sparseat_file=me-rigid-shortaug3-mp2-avqz-intnonan.xyz core_param_file={../empirical-pots/ljrep_quip_params.xml} core_ip_args={IPLJ} gap={ general_dimercutoff=6.0 cutoff_transition_width=1.0 signature_one={{61 11 1}} signature_two={{6 11 1 1}} monomer_one_cutoff=1.5monomer_two_cutoff=1.5atom_ordercheck=F strict=Fmpifind=T theta_uniform=1.0covariance_type=ARD_SE n_sparse=2000delta=0.02 sparse_method=CUR_COVARIANCE } default_sigma={0.0002 0.0020.0 0.0}sparse_jitter=1e-10 energy_parameter_name=energyforce_parameter_name=forcee0=0.0 gp_file=gp-merig-mp2-gendim-shortaug3.xml do_copy_at_file=F

@gabor1
Contributor

gabor1 commented Dec 16, 2023

the original command with correct spaces is this:

teach_sparse at_file=$AT_FILE core_param_file={ljrep_quip_params.xml} ip_args={IP LJ} descriptor_str={ \
    general_dimer cutoff=6.0 cutoff_transition_width=1.0 monomer_one_cutoff=1.5 monomer_two_cutoff=1.5 atom_ordercheck=F strict=F signature_one={{6 1 1 1 1}} signature_two={{6 1 1 1 1}} \
        theta_uniform=1.0 covariance_type=ARD_SE n_sparse=2000 delta=0.02} \
    default_sigma={0.002 0.02 0.0} sparse_jitter=1e-10 gp_file=$GP_FILE do_copy_at_file=F sparse_separate_file=T\
    energy_parameter_name=energy force_parameter_name=force e0=0.0

I think this should work if you place the ljrep_quip_params.xml file in the current directory. It is quite possible that the missing spaces were the problem. For example, core_ip_args has to be {IP LJ}, not {IPLJ}.
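
To illustrate why a missing space before a line-continuation backslash matters, here is a minimal Python sketch. It assumes POSIX-shell joining, where a backslash followed by a newline is deleted and nothing is inserted in its place, so a continuation line that starts in the first column is glued onto the previous word. The helper names `join_continuations`, `split_top_level` and `fused_tokens` are made up for this example and are not part of QUIP; the check itself (more than one `=` in a single word) is only a heuristic.

```python
def join_continuations(script: str) -> str:
    """Join lines the way a POSIX shell does: drop backslash-newline, insert nothing."""
    return script.replace("\\\n", "")


def split_top_level(text: str) -> list[str]:
    """Split on whitespace, but keep {...} groups together as one token."""
    tokens, current, depth = [], [], 0
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch.isspace() and depth == 0:
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens


def fused_tokens(text: str) -> list[str]:
    """Return tokens that carry more than one '=' outside braces (likely two glued keys)."""
    found = []
    for token in split_top_level(text):
        depth, equals = 0, 0
        for ch in token:
            if ch == "{":
                depth += 1
            elif ch == "}":
                depth -= 1
            elif ch == "=" and depth == 0:
                equals += 1
        if equals > 1:
            found.append(token)
        # Descend into a key={...} group to check the arguments inside it.
        if "{" in token and token.endswith("}"):
            found.extend(fused_tokens(token[token.index("{") + 1:-1]))
    return found


# Hypothetical two-line command with no space before the backslash and no
# leading space on the continuation line.
example = "gap_fit at_file=train.xyz e0=0.0\\\ngp_file=gp.xml"
print(fused_tokens(join_continuations(example)))  # ['e0=0.0gp_file=gp.xml']
```

Running a command through such a check before passing it to the shell flags any pair of arguments that would reach the program as a single word.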

The error you got says that, although you asked to look for methane dimers in which each of the 4 Hs is within 1.5 Å of the C, it could not find one of the Hs. Are you sure your initial geometry file is correct? Can you post your initial structure here?
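
The condition described above can be checked outside QUIP with a short Python sketch. This is not the algorithm in Topology.f95, only a plain geometric test under stated assumptions: it ignores periodic boundary conditions, uses 0-based atom indices, and the function names `hydrogens_within_cutoff` and `incomplete_methanes` are invented for this example.

```python
import math


def hydrogens_within_cutoff(species, positions, cutoff=1.5):
    """Map each carbon index to the indices of H atoms closer than `cutoff` (in Å)."""
    result = {}
    for i, (s_i, p_i) in enumerate(zip(species, positions)):
        if s_i != "C":
            continue
        result[i] = [
            j
            for j, (s_j, p_j) in enumerate(zip(species, positions))
            if s_j == "H" and math.dist(p_i, p_j) < cutoff
        ]
    return result


def incomplete_methanes(species, positions, cutoff=1.5, n_hydrogens=4):
    """Return the carbon indices that do not have exactly `n_hydrogens` H neighbours."""
    neighbours = hydrogens_within_cutoff(species, positions, cutoff)
    return [c for c, hs in neighbours.items() if len(hs) != n_hydrogens]


# Made-up methane with one hydrogen pulled far away from the carbon.
species = ["C", "H", "H", "H", "H"]
positions = [
    (0.0, 0.0, 0.0),
    (0.63, 0.63, 0.63),
    (-0.63, -0.63, 0.63),
    (-0.63, 0.63, -0.63),
    (3.0, -3.0, -3.0),
]
print(incomplete_methanes(species, positions))  # [0]
```

A frame for which this returns a non-empty list is a candidate for the frame that the dimer descriptor cannot split into two complete monomers.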

@OlgaChalykh
Author

When I run the original command with corrected spaces (replacing teach_sparse with gap_fit, because I got the error param_read_line: unknown key teach_sparse),

!gap_fit at_file=./datasets/dimer_dataset.xyz core_param_file={ljrep_quip_params.xml} ip_args={IP LJ} descriptor_str={ \
general_dimer cutoff=6.0 cutoff_transition_width=1.0 monomer_one_cutoff=1.5 monomer_two_cutoff=1.5 atom_ordercheck=F strict=F signature_one={{6 1 1 1 1}} signature_two={{6 1 1 1 1}} \
theta_uniform=1.0 covariance_type=ARD_SE n_sparse=2000 delta=0.02} \
default_sigma={0.002 0.02 0.0} sparse_jitter=1e-10 gp_file=gap.xml do_copy_at_file=F sparse_separate_file=T\
energy_parameter_name=energy force_parameter_name=force e0=0.0

I get the error:

param_read_line: unknown key ip_args
SYSTEM ABORT: Exit: Mandatory argument(s) missing...   
STOP 1

With the ljrep_quip_params.xml potential used as core_param_file in the command above (as well as with any other trained model I got from here), when I run:

from quippy.potential import Potential
gap = Potential(param_filename='./bulk-methane-fit-dimer/repo-fit-dimer/ljrep_quip_params.xml')

I get the error:
Potential_read_params_xml: could not initialise potential from xml_label. param_str=<!-- Parameters from the L-J plus repulsion model --> <LJREP> <Potential label="ljrep" init_args="IP LJ"/> <LJ_params n_types="2" only_inter_resid="T" label="ljrep-lj"> <per_type_data type="1" atomic_num="1" /> <per_type_data type="2" atomic_num="6" /> <per_pair_data type1="2" type2="2" sigma="3.52608449" eps6="0.0540000776" eps12="0.0540000776" cutoff="14.0" energy_shift="F" linear_force_shift="F" /> <per_pair_data type1="1" type2="2" sigma="1.0" eps6="0.0" eps12="517.030" cutoff="10.0" energy_shift="F" linear_force_shift="F" /> <per_pair_data type1="1" type2="1" sigma="1.0" eps6="0.0" eps12="23.4878" cutoff="10.0" energy_shift="F" linear_force_shift="F" /> </LJ_params> </LJREP>


One of the structures from the dataset I used to fit the GAP with the A2_dimer descriptor:

10
energy=-0.00998655 ediff_cc=-0.00109638 cutoff=-1.00000000 nneightol=1.20000000 pbc="T T T" Lattice="30.00000000       0.00000000       0.00000000       0.00000000      30.00000000       0.00000000       0.00000000       0.00000000      30.00000000" Properties=species:S:1:pos:R:3:Z:I:1:resid:I:1:force:R:3
C               0.67655467      1.53558322     -1.64751401       6     141      0.00152822     -0.00057094     -0.00049239
H               0.62038120      1.03628756     -0.68231701       1     141     -0.00331393     -0.00627386      0.00891613
H               0.38843306      0.83959819     -2.43258845       1     141     -0.00145188     -0.00151616     -0.00092832
H               1.69528466      1.87629793     -1.82028809       1     141      0.00139596      0.00108906     -0.00065731
H               0.00216686      2.38948973     -1.65395806       1     141     -0.00122184      0.00083637     -0.00013244
C              -0.67655467     -1.53558322      1.64751401       6     148      0.00278940      0.00189125      0.00103092
H              -0.15292844     -1.79565316      2.56522617       1     148      0.00099199      0.00006716      0.00114083
H               0.00113054     -1.63653049      0.80215612       1     148      0.00310865     -0.00134432     -0.00668266
H              -1.02845731     -0.50776289      1.70952859       1     148     -0.00314982      0.00507783     -0.00042915
H              -1.52591635     -2.20229213      1.51312161       1     148     -0.00067677      0.00074360     -0.00176559
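
For reading a frame like this by hand, the `Properties=` entry in the comment line lists the per-atom columns as `name:type:count` triplets (S for string, R for real, I for integer, L for logical). The following is a minimal Python sketch of decoding that entry; `parse_properties` is a name chosen for this example and is not a QUIP or quippy function.

```python
def parse_properties(spec: str):
    """Turn 'name:type:count:...' into (name, type, first_column, end_column) tuples."""
    fields = spec.split(":")
    if len(fields) % 3 != 0:
        raise ValueError("Properties must be a sequence of name:type:count triplets")
    columns, start = [], 0
    for name, kind, count in zip(fields[0::3], fields[1::3], fields[2::3]):
        width = int(count)
        columns.append((name, kind, start, start + width))
        start += width
    return columns


layout = parse_properties("species:S:1:pos:R:3:Z:I:1:resid:I:1:force:R:3")

# Decode the first atom line of the frame using the column layout.
row = "C 0.67655467 1.53558322 -1.64751401 6 141 0.00152822 -0.00057094 -0.00049239".split()
atom = {name: row[a:b] for name, _kind, a, b in layout}
print(atom["resid"])  # ['141']
```

With this layout each atom line has nine columns, and the `resid` column is the one that labels which molecule an atom belongs to in this file.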

The dataset itself is probably fine, because I can successfully train this model on it:

!gap_fit at_file=./bulk-methane-fit-dimer/repo-fit-dimer/me-rigid-shortaug3-gscc.xyz \
gap={distance_Nb order=2 \
                 cutoff=10.0 \
                 covariance_type=ARD_SE \
                 theta_uniform=1.0 \
                 n_sparse=15 \
                 delta=1.0} \
e0=0.0 \
default_sigma={0.01 0.5 0.0 0.0} \
do_copy_at_file=F sparse_separate_file=F \
gp_file=./models/gap_2b.xml

@gabor1
Contributor

gabor1 commented Dec 17, 2023

Yes, that has changed: replace "ip_args" with "core_ip_args".
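
The two renames that have come up in this thread (teach_sparse run as gap_fit, and ip_args written as core_ip_args) can be applied mechanically to an old command. The Python sketch below does only that; it is not a complete migration table, other options may have changed as well, and `modernise_command` is a hypothetical helper name.

```python
import re


def modernise_command(command: str) -> str:
    """Apply the two renames mentioned in this thread to an old command string."""
    # Rename the executable only when it is the first word (an IPython '!' prefix is kept).
    command = re.sub(r"^(\s*!?)teach_sparse\b", r"\1gap_fit", command)
    # Rename ip_args= only when it is a whole key, so core_ip_args= is left alone.
    command = re.sub(r"(?<!\w)ip_args=", "core_ip_args=", command)
    return command


old = "teach_sparse at_file=train.xyz ip_args={IP LJ} e0=0.0"
print(modernise_command(old))
# gap_fit at_file=train.xyz core_ip_args={IP LJ} e0=0.0
```

The lookbehind keeps the function idempotent, so running it on an already updated command leaves it unchanged.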

A single frame of your training data doesn't help: ONE of your frames triggered the error you reported, but we don't know which. Can you send your entire training set (or a subset of it that triggers the error about the atom not being found)?
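
One way to produce such a subset is to split the training file into frames and bisect. The Python sketch below assumes a plain multi-frame XYZ layout (atom count, comment line, then that many atom lines) and assumes that any subset containing the offending frame fails on its own. The `fails` callable is a placeholder that the user has to supply, for example a wrapper that writes the subset to a file, runs the fit, and looks for the "Cannot find pair" message; nothing here calls QUIP.

```python
from typing import Callable, Sequence


def split_xyz_frames(text: str) -> list[str]:
    """Split concatenated XYZ frames into one string per frame."""
    lines = text.splitlines()
    frames, i = [], 0
    while i < len(lines):
        if not lines[i].strip():  # skip blank lines between frames
            i += 1
            continue
        n_atoms = int(lines[i])
        frames.append("\n".join(lines[i:i + n_atoms + 2]) + "\n")
        i += n_atoms + 2
    return frames


def find_failing_frame(frames: Sequence[str],
                       fails: Callable[[Sequence[str]], bool]) -> int:
    """Return the index of one frame that makes `fails` true, using bisection."""
    lo, hi = 0, len(frames)
    if not fails(frames[lo:hi]):
        raise ValueError("the full set does not trigger the failure")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(frames[lo:mid]):
            hi = mid  # a failing frame is in the first half
        else:
            lo = mid  # otherwise it must be in the second half
    return lo


# Stand-in example: frames are labelled strings and 'fails' just looks for "bad".
demo = ["ok"] * 5 + ["bad"] + ["ok"] * 2
print(find_failing_frame(demo, lambda subset: "bad" in subset))  # 5
```

For a set of N frames this needs about log2(N) trial runs, which is usually far cheaper than testing frames one at a time.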

@gabor1
Contributor

gabor1 commented Dec 17, 2023

The simple training with the distance_Nb descriptor works because it does not try to find methane molecules.

@OlgaChalykh
Author
