Sparse configurations in the index file and specifying custom sparse configurations #378

salmanak31 · 2021-12-22T22:29:53Z

Hi,

I have been working through a few tutorials from the QUIP/GAP documentation along with some examples of my own, and I am having trouble understanding the format of the generated index file. I ran a short MD simulation (ca. 10 ps) of a gas-phase Al5 cluster and selected 86 random frames from the trajectory (train.xyz attached). I understand that 86 structures are not enough to get an accurate force field, but I am only experimenting with the code at the moment. I use the following command to fit a SOAP GAP potential.

gap_fit at_file=train.xyz \
gap={ \
    soap \
    cutoff=4.5 \
    l_max=8 \
    sparse_method=cur_points \
    atom_sigma=0.5 \
    delta=0.2 \
    n_max=8 \
    cutoff_transition_width=0.5 \
    zeta=2 \
    n_sparse=100 \
    covariance_type=dot_product \
    add_species \
  } \
  force_parameter_name=forces \
  energy_parameter_name=energy \
  default_sigma={0.001 0.001 0.001 0} \
  e0={Al:-2.1} \
  gp_file=gap_2b3bsoap.xml 2>&1

I get the fitted potential along with an index file, attached here (train.xyz.idx). As I understand it, the first line of the index file is the total number of configurations, which is 86 in this case, and that makes sense. What do the numbers in the first column of the following lines correspond to? All of them are greater than the total number of atoms in the training file, i.e. (total number of configurations) × (atoms per configuration) = 86 × 5 = 430. From the description in the documentation, I thought that each number should be the index of an atom's position in the training file. For example, hypothetically, if there were two structures with 5 Al atoms each in the training file as follows:

5
Lattice="20.0 0.0 0.0 0.0 20.0 0.0 0.0 0.0 20.0" Properties=species:S:1:pos:R:3:forces:R:3 energy=-10.25134266 stress="7.913609420707385e-05 -4.472665439607943e-05 -3.0396149443051467e-06 -4.472665439607943e-05 0.0004114465230873502 -4.484524306947121e-05 -3.0396149443051467e-06 -4.484524306947121e-05 0.00030529717738345374" free_energy=-10.27955223 pbc="T T T"
Al 6.37574000 3.88201000 10.00810000 -0.17088500 0.42584600 -0.14688600
Al 6.22956000 4.68964000 7.48749000 -0.08112800 0.53238800 0.34919500
Al 6.28245000 6.53920000 11.66315000 -0.09798200 -0.24713600 -0.38576900
Al 5.43052000 7.49000000 7.89799000 0.00224300 -1.00238500 0.44014900
Al 4.53842000 5.78057000 9.96273000 0.34694800 0.29356700 -0.24343800
5
Lattice="20.0 0.0 0.0 0.0 20.0 0.0 0.0 0.0 20.0" Properties=species:S:1:pos:R:3:forces:R:3 energy=-10.22387881 stress="9.62565537393716e-05 -8.952820690166946e-05 0.00024887393488546905 -8.952820690166946e-05 -0.00022760911329358475 1.722656518743779e-06 0.00024887393488546905 1.722656518743779e-06 -3.283657951127182e-05" free_energy=-10.26140093 pbc="T T T"
Al 6.54688000 3.49654000 9.41519000 -0.12232200 -1.03544000 0.19844600
Al 6.76057000 5.77495000 8.89815000 0.02836500 1.14267500 -1.39625500
Al 6.04690000 5.19773000 11.27752000 -0.35544600 0.22112400 0.44572200
Al 4.55515000 6.58506000 7.38400000 0.60166800 -0.14145400 0.65389200
Al 4.94719000 7.32712000 10.04459000 -0.15887600 -0.18309200 0.10516900

then I would expect a value of 7 in the index file to correspond to an Al atom in the 2nd structure (marked in bold in the original post). Could you please clarify this?
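To make the hypothesised mapping concrete, here is a minimal shell sketch. It assumes 1-based atom indices that run consecutively through the training file and a fixed number of atoms per frame. The function name `global_to_frame` is made up for illustration, and nothing here is confirmed to match how gap_fit actually numbers entries in the files it writes.

```shell
#!/usr/bin/env bash
# Map a global, 1-based atom index onto (frame, atom-within-frame),
# assuming every frame holds the same number of atoms.
# NOTE: this only encodes the interpretation hypothesised in the post;
# it is not confirmed to be the numbering gap_fit uses.
global_to_frame() {
    local index=$1 atoms_per_frame=$2
    local frame=$(( (index - 1) / atoms_per_frame + 1 ))
    local atom=$(( (index - 1) % atoms_per_frame + 1 ))
    echo "$frame $atom"
}

# Largest index possible for 86 frames of 5 atoms under this scheme.
max_index=$(( 86 * 5 ))
echo "largest valid index: $max_index"

# Index 7 with 5 atoms per frame: prints "frame atom".
global_to_frame 7 5
```

Under these assumptions any index above 430 cannot refer to an atom in train.xyz, which is why the values in train.xyz.idx look inconsistent with this reading.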

As a follow-up question: how do we specify custom configurations for training using an index file? If I take the index file obtained from the fit with sparse_method=cur_points and start a fresh fit with sparse_method=INDEX_FILE and sparse_file=train.xyz.idx, I get the following error:

================================= End of report ================================

Started reading sparse indices from file train.xyz.idx
Finished reading sparse indices from file, 88 of them.

Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

Backtrace for this error:
#0  0x2abbb8867dfd in ???
#1  0x2abbb8867013 in ???
#2  0x2abbb98ad26f in ???
#3  0x42ac10 in ???
#4  0x4314c1 in ???
#5  0x41ffb2 in ???
#6  0x4057fc in ???
#7  0x4051ee in ???
#8  0x2abbb9899c04 in ???
#9  0x40521e in ???
#10  0xffffffffffffffff in ???

I am interested in specifying my own configurations because I am trying to develop a force field for a Pt dimer in a silica matrix. The problem is that my configurations only have 2 Pt atoms per structure while there are 36 Si atoms and 72 O atoms per structure. I am interested in modeling the dynamics of the Pt dimer in the silica matrix. Will it be ok to use sparse_method=cur_points? As far as I understand, the probability of choosing Pt configurations will be quite small compared to O and Si configurations given this setup if I use one of the predefined methods. And it might be better to specify the configurations by hand and choose a large number of Pt configurations. I will be grateful for any tips and pointers. Thank you.

files.zip

bernstei · 2021-12-22T22:37:59Z

Collaborator

On Dec 22, 2021, at 5:30 PM, salmanak31 wrote:

> I get the fit potential along with an index file, attached here (train.xyz.idx).

This is not the sparse point index file. I believe you need to add "print_sparse_index=filename" to the descriptor string (i.e. inside the gap={} section) to generate that file.

> I am interested in specifying my own configurations because I am trying to develop a force field for a Pt dimer in a silica matrix. The problem is that my configurations only have 2 Pt atoms per structure while there are 36 Si atoms and 72 O atoms per structure. I am interested in modeling the dynamics of the Pt dimer in the silica matrix. Will it be ok to use sparse_method=cur_points? As far as I understand, the probability of choosing Pt configurations will be quite small compared to O and Si configurations given this setup if I use one of the predefined methods. And it might be better to specify the configurations by hand and choose a large number of Pt configurations. I will be grateful for any tips and pointers. Thank you.

The number of sparse points is interpreted as per Z-center (by default equal for all species, and you might need to turn off add_species and explicitly include a separate descriptor for each Z-center to vary that). As a result, the Pt will not be neglected.
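The two suggestions above might be combined roughly as follows. This is a sketch under stated assumptions rather than a tested recipe. It only builds and prints a gap={} string; it does not run gap_fit. The `print_sparse_index` option is the one named in the reply. The `Z`, `n_species` and `species_Z` keywords for pinning a SOAP descriptor to one central species, the `n_sparse` values (200 for Pt, 100 for Si and O), the output file names and the helper `soap_for` are illustrative assumptions and should be checked against `gap_fit --help`.

```shell
#!/usr/bin/env bash
# Build (but do not run) a gap={} string with one SOAP descriptor per
# central species, so that n_sparse can differ between Pt, Si and O.
# ASSUMPTIONS: the Z / n_species / species_Z keywords, the n_sparse
# values and the .idx file names are illustrative only.

species_z="8 14 78"   # atomic numbers of O, Si, Pt

# Emit one species-specific SOAP descriptor string.
soap_for() {
    local z=$1 n_sparse=$2
    echo "soap cutoff=4.5 cutoff_transition_width=0.5 l_max=8 n_max=8" \
         "atom_sigma=0.5 zeta=2 delta=0.2 covariance_type=dot_product" \
         "sparse_method=cur_points n_sparse=${n_sparse} add_species=F" \
         "Z=${z} n_species=3 species_Z={${species_z}}" \
         "print_sparse_index=sparse_Z${z}.idx"
}

# Descriptors are joined with " : " inside gap={ ... }.
gap_string="$(soap_for 78 200) : $(soap_for 14 100) : $(soap_for 8 100)"

echo "gap={ ${gap_string} }"
```

The printed string would replace the gap={ ... } section of the original command; the remaining options (at_file, default_sigma, e0, gp_file and so on) would stay as they were, with e0 extended to cover all three species.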
