MPI running on gap_fitting #602
Comments
It looks like your calculation did not finish. Was it killed by the queuing system? (When you want to quote raw text in GitHub issues, you should put three backticks on the lines above and below the quoted text, so that it is not parsed as Markdown.)
Hi,
libAtoms::Hello World: 2023-07-05 15:24:01
Calls to system_timer will do nothing by default
MPI hostnames :: arc-c009
================================ Input parameters ==============================
config_file =
======================================== ======================================
============== Gaussian Approximation Potentials - Database fitting ============
Initial parsing of command line arguments finished.
In the latest version we no longer require this two-step fitting. Can I close this?
Hi GAP developers,
I am currently trying to run gap_fit with MPI on an HPC system. The installation completed successfully.
For the usage, I followed these instructions:
Use sparsify_only_no_fit=T to just create the sparseX files
Optionally rename them to something shorter (e.g. 1.input, 2.input etc.)
For more than one species use add_species=F and explicit input
Run with mpirun -np … (or srun for Slurm)
MPI can trip up on the ….xyz.idx file. Usually works if it's there from the start, so retry can help.
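The optional renaming step in the instructions above could be scripted roughly as follows. This is only a minimal sketch: the file names in `example` are hypothetical (the actual sparseX file names written by gap_fit are not shown in this thread), and `plan_short_names` and `apply_plan` are illustrative helpers, not part of QUIP.

```python
from pathlib import Path


def plan_short_names(sparse_files):
    """Map each sparse-point file to a short name: 1.input, 2.input, ..."""
    ordered = sorted(sparse_files)  # fixed order so the mapping is reproducible
    return {name: f"{i}.input" for i, name in enumerate(ordered, start=1)}


def apply_plan(directory, plan):
    """Rename the files inside `directory` according to `plan`."""
    base = Path(directory)
    for old, new in plan.items():
        (base / old).rename(base / new)


# Hypothetical file names, for illustration only.
example = ["gap.xml.sparseX.GAP_b", "gap.xml.sparseX.GAP_a"]
plan = plan_short_names(example)
for old, new in plan.items():
    print(f"{old} -> {new}")
```

Keeping the printed mapping is worthwhile, because each short name still has to be matched to the descriptor it belongs to when the files are passed back in as explicit input.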
I simply added sparsify_only_no_fit=T to the end of my original command, which is shown below.
gap_fit at_file=CN_train_dataset.xyz e0={C:-148.6822264562:N:-271.4749770587} gap={distance_2b n_sparse=15 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=4.5 delta=2.0:angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05:soap n_max=12 l_max=4 atom_sigma=0.5 zeta=4.0 cutoff=4.5 cutoff_transition_width=1.0 central_weight=1.0 n_sparse=10000 delta=0.2 covariance_type=dot_product sparse_method=cur_points radial_decay=-0.5} default_sigma={0.001 0.01 0.05 0.0} config_type_sigma={Liquid:0.050:0.5:0.5:0.0:Liquid_Interface:0.050:0.5:0.5:0.0:Amorphous_Bulk:0.005:0.2:0.2:0.0:Amorphous_Surfaces:0.005:0.2:0.2:0.0:Surfaces:0.002:0.1:0.2:0.0:Dimer:0.002:0.1:0.2:0.0:Fullerenes:0.002:0.1:0.2:0.0:Defects:0.001:0.01:0.05:0.0:Crystalline_Bulk:0.001:0.01:0.05:0.0:Nanotubes:0.001:0.01:0.05:0.0:Graphite:0.001:0.01:0.05:0.0:Diamond:0.001:0.01:0.05:0.0:Graphene:0.001:0.01:0.05:0.0:Graphite_Layer_Sep:0.001:0.01:0.05:0.0:Single_Atom:0.0001:0.001:0.05:0.0} energy_parameter_name=energy force_parameter_name=force sparse_jitter=1.0e-8 do_copy_at_file=F openmp_chunk_size=10000 gp_file=gap.xml sparsify_only_no_fit=T | tee out
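The config_type_sigma argument in the gap_fit command above packs a name followed by four sigma values (energy, force, virial, hessian) per configuration type, all separated by colons, which makes it easy to drop or misplace a field. A minimal sketch of a sanity check before submitting the job might look like this; `parse_config_type_sigma` is a hypothetical helper written for illustration and is not part of QUIP.

```python
def parse_config_type_sigma(spec):
    """Split 'Name:e:f:v:h:Name:e:f:v:h:...' into {name: (e, f, v, h)}."""
    fields = spec.split(":")
    if len(fields) % 5 != 0:
        raise ValueError(
            f"expected groups of 5 fields (name + 4 sigmas), got {len(fields)} fields"
        )
    table = {}
    for i in range(0, len(fields), 5):
        name, *values = fields[i:i + 5]
        # float() raises ValueError if a sigma is not a valid number
        table[name] = tuple(float(v) for v in values)
    return table


sigmas = parse_config_type_sigma("Liquid:0.050:0.5:0.5:0.0:Dimer:0.002:0.1:0.2:0.0")
for name, (energy, force, virial, hessian) in sigmas.items():
    print(name, energy, force, virial, hessian)
```

The parsed values can then be compared against the table that gap_fit prints under "Sparse points and target errors per pre-defined types of configurations".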
The output of this gap_fit command is as follows:
libAtoms::Hello World: 2023-06-29 00:41:14
libAtoms::Hello World: git version https://github.com/libAtoms/QUIP,v0.9.12-dirty
libAtoms::Hello World: QUIP_ARCH linux_x86_64_gfortran_openmp
libAtoms::Hello World: compiled on Jan 15 2023 at 16:26:45
libAtoms::Hello World: OpenMP parallelisation with 1 threads
WARNING: libAtoms::Hello World: environment variable OMP_STACKSIZE not set explicitly. The default value - system and compiler dependent - may be too small for some applications.
libAtoms::Hello World: Random Seed = 2474568
libAtoms::Hello World: global verbosity = 0
Calls to system_timer will do nothing by default
================================ Input parameters ==============================
config_file =
atoms_filename = //MANDATORY//
at_file = CN_train_dataset.xyz
gap = "distance_2b n_sparse=15 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=4.5 delta=2.0:angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05:soap n_max=12 l_max=4 atom_sigma=0.5 zeta=4.0 cutoff=4.5 cutoff_transition_width=1.0 central_weight=1.0 n_sparse=10000 delta=0.2 covariance_type=dot_product sparse_method=cur_points radial_decay=-0.5"
e0 = C:-148.6822264562:N:-271.4749770587
local_property0 = 0.0
e0_offset = 0.0
e0_method = isolated
default_kernel_regularisation = //MANDATORY//
default_sigma = "0.001 0.01 0.05 0.0"
default_kernel_regularisation_local_property = 0.001
default_local_property_sigma = 0.001
sparse_jitter = 1.0e-8
hessian_displacement = 1.0e-2
hessian_delta = 1.0e-2
baseline_param_filename = quip_params.xml
core_param_file = quip_params.xml
baseline_ip_args =
core_ip_args =
energy_parameter_name = energy
local_property_parameter_name = local_property
force_parameter_name = force
virial_parameter_name = virial
stress_parameter_name = stress
hessian_parameter_name = hessian
config_type_parameter_name = config_type
kernel_regularisation_parameter_name = sigma
sigma_parameter_name = sigma
force_mask_parameter_name = force_mask
parameter_name_prefix =
config_type_kernel_regularisation =
config_type_sigma = Liquid:0.050:0.5:0.5:0.0:Liquid_Interface:0.050:0.5:0.5:0.0:Amorphous_Bulk:0.005:0.2:0.2:0.0:Amorphous_Surfaces:0.005:0.2:0.2:0.0:Surfaces:0.002:0.1:0.2:0.0:Dimer:0.002:0.1:0.2:0.0:Fullerenes:0.002:0.1:0.2:0.0:Defects:0.001:0.01:0.05:0.0:Crystalline_Bulk:0.001:0.01:0.05:0.0:Nanotubes:0.001:0.01:0.05:0.0:Graphite:0.001:0.01:0.05:0.0:Diamond:0.001:0.01:0.05:0.0:Graphene:0.001:0.01:0.05:0.0:Graphite_Layer_Sep:0.001:0.01:0.05:0.0:Single_Atom:0.0001:0.001:0.05:0.0
kernel_regularisation_is_per_atom = T
sigma_per_atom = T
do_copy_atoms_file = T
do_copy_at_file = F
sparse_separate_file = T
sparse_use_actual_gpcov = F
gap_file = gap_new.xml
gp_file = gap.xml
verbosity = NORMAL
rnd_seed = -1
openmp_chunk_size = 10000
do_ip_timing = F
template_file = template.xyz
sparsify_only_no_fit = T
dryrun = F
condition_number_norm =
linear_system_dump_file =
mpi_blocksize_rows = 0
mpi_blocksize_cols = 100
mpi_print_all = F
======================================== ======================================
WARNING: sparsify_only_no_fit == T: force, virial, hessian, stress parameters are ignored.
============== Gaussian Approximation Potentials - Database fitting ============
Initial parsing of command line arguments finished.
Found 3 GAPs.
Descriptors have been parsed
XYZ file read
Old GAP: {distance_2b n_sparse=15 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=4.5 delta=2.0}
New GAP: {distance_2b n_sparse=15 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=4.5 delta=2.0 Z1=6 Z2=6}
New GAP: {distance_2b n_sparse=15 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=4.5 delta=2.0 Z1=6 Z2=7}
New GAP: {distance_2b n_sparse=15 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=4.5 delta=2.0 Z1=7 Z2=7}
Old GAP: {angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05}
New GAP: {angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=6 Z1=6 Z2=6}
New GAP: {angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=6 Z1=6 Z2=7}
New GAP: {angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=6 Z1=7 Z2=7}
New GAP: {angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=7 Z1=6 Z2=6}
New GAP: {angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=7 Z1=6 Z2=7}
New GAP: {angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=7 Z1=7 Z2=7}
Old GAP: {soap n_max=12 l_max=4 atom_sigma=0.5 zeta=4.0 cutoff=4.5 cutoff_transition_width=1.0 central_weight=1.0 n_sparse=10000 delta=0.2 covariance_type=dot_product sparse_method=cur_points radial_decay=-0.5}
New GAP: {soap n_max=12 l_max=4 atom_sigma=0.5 zeta=4.0 cutoff=4.5 cutoff_transition_width=1.0 central_weight=1.0 n_sparse=10000 delta=0.2 covariance_type=dot_product sparse_method=cur_points radial_decay=-0.5 n_species=2 Z=6 species_Z={6 7 }}
New GAP: {soap n_max=12 l_max=4 atom_sigma=0.5 zeta=4.0 cutoff=4.5 cutoff_transition_width=1.0 central_weight=1.0 n_sparse=10000 delta=0.2 covariance_type=dot_product sparse_method=cur_points radial_decay=-0.5 n_species=2 Z=7 species_Z={6 7 }}
Sparse points and target errors per pre-defined types of configurations
Liquid 0.50000000000000003E-001 0.50000000000000000E+000 0.50000000000000000E+000 0.00000000000000000E+000
Liquid_Interface 0.50000000000000003E-001 0.50000000000000000E+000 0.50000000000000000E+000 0.00000000000000000E+000
Amorphous_Bulk 0.50000000000000001E-002 0.20000000000000001E+000 0.20000000000000001E+000 0.00000000000000000E+000
Amorphous_Surfaces 0.50000000000000001E-002 0.20000000000000001E+000 0.20000000000000001E+000 0.00000000000000000E+000
Surfaces 0.20000000000000000E-002 0.10000000000000001E+000 0.20000000000000001E+000 0.00000000000000000E+000
Dimer 0.20000000000000000E-002 0.10000000000000001E+000 0.20000000000000001E+000 0.00000000000000000E+000
Fullerenes 0.20000000000000000E-002 0.10000000000000001E+000 0.20000000000000001E+000 0.00000000000000000E+000
Defects 0.10000000000000000E-002 0.10000000000000000E-001 0.50000000000000003E-001 0.00000000000000000E+000
Crystalline_Bulk 0.10000000000000000E-002 0.10000000000000000E-001 0.50000000000000003E-001 0.00000000000000000E+000
Nanotubes 0.10000000000000000E-002 0.10000000000000000E-001 0.50000000000000003E-001 0.00000000000000000E+000
Graphite 0.10000000000000000E-002 0.10000000000000000E-001 0.50000000000000003E-001 0.00000000000000000E+000
Diamond 0.10000000000000000E-002 0.10000000000000000E-001 0.50000000000000003E-001 0.00000000000000000E+000
Graphene 0.10000000000000000E-002 0.10000000000000000E-001 0.50000000000000003E-001 0.00000000000000000E+000
Graphite_Layer_Sep 0.10000000000000000E-002 0.10000000000000000E-001 0.50000000000000003E-001 0.00000000000000000E+000
Single_Atom 0.10000000000000000E-003 0.10000000000000000E-002 0.50000000000000003E-001 0.00000000000000000E+000
default 0.10000000000000000E-002 0.10000000000000000E-001 0.50000000000000003E-001 0.00000000000000000E+000
Multispecies support added where requested
===================== Report on number of descriptors found ====================
Descriptor 1: distance_2b n_sparse=15 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=4.5 delta=2.0 Z1=6 Z2=6
Number of descriptors: 7382902
Number of partial derivatives of descriptors: 0
Descriptor 2: distance_2b n_sparse=15 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=4.5 delta=2.0 Z1=6 Z2=7
Number of descriptors: 44
Number of partial derivatives of descriptors: 0
Descriptor 3: distance_2b n_sparse=15 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=4.5 delta=2.0 Z1=7 Z2=7
Number of descriptors: 44
Number of partial derivatives of descriptors: 0
Descriptor 4: angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=6 Z1=6 Z2=6
Number of descriptors: 9039580
Number of partial derivatives of descriptors: 0
Descriptor 5: angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=6 Z1=6 Z2=7
Number of descriptors: 0
Number of partial derivatives of descriptors: 0
Descriptor 6: angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=6 Z1=7 Z2=7
Number of descriptors: 0
Number of partial derivatives of descriptors: 0
Descriptor 7: angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=7 Z1=6 Z2=6
Number of descriptors: 0
Number of partial derivatives of descriptors: 0
Descriptor 8: angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=7 Z1=6 Z2=7
Number of descriptors: 0
Number of partial derivatives of descriptors: 0
Descriptor 9: angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=7 Z1=7 Z2=7
Number of descriptors: 0
Number of partial derivatives of descriptors: 0
Descriptor 10: soap n_max=12 l_max=4 atom_sigma=0.5 zeta=4.0 cutoff=4.5 cutoff_transition_width=1.0 central_weight=1.0 n_sparse=10000 delta=0.2 covariance_type=dot_product sparse_method=cur_points radial_decay=-0.5 n_species=2 Z=6 species_Z={6 7 }
Number of descriptors: 178167
Number of partial derivatives of descriptors: 0
Descriptor 11: soap n_max=12 l_max=4 atom_sigma=0.5 zeta=4.0 cutoff=4.5 cutoff_transition_width=1.0 central_weight=1.0 n_sparse=10000 delta=0.2 covariance_type=dot_product sparse_method=cur_points radial_decay=-0.5 n_species=2 Z=7 species_Z={6 7 }
Number of descriptors: 79
Number of partial derivatives of descriptors: 0
======================================== ======================================
========================= Memory Estimate (per process) ========================
Descriptors
Descriptor 1 :: x 1 7382902 memory 59 MB
Descriptor 1 :: xPrime 1 0 memory 0 B
Descriptor 2 :: x 1 44 memory 352 B
Descriptor 2 :: xPrime 1 0 memory 0 B
Descriptor 3 :: x 1 44 memory 352 B
Descriptor 3 :: xPrime 1 0 memory 0 B
Descriptor 4 :: x 3 9039580 memory 216 MB
Descriptor 4 :: xPrime 3 0 memory 0 B
Descriptor 5 :: x 3 0 memory 0 B
Descriptor 5 :: xPrime 3 0 memory 0 B
Descriptor 6 :: x 3 0 memory 0 B
Descriptor 6 :: xPrime 3 0 memory 0 B
Descriptor 7 :: x 3 0 memory 0 B
Descriptor 7 :: xPrime 3 0 memory 0 B
Descriptor 8 :: x 3 0 memory 0 B
Descriptor 8 :: xPrime 3 0 memory 0 B
Descriptor 9 :: x 3 0 memory 0 B
Descriptor 9 :: xPrime 3 0 memory 0 B
Descriptor 10 :: x 1501 178167 memory 2139 MB
Descriptor 10 :: xPrime 1501 0 memory 0 B
Descriptor 11 :: x 1501 79 memory 948 KB
Descriptor 11 :: xPrime 1501 0 memory 0 B
Subtotal 2416 MB
Covariances
yY 21245 2965 memory 503 MB * 2
yy 21245 21245 memory 3610 MB
A 21245 24210 memory 4114 MB * 2
Subtotal 12 GB
Peak1 2920 MB
Peak2 12 GB
PEAK 12 GB
Free system memory 373 GB
Total system memory 405 GB
======================================== ======================================
========== Report on number of target properties found in training XYZ: ========
Number of target energies (property name: energy) found: 2965
Number of target local_properties (property name: local_property) found: 0
Number of target forces (property name: //IGNORE//) found: 0
Number of target virials (property name: //IGNORE//) found: 0
Number of target Hessian eigenvalues (property name: //IGNORE//) found: 0
================================= End of report ================================
===== Report on per-configuration/per-atom sigma (error parameter) settings ====
Number of per-configuration setting of energy_sigma found: 0
Number of per-configuration setting of force_sigma found: 0
Number of per-configuration setting of virial_sigma found: 0
Number of per-configuration setting of hessian_sigma found: 0
Number of per-configuration setting of local_propery_sigma found:0
Number of per-atom setting of force_atom_sigma found: 0
Number of per-component setting of force_component_sigma found: 0
Number of per-component setting of virial_component_sigma found: 0
================================= End of report ================================
WARNING: gpCoordinates_sparsify: number of data points (0) less than the number of sparse points (200), number of sparse points changed to 0
WARNING: gpCoordinates_sparsify: affected descriptor : angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=6 Z1=6 Z2=7
WARNING: gpCoordinates_sparsify: number of data points (0) less than the number of sparse points (200), number of sparse points changed to 0
WARNING: gpCoordinates_sparsify: affected descriptor : angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=6 Z1=7 Z2=7
WARNING: gpCoordinates_sparsify: number of data points (0) less than the number of sparse points (200), number of sparse points changed to 0
WARNING: gpCoordinates_sparsify: affected descriptor : angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=7 Z1=6 Z2=6
WARNING: gpCoordinates_sparsify: number of data points (0) less than the number of sparse points (200), number of sparse points changed to 0
WARNING: gpCoordinates_sparsify: affected descriptor : angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=7 Z1=6 Z2=7
WARNING: gpCoordinates_sparsify: number of data points (0) less than the number of sparse points (200), number of sparse points changed to 0
WARNING: gpCoordinates_sparsify: affected descriptor : angle_3b n_sparse=200 theta_uniform=1.0 sparse_method=uniform covariance_type=ard_se cutoff=2.5 delta=0.05 Z=7 Z1=7 Z2=7
Started CUR decomposition
However, no *.xml file was generated as expected; only a *.idx file was created.
It would be appreciated if someone could help with this and provide more detailed instructions for running gap_fit with MPI.
Thank you so much in advance.
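For anyone checking whether a run like this might have been killed for exceeding a memory limit, the "Memory Estimate (per process)" figures in the log above can be reproduced by hand. The sketch below assumes 8-byte (double-precision) reals and decimal megabytes rounded down; these assumptions are inferred from the printed numbers, not taken from the QUIP source.

```python
BYTES_PER_REAL = 8  # assumption: double-precision storage


def estimate_bytes(rows, cols):
    """Bytes needed for a dense rows x cols array of reals."""
    return rows * cols * BYTES_PER_REAL


def estimate_mb(rows, cols):
    """Same estimate in decimal megabytes, rounded down."""
    return estimate_bytes(rows, cols) // 10**6


# Dimensions copied from the log above.
soap_c_mb = estimate_mb(1501, 178167)  # Descriptor 10 :: x, log says 2139 MB
yY_mb = estimate_mb(21245, 2965)       # log says 503 MB
yy_mb = estimate_mb(21245, 21245)      # log says 3610 MB
A_mb = estimate_mb(21245, 24210)       # log says 4114 MB

# Covariance subtotal: yY and A are counted twice in the log ("* 2").
cov_bytes = (2 * estimate_bytes(21245, 2965)
             + estimate_bytes(21245, 21245)
             + 2 * estimate_bytes(21245, 24210))
cov_gb = cov_bytes // 10**9            # log says 12 GB

# The row count matches the total number of requested sparse points:
# 3 distance_2b x 15 + 6 angle_3b x 200 + 2 soap x 10000.
requested_sparse = 3 * 15 + 6 * 200 + 2 * 10000

print(soap_c_mb, yY_mb, yy_mb, A_mb, cov_gb, requested_sparse)
```

Under these assumptions the estimated peak of about 12 GB per process is far below the 373 GB of free memory reported, so a per-job memory or wall-time limit set by the queuing system would be the more likely thing to check.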