Quick tutorial
Introduction
This section demonstrates SIMPLE-NN through tutorials.
Example files are in SIMPLE-NN/tutorials/.
In this example, snapshots from a 500 K MD trajectory of
amorphous SiO2 (72 atoms) are used as the training set.
To run SIMPLE-NN, type the following command:
python run.py
If you have installed mpi4py, MPI parallelization provides an additional speed gain in preprocessing (generate_features and preprocess in input.yaml).
mpirun -np $numproc python run.py
where numproc stands for the number of CPU processors.
Note
In this example, all paths in the *_list files, such as train_list and valid_list, are written as relative paths.
Therefore, after the first example (Preprocess), you should either copy the data directory into each example directory or change the paths accordingly.
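If you prefer to change the paths rather than copy the data directory, the rewrite can be scripted. The following is a minimal sketch, assuming each non-empty line of a *_list file holds a single path; the helper name absolutize_list is hypothetical and not part of SIMPLE-NN, and the demo runs on a temporary stand-in file.

```python
import os
import tempfile


def absolutize_list(list_path, base_dir):
    """Rewrite every path in a *_list file as an absolute path.

    Assumption: each non-empty line is one path, given relative to base_dir.
    """
    base_dir = os.path.abspath(base_dir)
    with open(list_path) as f:
        entries = [line.strip() for line in f if line.strip()]
    resolved = [os.path.normpath(os.path.join(base_dir, p)) for p in entries]
    with open(list_path, "w") as f:
        f.write("\n".join(resolved) + "\n")
    return resolved


# Demo on a temporary stand-in for train_list.
with tempfile.TemporaryDirectory() as tmp:
    demo_list = os.path.join(tmp, "train_list")
    with open(demo_list, "w") as f:
        f.write("./data/data1.pt\n./data/data2.pt\n")
    demo_paths = absolutize_list(demo_list, base_dir=tmp)

for path in demo_paths:
    print(path)
```

With absolute paths, the same list file can be reused from any example directory.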
Preprocess
To preprocess the ab initio calculation results into a training dataset for the NNP,
you need three types of input files (input.yaml, structure_list, and params_XX).
The example files except params_Si and params_O are introduced below.
Details of params_Si and params_O can be found in the params_XX section.
In this example, the 70 symmetry functions consist of 8 radial symmetry functions per 2-body combination
and 18 angular symmetry functions per 3-body combination.
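The total of 70 can be checked with a short calculation. This sketch assumes that, for a given central atom, a 2-body combination is one neighbor element and a 3-body combination is an unordered pair of neighbor elements; that counting convention is an interpretation of the text, not something taken from the SIMPLE-NN source.

```python
from itertools import combinations_with_replacement

elements = ["Si", "O"]
radial_per_pair = 8        # radial functions per 2-body combination (from the text)
angular_per_triplet = 18   # angular functions per 3-body combination (from the text)

# Assumed convention: combinations are counted around one central atom.
two_body = list(elements)                                        # Si, O
three_body = list(combinations_with_replacement(elements, 2))    # Si-Si, Si-O, O-O

n_features = (radial_per_pair * len(two_body)
              + angular_per_triplet * len(three_body))
print(n_features)  # 70
```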
Input files introduced in this section can be found in SIMPLE-NN/tutorials/Preprocess.
# input.yaml
generate_features: True
preprocess: True
train_model: False
params:
    Si: params_Si
    O: params_O

preprocessing:
    valid_rate: 0.1
    calc_scale: True
    calc_pca: True
# structure_list
../ab_initio_output/OUTCAR_comp ::10
With this input file, SIMPLE-NN calculates feature vectors and their derivatives (generate_features) and
generates the training/validation datasets (preprocess).
A sample VASP OUTCAR file (compressed to reduce the file size) is in SIMPLE-NN/tutorials/ab_initio_output.
Snapshots are sampled from the MD trajectory at intervals of 10 MD steps (20 fs).
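The ::10 after the OUTCAR path reads like a start:stop:step selection. The sketch below only illustrates that idea with Python slice semantics; whether SIMPLE-NN counts snapshots from 0 or 1, and how it treats the end index, is not stated here, so the function select_snapshots is a hypothetical illustration. The 2 fs time step is inferred from "10 MD steps (20 fs)".

```python
def select_snapshots(n_snapshots, index="::10"):
    """Return the snapshot indices picked by a 'start:stop:step' string.

    Assumption: the string behaves like a Python slice over 0-based indices.
    """
    parts = index.split(":")
    parts += [""] * (3 - len(parts))          # pad missing fields
    start, stop, step = (int(p) if p else None for p in parts)
    return list(range(n_snapshots))[start:stop:step]


picked = select_snapshots(35)
print(picked)                     # [0, 10, 20, 30]

timestep_fs = 2                   # inferred: 20 fs / 10 MD steps
print([i * timestep_fs for i in picked])  # [0, 20, 40, 60]
```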
Output files are provided in SIMPLE-NN/tutorials/Preprocess_answer, except for the data directory, which is omitted because of its large size.
The data directory contains the preprocessed ab initio calculation results in binary format, named data1.pt, data2.pt, and so on.
To see which data are saved in a .pt file, use the following commands:
import torch
result = torch.load('data1.pt')
result holds the input feature information as a dictionary.
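Printing the whole dictionary can be overwhelming when it holds large arrays. The helper below is a minimal sketch that lists each key with a short description of its value; summarize is a hypothetical name, and the keys in the stand-in dictionary are placeholders, not the actual keys stored in data1.pt.

```python
def summarize(data):
    """Map each key of a dictionary to a short description of its value."""
    summary = {}
    for key, value in data.items():
        shape = getattr(value, "shape", None)   # tensors and arrays expose .shape
        if shape is not None:
            summary[key] = f"{type(value).__name__} with shape {tuple(shape)}"
        elif isinstance(value, dict):
            summary[key] = f"dict with keys {list(value)}"
        elif isinstance(value, (list, tuple)):
            summary[key] = f"{type(value).__name__} of length {len(value)}"
        else:
            summary[key] = repr(value)
    return summary


# Stand-in dictionary with placeholder keys; real files may differ.
example = {"E": 0.0, "N": {"Si": 24, "O": 48}, "atom_idx": [1, 2, 2]}
for key, description in summarize(example).items():
    print(f"{key}: {description}")

# With PyTorch installed, the same helper can be applied to a real file:
#   import torch
#   print(summarize(torch.load('data1.pt')))
```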
Warning
We strongly recommend turning on the calc_pca and calc_scale options in preprocessing. They significantly reduce the root-mean-square error (RMSE) in training.
Training
To train the NNP with the preprocessed dataset, you need to prepare input.yaml, train_list, valid_list, scale_factor, and pca. The last two files greatly improve loss convergence and training quality.
# input.yaml
generate_features: False
preprocess: False
train_model: True
params:
    Si: params_Si
    O: params_O

neural_network:
    nodes: 30-30
    batch_size: 8
    optimizer:
        method: Adam
    total_epoch: 100
    learning_rate: 0.001
    use_scale: True
    use_pca: True
With this input file, SIMPLE-NN optimizes the neural network (train_model).
The paths of the training and validation datasets should be written in train_list and valid_list, respectively.
The 70-30-30-1 network is optimized by the Adam optimizer with a learning rate of 0.001 and a batch size of 8 for 100 epochs.
The input feature vectors, which have a size of 70, are scaled by scale_factor and then transformed by the PCA matrix in pca.
The execution log and the energy, force, and stress root-mean-square errors (RMSE) are stored in LOG.
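To see what the 70-30-30-1 notation implies, the layer sizes can be derived from the nodes string and the number of trainable parameters counted. This is a minimal sketch, assuming fully connected layers with bias terms and a single output node; the function names are hypothetical, and the count is for one network only.

```python
def layer_sizes(n_input, nodes, n_output=1):
    """Expand a nodes string such as '30-30' into a full list of layer sizes."""
    return [n_input] + [int(n) for n in nodes.split("-")] + [n_output]


def count_parameters(sizes):
    """Count weights and biases of a dense network (assumed architecture)."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


sizes = layer_sizes(70, "30-30")
print(sizes)                    # [70, 30, 30, 1]
print(count_parameters(sizes))  # 3091
```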
Input files introduced in this section can be found in SIMPLE-NN/tutorials/Training.
Evaluation
To evaluate the training quality of the neural network, test_list and the result of training (checkpoint.pth.tar or potential_saved) should be prepared.
test_list contains the paths of the test set preprocessed in .pt format. Data in .pt format can be generated as described in the Preprocess section. Note that you should set train_list to test_list with a valid_rate of 0.0. SIMPLE-NN will then write the paths of all preprocessed data to test_list.
# input.yaml
generate_features: True
preprocess: True
train_model: False
params:
    Si: params_Si
    O: params_O

preprocessing:
    train_list: 'test_list'
    valid_rate: 0.0
    calc_scale: False
    calc_pca: False
    calc_atomic_weights: False
In this example, for simplicity, test_list is made by concatenating train_list and valid_list from the Training example.
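The concatenation can be done by hand or with a few lines of Python. The following is a minimal sketch, assuming each non-empty line of the list files is one entry; concatenate_lists is a hypothetical helper, and the demo uses temporary stand-in files rather than the tutorial data.

```python
import os
import tempfile


def concatenate_lists(sources, destination):
    """Write the non-empty lines of all source list files into destination."""
    entries = []
    for source in sources:
        with open(source) as f:
            entries.extend(line.strip() for line in f if line.strip())
    with open(destination, "w") as f:
        f.write("\n".join(entries) + "\n")
    return entries


# Demo with temporary stand-ins for train_list and valid_list.
with tempfile.TemporaryDirectory() as tmp:
    train = os.path.join(tmp, "train_list")
    valid = os.path.join(tmp, "valid_list")
    with open(train, "w") as f:
        f.write("./data/data1.pt\n./data/data2.pt\n")
    with open(valid, "w") as f:
        f.write("./data/data3.pt\n")
    test_entries = concatenate_lists([train, valid], os.path.join(tmp, "test_list"))

print(test_entries)
```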
Set continue in input.yaml to the result of training: the checkpoint file name (such as checkpoint_*.tar) for a PyTorch checkpoint, or weights for a LAMMPS potential.
# input.yaml
generate_features: False
preprocess: False
train_model: True
params:
    Si: params_Si
    O: params_O

neural_network:
    train: False
    test: True
    continue: checkpoint_bestmodel.pth.tar
Input files introduced in this section can be found in SIMPLE-NN/tutorials/Evaluation.
Note
If you use a LAMMPS potential (potential_saved), you need to copy the pca and scale_factor files and rename the potential file to potential_saved.
After running SIMPLE-NN with the settings above, an output file named test_result is generated.
The file is in pickle format and can be opened with the following Python code:
import torch
result = torch.load('test_result')
The file contains the DFT energies/forces and the NNP energies/forces.
We also provide Python code (correlation.py) that makes parity plots from test_result.
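Besides parity plots, a single RMSE value is a common way to compare the DFT and NNP values. The sketch below shows the calculation on placeholder numbers; how the values are keyed inside test_result is not described here, so the two lists are stand-ins that you would replace with data extracted from the loaded file.

```python
import math


def rmse(reference, predicted):
    """Root-mean-square error between two equally long sequences of numbers."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Placeholder values, not taken from the tutorial output.
dft_values = [1.0, 2.0, 3.0, 4.0]
nnp_values = [2.0, 1.0, 4.0, 3.0]
print(rmse(dft_values, nnp_values))  # 1.0
```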
Molecular dynamics
Note
You have to compile LAMMPS with pair_nn.cpp, pair_nn.h, and symmetry_function.h to run molecular dynamics simulations.
To run an MD simulation with LAMMPS, add the following lines to the LAMMPS script file.
# lammps.in
units metal
pair_style nn
pair_coeff * * /path/to/potential_saved_bestmodel Si O
Warning
This pair_style requires the newton setting to be on (default) for pair interactions.
An example input script for an NVT MD simulation at 300 K is provided in SIMPLE-NN/tutorials/Molecular_dynamics.
Run LAMMPS via the following command.
/path/to/lammps/src/lmp_mpi < lammps.in
You can also run LAMMPS with the mpirun command if a multi-core CPU is available.
mpirun -np $numproc /path/to/lammps/src/lmp_mpi < lammps.in
Output files can be found in SIMPLE-NN/tutorials/Molecular_dynamics_answer.