NASICON stability predictor

In our work on NASICON stability prediction, we used SIS+MLR to identify the best feature for predicting NASICON stability from a set of millions of candidate features. This repo hosts the data and code needed to compute the best feature and to reproduce the machine-learning parts of our work.

Compatibility map

To display the stabilities of NASICON compounds, we created an interactive compatibility map in Stability.html (Fig. 2a).

The raw data (i.e., E_hull values) behind this interactive map can be found in ./RawData/test_2D.csv and ./RawData/train_2D.csv.
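
To inspect the E_hull values outside the interactive map, the CSV files can be read directly. The following is a minimal sketch, assuming the files have a header row and a column named `E_hull`; the actual column names are not documented here, so check the header first. The helper names `load_ehull` and `summarize` are hypothetical and are not part of this repo.

```python
import csv


def load_ehull(csv_path, column="E_hull"):
    """Return one numeric column of a CSV file as a list of floats.

    The default column name "E_hull" is an assumption; pass the real
    name if the file header differs.
    """
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        if reader.fieldnames is None or column not in reader.fieldnames:
            raise KeyError(
                f"column {column!r} not found; available: {reader.fieldnames}"
            )
        return [float(row[column]) for row in reader]


def summarize(values):
    """Return the count, minimum and maximum of a non-empty list of numbers."""
    return {"count": len(values), "min": min(values), "max": max(values)}


if __name__ == "__main__":
    # Path taken from the text above; run from the repository root.
    train_values = load_ehull("./RawData/train_2D.csv")
    print(summarize(train_values))
```

If the column is missing, the `KeyError` message lists the available header names, which is a quick way to find the right one.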

Data files

All raw data and processed datasets for machine learning are stored in the folder ./RawData.

Input data files

KeyFeatureNames.txt: Names of the basic features used in the SIS procedure (see the SI for details; a link to the SI on the publisher's website will be added when available).
exp_comps.json: Compositions of the experimentally synthesized materials in Fig. 5b.

ML-related datasets

train.csv/test.csv: values of E_hull and all basic features for the 80-20 train/test split used for model selection. The optimal 2D SIS features are selected from 1,999,000 SIS+MLR models based on these data.
train_2D.csv/test_2D.csv: values of E_hull and the optimal 2D SIS features of train/test data in train.csv/test.csv.
train_X_fold[1-5].dat/test_X_fold[1-5].dat: values of 2D SIS features of train/test data for the five-fold cross-validation to evaluate the final model.
train_Y_fold[1-5].dat/test_Y_fold[1-5].dat: 0/1 encodings of synthesizability of train/test data for the five-fold cross-validation to evaluate the final model.
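
The description above does not say how the 1,999,000 candidate SIS+MLR models arise. One reading, offered purely as an assumption, is that each 2D model pairs two features from a screened pool of 2,000, since the number of unordered pairs from 2,000 items is exactly 1,999,000. The sketch below only illustrates that counting argument; the function names and the placeholder feature names are hypothetical.

```python
from itertools import combinations
from math import comb


def count_feature_pairs(n_features):
    """Number of unordered pairs that can be formed from n_features items."""
    return comb(n_features, 2)


def enumerate_feature_pairs(feature_names):
    """List every unordered pair of feature names, each pair appearing once."""
    return list(combinations(feature_names, 2))


if __name__ == "__main__":
    # Assumed pool size of 2,000; not stated in this repo.
    print(count_feature_pairs(2000))
    # Placeholder names, for illustration only.
    print(enumerate_feature_pairs(["f1", "f2", "f3"]))
```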

Scripts for reproducing results in our paper

The scripts for reproducing the machine-learning model, metrics, and all figures in our manuscript can be found in the ./Script folder.

Before running the Jupyter notebooks, make sure you have all dependencies installed:

pip install -r requirements.txt

Run_preprocess_feature_transformation.ipynb: preprocess the data by transforming the basic features into 2D SIS features.
Run_Ranked_SVM.ipynb: train a ranked SVM model to predict the ranking of synthesizability (E_hull values).
Run_five_fold_CV.ipynb: run five-fold cross-validation to evaluate how effectively a linear decision boundary separates synthetically accessible from non-accessible NASICON compositions.
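
The idea behind the five-fold evaluation can be sketched without the notebooks. The example below is an illustration under stated assumptions, not the notebook's code: a simple perceptron stands in for the linear classifier (the model actually used in Run_five_fold_CV.ipynb is not described here), the fold split is generated on the fly rather than read from the fold files in ./RawData, and the toy data are synthetic. All function names are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def predict(w, b, x):
    """Return 1 if x lies on the positive side of the line w.x + b = 0, else 0."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b > 0 else 0


def train_perceptron(X, y, epochs=50):
    """Fit a linear decision boundary to 0/1 labels with the perceptron rule."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = yi - predict(w, b, xi)  # +1, -1, or 0 when already correct
            if err != 0:
                w = [wj + err * xj for wj, xj in zip(w, xi)]
                b += err
    return w, b


def cross_validate(X, y, k=5, seed=0):
    """Return the held-out accuracy of the linear boundary for each fold."""
    scores = []
    for test_idx in k_fold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        w, b = train_perceptron(
            [X[i] for i in train_idx], [y[i] for i in train_idx]
        )
        correct = sum(predict(w, b, X[i]) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return scores


if __name__ == "__main__":
    # Synthetic 2D points labelled by the side of the line x1 + x2 = 0.
    X = [(a, c) for a in range(-3, 4) for c in range(-3, 4) if a + c != 0]
    y = [1 if a + c > 0 else 0 for a, c in X]
    print(cross_validate(X, y))
```

High and consistent accuracy across folds would indicate that a single straight line in the 2D feature space separates the two classes well, which is the question the notebook is described as addressing.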

Figures

All high-resolution figures related to the machine-learning model in our manuscript can be found in the Figures folder.

Citing

If you find this repo useful in your own projects, please consider citing our paper:

@article{ouyang2021synthetic,
  title={Synthetic accessibility and stability rules of NASICONs},
  author={Ouyang, Bin and Wang, Jingyang and He, Tanjin and Bartel, Christopher J and Huo, Haoyan and Wang, Yan and Lacivita, Valentina and Kim, Haegyeom and Ceder, Gerbrand},
  journal={arXiv preprint arXiv:2102.03627},
  year={2021}
}
