first commit
forest1040 committed Dec 16, 2024
0 parents commit e11ff40
Showing 5 changed files with 302 additions and 0 deletions.
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2024 TShiotaSS

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
137 changes: 137 additions & 0 deletions README.md
@@ -0,0 +1,137 @@
# MACE_Osaka24 models
This repository provides the models and training scripts for the MACE-Osaka24 models, multi-domain universal machine learning interatomic potentials (MLIPs) capable of accurately describing both crystalline and molecular domains.

The MACE-Osaka24 model is a universal MLIP trained on datasets of both crystals and molecules. These datasets were generated using a dataset integration technique called "Total Energy Alignment", which combines first-principles calculations performed under various conditions.

Its architecture is based on the first-generation MACE model. To use the models, install the [MACE code](https://github.com/ACEsuit/mace).
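Before loading a model, it can help to confirm that the required packages are importable. The snippet below is a minimal sketch that assumes the MACE code is importable as `mace` and ASE as `ase`; the helper name `missing_packages` is hypothetical and not part of either library.

```python
from importlib.util import find_spec


def missing_packages(names=("mace", "ase")):
    """Return the top-level package names from `names` that cannot be imported."""
    # find_spec returns None when a top-level package is not installed.
    return [name for name in names if find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Please install the following packages first:", ", ".join(missing))
else:
    print("MACE and ASE are available.")
```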

## Models

The first-generation models are available from the [MACE_Osaka24 v0.0.1 release](https://github.com/TShiotaSS/mace_osaka24/releases/tag/v0.0.1).

If you use the models, please cite:

```bib
@article{batatia2023foundation,
title={A foundation model for atomistic materials chemistry},
author={Ilyes Batatia and Philipp Benner and Yuan Chiang and Alin M. Elena and Dávid P. Kovács and Janosh Riebesell and Xavier R. Advincula and Mark Asta and William J. Baldwin and Noam Bernstein and Arghya Bhowmik and Samuel M. Blau and Vlad Cărare and James P. Darby and Sandip De and Flaviano Della Pia and Volker L. Deringer and Rokas Elijošius and Zakariya El-Machachi and Edvin Fako and Andrea C. Ferrari and Annalena Genreith-Schriever and Janine George and Rhys E. A. Goodall and Clare P. Grey and Shuang Han and Will Handley and Hendrik H. Heenen and Kersti Hermansson and Christian Holm and Jad Jaafar and Stephan Hofmann and Konstantin S. Jakob and Hyunwook Jung and Venkat Kapil and Aaron D. Kaplan and Nima Karimitari and Namu Kroupa and Jolla Kullgren and Matthew C. Kuner and Domantas Kuryla and Guoda Liepuoniute and Johannes T. Margraf and Ioan-Bogdan Magdău and Angelos Michaelides and J. Harry Moore and Aakash A. Naik and Samuel P. Niblett and Sam Walton Norwood and Niamh O'Neill and Christoph Ortner and Kristin A. Persson and Karsten Reuter and Andrew S. Rosen and Lars L. Schaaf and Christoph Schran and Eric Sivonxay and Tamás K. Stenczel and Viktor Svahn and Christopher Sutton and Cas van der Oord and Eszter Varga-Umbrich and Tejs Vegge and Martin Vondrák and Yangshuai Wang and William C. Witt and Fabian Zills and Gábor Csányi},
year={2023},
eprint={2401.00096},
archivePrefix={arXiv},
primaryClass={physics.chem-ph}
}
@article{deng2023chgnet,
title={CHGNet: Pretrained universal neural network potential for charge-informed atomistic modeling},
author={Bowen Deng and Peichen Zhong and KyuJung Jun and Janosh Riebesell and Kevin Han and Christopher J. Bartel and Gerbrand Ceder},
year={2023},
eprint={2302.14231},
archivePrefix={arXiv},
primaryClass={cond-mat.mtrl-sci}
}
@inproceedings{NEURIPS2022_4a36c3c5,
author = {Batatia, Ilyes and Kovacs, David P and Simm, Gregor and Ortner, Christoph and Csanyi, Gabor},
booktitle = {Advances in Neural Information Processing Systems},
editor = {S. Koyejo and S. Mohamed and A. Agarwal and D. Belgrave and K. Cho and A. Oh},
pages = {11423--11436},
publisher = {Curran Associates, Inc.},
title = {MACE: Higher Order Equivariant Message Passing Neural Networks for Fast and Accurate Force Fields},
url = {https://proceedings.neurips.cc/paper_files/paper/2022/file/4a36c3c51af11ed9f34615b81edb5bbc-Paper-Conference.pdf},
volume = {35},
year = {2022}
}
```

## Training scripts

We provide the training scripts for the models in this repository. The latest training command is in [`mace_osaka24/mace-osaka24-large.sh`](mace_osaka24/mace-osaka24-large.sh).
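The three scripts in `mace_osaka24/` (small, medium, and large) pass the same flags except for `--max_L`, which is 0, 1, and 2 respectively. The sketch below summarizes that relationship; the names `MAX_L_BY_SIZE` and `model_size_flags` are hypothetical, and only the architecture-related subset of the flags is reproduced, so this is an illustration rather than a replacement for the scripts.

```python
# Value of --max_L used by each script in mace_osaka24/.
MAX_L_BY_SIZE = {"small": 0, "medium": 1, "large": 2}


def model_size_flags(size):
    """Return the architecture flags for one model size as a list of strings."""
    if size not in MAX_L_BY_SIZE:
        raise ValueError(f"unknown model size: {size!r}")
    return [
        "--model=ScaleShiftMACE",
        "--num_interactions=2",
        "--correlation=3",
        "--max_ell=3",
        "--r_max=4.5",
        f"--max_L={MAX_L_BY_SIZE[size]}",  # the only flag that differs between sizes
        "--num_channels=128",
        "--num_radial_basis=10",
    ]


for size in ("small", "medium", "large"):
    print(size, " ".join(model_size_flags(size)))
```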

## Training data

The integrated inorganic–organic domain dataset used to train the models is available at [figshare](). It is composed of the inorganic MPtrj dataset and the organic SPICE, QMugs, water clusters, and Tripeptides (OFF23) datasets. If you use any of these datasets, please cite the following papers.

```bib
@article{deng2023chgnet,
title={CHGNet: Pretrained universal neural network potential for charge-informed atomistic modeling},
author={Bowen Deng and Peichen Zhong and KyuJung Jun and Janosh Riebesell and Kevin Han and Christopher J. Bartel and Gerbrand Ceder},
year={2023},
eprint={2302.14231},
archivePrefix={arXiv},
primaryClass={cond-mat.mtrl-sci}
}
@misc{kovacs2023maceoff23,
title={MACE-OFF23: Transferable Machine Learning Force Fields for Organic Molecules},
author={Dávid Péter Kovács and J. Harry Moore and Nicholas J. Browning and Ilyes Batatia and Joshua T. Horton and Venkat Kapil and William C. Witt and Ioan-Bogdan Magdău and Daniel J. Cole and Gábor Csányi},
year={2023},
eprint={2312.15211},
archivePrefix={arXiv},
}
@article{eastman2023spice,
title={Spice, a dataset of drug-like molecules and peptides for training machine learning potentials},
author={Eastman, Peter and Behara, Pavan Kumar and Dotson, David L and Galvelis, Raimondas and Herr, John E and Horton, Josh T and Mao, Yuezhi and Chodera, John D and Pritchard, Benjamin P and Wang, Yuanqing and others},
journal={Scientific Data},
volume={10},
number={1},
pages={11},
year={2023},
publisher={Nature Publishing Group UK London}
}
@article{donchev2021quantum,
title={Quantum chemical benchmark databases of gold-standard dimer interaction energies},
author={Donchev, Alexander G and Taube, Andrew G and Decolvenaere, Elizabeth and Hargus, Cory and McGibbon, Robert T and Law, Ka-Hei and Gregersen, Brent A and Li, Je-Luen and Palmo, Kim and Siva, Karthik and others},
journal={Scientific data},
volume={8},
number={1},
pages={55},
year={2021},
publisher={Nature Publishing Group UK London}
}
@article{isert2022qmugs,
title={QMugs, quantum mechanical properties of drug-like molecules},
author={Isert, Clemens and Atz, Kenneth and Jim{\'e}nez-Luna, Jos{\'e} and Schneider, Gisbert},
journal={Scientific Data},
volume={9},
number={1},
pages={273},
year={2022},
publisher={Nature Publishing Group UK London}
}
```

## Example

In this example, the energies of a silicon crystal and an acetic acid molecule are calculated using the universal multi-domain MLIP MACE-Osaka24 and the Atomic Simulation Environment (ASE).

```python
from ase.build import bulk
from ase.build import molecule
from mace.calculators import MACECalculator

# Crystalline domain: diamond-structure silicon with a lattice constant of 5.43 Å.
si = bulk('Si', 'diamond', a=5.43)
# Replace the placeholder path with the location of the downloaded model file.
calculator = MACECalculator(model_path='/path-to-mace-osaka24/mace-osaka24-large.model', device='cpu')
si.calc = calculator

energy_si = si.get_potential_energy()
print("Single-point energy of diamond Si:", energy_si)

# Molecular domain: acetic acid from ASE's built-in molecule database.
acid = molecule('CH3COOH')
calculator = MACECalculator(model_path='/path-to-mace-osaka24/mace-osaka24-large.model', device='cpu')
acid.calc = calculator

energy_acid = acid.get_potential_energy()
print("Single-point energy of acetic acid:", energy_acid)
```
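The two systems in the example contain different numbers of atoms, so it can be convenient to report the energy per atom alongside the total energy. The helper below is a minimal sketch; the name `energy_per_atom` is hypothetical (it is not part of ASE or MACE), and it assumes only that the object passed in behaves like an ASE `Atoms` object with a calculator attached.

```python
def energy_per_atom(atoms):
    """Return the total potential energy divided by the number of atoms.

    `atoms` is assumed to behave like an ase.Atoms object with a calculator
    attached: it must support len() and get_potential_energy().
    """
    n_atoms = len(atoms)
    if n_atoms == 0:
        raise ValueError("cannot compute a per-atom energy for an empty structure")
    return atoms.get_potential_energy() / n_atoms


# Usage with the objects from the example (requires the model file):
# print("Energy per atom, diamond Si:", energy_per_atom(si))
# print("Energy per atom, acetic acid:", energy_per_atom(acid))
```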

## Contributors
This project was developed by:

- Tomoya Shiota (@TShiotaSS)
- Kenji Ishihara (@kenji-ishihara-os)
- Toshio Mori (@forest1040)
- Wataru Mizukami (@wmizukami)
48 changes: 48 additions & 0 deletions mace_osaka24/mace-osaka24-large.sh
@@ -0,0 +1,48 @@
python3 /opt/src/new_mace_0729/mace/mace/cli/run_train.py \
--name="${MODEL_NAME}" \
--train_file="/dataset/train" \
--valid_file="/dataset/val" \
--test_file="/dataset/test" \
--statistics_file="/dataset/statistics.json" \
--loss='universal' \
--energy_weight=1 \
--forces_weight=10 \
--compute_stress=True \
--stress_weight=100 \
--stress_key='stress' \
--eval_interval=1 \
--error_table='PerAtomMAE' \
--model="ScaleShiftMACE" \
--interaction_first="RealAgnosticResidualInteractionBlock" \
--interaction="RealAgnosticResidualInteractionBlock" \
--num_interactions=2 \
--correlation=3 \
--max_ell=3 \
--r_max=4.5 \
--max_L=2 \
--num_channels=128 \
--num_radial_basis=10 \
--MLP_irreps="16x0e" \
--scaling='rms_forces_scaling' \
--num_workers=64 \
--lr=0.005 \
--weight_decay=1e-8 \
--ema \
--ema_decay=0.995 \
--scheduler_patience=5 \
--batch_size=16 \
--valid_batch_size=32 \
--max_num_epochs=200 \
--patience=50 \
--amsgrad \
--device=cuda \
--seed=1 \
--clip_grad=100 \
--keep_checkpoints \
--save_cpu \
--restart_latest \
--log_dir="/mnt/logs/${MODEL_NAME}" \
--model_dir="/mnt/models/${MODEL_NAME}" \
--checkpoints_dir="/mnt/checkpoints/${MODEL_NAME}" \
--results_dir="/mnt/results/${MODEL_NAME}" \
--distributed >> ${RESULT_FILE}
48 changes: 48 additions & 0 deletions mace_osaka24/mace-osaka24-medium.sh
@@ -0,0 +1,48 @@
python3 /opt/src/new_mace_0729/mace/mace/cli/run_train.py \
--name="${MODEL_NAME}" \
--train_file="/dataset/train" \
--valid_file="/dataset/val" \
--test_file="/dataset/test" \
--statistics_file="/dataset/statistics.json" \
--loss='universal' \
--energy_weight=1 \
--forces_weight=10 \
--compute_stress=True \
--stress_weight=100 \
--stress_key='stress' \
--eval_interval=1 \
--error_table='PerAtomMAE' \
--model="ScaleShiftMACE" \
--interaction_first="RealAgnosticResidualInteractionBlock" \
--interaction="RealAgnosticResidualInteractionBlock" \
--num_interactions=2 \
--correlation=3 \
--max_ell=3 \
--r_max=4.5 \
--max_L=1 \
--num_channels=128 \
--num_radial_basis=10 \
--MLP_irreps="16x0e" \
--scaling='rms_forces_scaling' \
--num_workers=64 \
--lr=0.005 \
--weight_decay=1e-8 \
--ema \
--ema_decay=0.995 \
--scheduler_patience=5 \
--batch_size=16 \
--valid_batch_size=32 \
--max_num_epochs=200 \
--patience=50 \
--amsgrad \
--device=cuda \
--seed=1 \
--clip_grad=100 \
--keep_checkpoints \
--save_cpu \
--restart_latest \
--log_dir="/mnt/logs/${MODEL_NAME}" \
--model_dir="/mnt/models/${MODEL_NAME}" \
--checkpoints_dir="/mnt/checkpoints/${MODEL_NAME}" \
--results_dir="/mnt/results/${MODEL_NAME}" \
--distributed >> ${RESULT_FILE}
48 changes: 48 additions & 0 deletions mace_osaka24/mace-osaka24-small.sh
@@ -0,0 +1,48 @@
python3 /opt/src/new_mace_0729/mace/mace/cli/run_train.py \
--name="${MODEL_NAME}" \
--train_file="/dataset/train" \
--valid_file="/dataset/val" \
--test_file="/dataset/test" \
--statistics_file="/dataset/statistics.json" \
--loss='universal' \
--energy_weight=1 \
--forces_weight=10 \
--compute_stress=True \
--stress_weight=100 \
--stress_key='stress' \
--eval_interval=1 \
--error_table='PerAtomMAE' \
--model="ScaleShiftMACE" \
--interaction_first="RealAgnosticResidualInteractionBlock" \
--interaction="RealAgnosticResidualInteractionBlock" \
--num_interactions=2 \
--correlation=3 \
--max_ell=3 \
--r_max=4.5 \
--max_L=0 \
--num_channels=128 \
--num_radial_basis=10 \
--MLP_irreps="16x0e" \
--scaling='rms_forces_scaling' \
--num_workers=64 \
--lr=0.005 \
--weight_decay=1e-8 \
--ema \
--ema_decay=0.995 \
--scheduler_patience=5 \
--batch_size=16 \
--valid_batch_size=32 \
--max_num_epochs=200 \
--patience=50 \
--amsgrad \
--device=cuda \
--seed=1 \
--clip_grad=100 \
--keep_checkpoints \
--save_cpu \
--restart_latest \
--log_dir="/mnt/logs/${MODEL_NAME}" \
--model_dir="/mnt/models/${MODEL_NAME}" \
--checkpoints_dir="/mnt/checkpoints/${MODEL_NAME}" \
--results_dir="/mnt/results/${MODEL_NAME}" \
--distributed >> ${RESULT_FILE}
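
All three scripts expand two shell variables that they do not define themselves: `${MODEL_NAME}`, used for `--name` and the output directories under `/mnt`, and `${RESULT_FILE}`, the file to which standard output is appended. The sketch below shows one way these could be supplied from Python and what the output paths expand to. The helper names and the example values are hypothetical, and the launch command is left commented out because it depends on the local setup.

```python
import os


def training_env(model_name, result_file, base_env=None):
    """Return a copy of the environment with the two variables the scripts expand."""
    env = dict(os.environ if base_env is None else base_env)
    env["MODEL_NAME"] = model_name    # used by --name and the /mnt/... directories
    env["RESULT_FILE"] = result_file  # file that the script appends stdout to
    return env


def output_dirs(model_name):
    """Return the output directories the scripts pass to run_train.py."""
    return {
        "log_dir": f"/mnt/logs/{model_name}",
        "model_dir": f"/mnt/models/{model_name}",
        "checkpoints_dir": f"/mnt/checkpoints/{model_name}",
        "results_dir": f"/mnt/results/{model_name}",
    }


# Hypothetical example values.
env = training_env("mace-osaka24-small", "/tmp/mace-osaka24-small.log")
print(output_dirs(env["MODEL_NAME"]))

# To launch a script with this environment (not executed here):
# import subprocess
# subprocess.run(["bash", "mace_osaka24/mace-osaka24-small.sh"], env=env, check=True)
```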
