Welcome to the repository of BAMBOO! This repository hosts the source code for creating a machine learning-based force field (MLFF) for molecular dynamics (MD) simulations of lithium battery electrolytes. Whether you're interested in simulating lithium battery electrolytes or other types of liquids, BAMBOO provides a robust and versatile solution.
Thank you for considering BAMBOO for your research needs. We are thrilled to be a part of your scientific journey and are eager to see how our project contributes to your outstanding results.
This section will guide you on how to obtain and set up BAMBOO on your local machine for development and testing purposes.
To get started with BAMBOO, please ensure that you meet the following requirements:
- LAMMPS: stable_2Aug2023_update3 (Tested branch.)
- CUDA: 12+
- Pytorch: 2.0+
Once you have satisfied the above prerequisites, you are ready to proceed to the installation steps.
To get started, clone the BAMBOO repository to your local machine using the following command:
git clone https://github.com/bytedance/bamboo.git
With this step, you get BAMBOO on your local system, ready for use.
To initialize the environment and retrieve the LAMMPS source code, follow these steps:
cd pair
bash ./init_compile.sh
cd lammps
bash ./build.sh
The build.sh script is pre-configured for the NVIDIA GeForce RTX 4090 GPU. If you are using a different GPU, you may need to adjust the ARCH variable within the script to match your specific hardware. Refer to the NVIDIA CUDA Toolkit documentation for details on selecting the correct architecture flags.
The Libtorch version is currently specified in the init_compile.sh script. If you require a different version of Libtorch, you will need to update this script accordingly.
To demonstrate the capabilities and usage of BAMBOO, we have included a small but self-contained dataset featuring key components used in electrolyte for lithium batteries. This dataset includes:
- Dimethyl carbonate (DMC)
- Ethylene carbonate (EC)
- Lithium ions (Li+)
- Hexafluorophosphate ions (PF6-)
To get the dataset, you need:
-
Visit the following links to download the datasets: Demo data
-
After downloading, copy the
train_data.pt
andval_data.pt
into thedata
directory of the project. Once the datasets are properly placed, you can proceed with the following examples.
As we focus on simulating an electrolyte composed of DMC, EC, and LiPF6, we also provide:
- Initial conformation file:
in.data
in folderbenchmark
, which contains the starting structure for MD simulations. - Input file for LAMMPS:
in.lammps
in folderbenchmark
, which is prepared to start simulations using LAMMPS.
These resources are designed to help users quickly set up BAMBOO and run simulations based on MLFF to explore the behavior of lithium battery electrolytes.
Follow these steps to train a MLFF using BAMBOO:
-
Navigate to the project directory
Replace
<path-to-your-installation>
with the actual path where you have installed BAMBOO, then execute the following command to move into that directory:cd <path-to-your-installation>
-
Train a model
Start the training process by running:
python3 -m train.train --config configs/train_config/config.json
This command uses a configuration file located at
configs/train_config/config.json
, where the paremeters can be changed as you need. After training, a new folder named after thejob_name
variable in your configuration file will be created inside the<path-to-your-installation>/train
directory. This folder will contain the training logs and checkpoint models saved as.pt
files.
To perform a MD simulation using a BAMBOO model, follow these steps:
-
Create a folder for MD simulation and prepare the necessary files
Navigate to your BAMBOO directory and make a new folder for MD simulations. Copy the
in.data
andin.lammps
files from<path-to-your-installation>/data
into this directory:cd <path-to-your-installation> mkdir simulation && cd simulation cp ../benchmark/* .
-
Configure the simulation settings
Modify the
benchmark.pt
inin.lammps
file to point to the path of.pt
file for the simulation. -
Run a MD simulation
Execute a MD simulation by LAMMPS:
lmp -k on g 1 -sf kk -in in.lammps -log log.lammps > out.log 2>&1
The
in.lammps
file can be configured for your simulation needs. The.pt
file from any MLFF generated from training, ensembling, or alignment, can be used to run the MD simulations.
To run ensemble and alignment processes, frames from MD trajectories are required. Here's a guide to generating these frames:
-
Navigate to the project directory
Execute the following command to move into that directory:
cd <path-to-your-installation>
-
Extract the frames from MD trajectories
Here is an example command to extract frames from MD trajectories:
python3 -m utils.load_traj --job_folder <path-to-your-simulation> --output_folder <path-to-save-frames> --mixture_name <your-mixture-name>
The mixture-name will be used in the alignment to instruct which system is aligned.
Averaging multiple replicate MLFF models into an ensembled one can help reduce variance and improve the accuracy of predictions. Follow these steps to ensemble several models trained from your dataset:
-
Navigate to the project directory
Execute the following command to move into that directory:
cd <path-to-your-installation>
-
Modify the config file
To ensemble your models, you need to modify the
config.json
file appropriately. This file should clearly define the paths to the models you intend to ensemble, the model based on which the changes of paremeters will be made, and the directories containing the MD frames used for ensembling. Here, we give an example ofconfig.json
.{ "job_name": "ensemble_bamboo_community", "training_data_path": "<path-to-your-installation>/data/train_data.pt", "validation_data_path": "<path-to-your-installation>/data/val_data.pt", "batch_size": 512, "models": ["<path-to-your-model>/<your-model1-name>.pt", "<path-to-your-model>/<your-model2-name>.pt", "<path-to-your-model>/<your-model3-name>.pt"], "frame_directories": ["<path-to-your-frames>"], "ensemble_model": "<path-to-your-model>/<your-ensemble-model>.pt", "validation_split_ratio": 0.1, "lr": 1e-6, "epochs": 50, "scheduler_gamma": 0.99, "validation_interval": 10, "energy_ratio": 0.3, "force_ratio": 1.0, "virial_ratio": 0.1, "bulk_energy_ratio": 0.01, "bulk_force_ratio": 3.0, "bulk_virial_ratio": 0.01, "max_frames_per_mixture": 960, "frame_validation_interval": 3 }
In this file, the
models
is a list containing all the paths of models you intend to ensemble. Theframe_direcories
is a list containing all the paths of MD frames used. Theensemble_model
is the path of the based-model, whose parameters will change. -
Ensemble the models
Start the ensemble process by running:
python3 -m train.ensemble --config configs/ensemble_config/config.json
After ensembling, a new folder named after the
job_name
variable in your configuration file will be created inside the<path-to-your-installation>/ensemble
directory. This folder will contain the training logs and checkpoint models saved as.pt
files.
Note: To create an ensemble model, you need at least three different models.
BAMBOO offers functionality to finetune the model's predictions by adjusting parameters such as pressure, which is referred to as the alignment process. For example, if you need to change the model's predicted pressure by dP = -2000 Pa, follow these specific steps:
-
Navigate to the project directory
Execute the following command to move into that directory:
cd <path-to-your-installation>
-
Modify the config file
To finetune your models by the alignment, you need to modify the
config.json
file appropriately. This file should clearly define the paths to the model you intend to finetune, and the directories containing the MD frames used for alignment. Here, we give an example ofconfig.json
.{ "job_name": "alignment_bamboo_community", "training_data_path": "<path-to-your-installation>/data/train_data.pt", "validation_data_path": "<path-to-your-installation>/data/val_data.pt", "model": "<path-to-your-model>/<your-alignment-model>.pt", "frame_directories": ["<path-to-your-frames>"], "mixture_names": ["<your-mixture-name>"], "delta_pressure": [-2000], "energy_ratio": 0.3, "force_ratio": 1.0, "virial_ratio": 0.1, "dipole_ratio": 3.0, "bulk_energy_ratio": 1e2, "bulk_force_ratio": 1e6, "bulk_virial_ratio": 3e3, "batch_size": 512, "epochs": 30, "frame_val_interval": 3, "max_frame_per_mixture": 30, "lr": 1e-12, "scheduler_gamma": 0.99 }
The
mixture_names
is a list that includes the names of the mixtures corresponding to the frames, which is set during generating frames. Thedelta_pressure
is a list that contains the values of dP for each mixture. -
Finetune the model by the alignment process
Start the alignment process by running:
python3 -m train.alignment --config configs/alignment_config/config.json
After alignment, a new folder named after the
job_name
variable in your configuration file will be created inside the<path-to-your-installation>/alignment
directory. This folder will contain the training logs and checkpoint models saved as.pt
files.
We have provided the model we trained that used for the data reported in our paper, which is located in the benchmark
folder and named benchmark.pt
. If you wish to reproduce the results mentioned in the paper, you can use this model.
We welcome contributions to BAMBOO! If you have suggestions or improvements, please refers to CONTRIBUTING.md
If you use BAMBOO in your research, please cite: BAMBOO: a predictive and transferable machine learning force field framework for liquid electrolyte development.
This project is licensed under the GNU General Public License, Version 2.