[TYPO] #3635

Closed
wants to merge 271 commits into from

Conversation

njzjz
Member

@njzjz njzjz commented Apr 2, 2024

No description provided.

njzjz and others added 30 commits January 28, 2024 16:38
Set the default `save_ckpt` to `model.ckpt` as the prefix. When saving
checkpoints, `model.ckpt-100.pt` will be saved, and `model.ckpt.pt` will
be symlinked to `model.ckpt-100.pt`. A dedicated `checkpoint` file will
record `model.ckpt-100.pt`.

This keeps the same behavior as the TF backend. One can do the below
using the PT backend just like the TF backend:

```sh
dp --pt train input.json
# one can cancel the training before it finishes
dp --pt freeze
```
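
The save-and-symlink behavior described above could look roughly like the sketch below. It is only an illustration, not the code in this PR: the helper name `save_checkpoint`, the byte payload, and the one-line layout of the `checkpoint` file are assumptions.

```python
import os
import tempfile
from pathlib import Path


def save_checkpoint(workdir: Path, step: int, payload: bytes,
                    prefix: str = "model.ckpt") -> Path:
    """Write `<prefix>-<step>.pt`, repoint `<prefix>.pt` at it, and record it."""
    target = workdir / f"{prefix}-{step}.pt"
    target.write_bytes(payload)  # stand-in for the real serialization call

    link = workdir / f"{prefix}.pt"
    if link.is_symlink() or link.exists():
        link.unlink()  # drop the link left by the previous save
    os.symlink(target.name, link)  # relative link, resolved inside workdir

    # Assumed layout: the `checkpoint` file holds the latest checkpoint name.
    (workdir / "checkpoint").write_text(target.name + "\n")
    return target


with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp)
    save_checkpoint(work, 100, b"step-100")
    save_checkpoint(work, 200, b"step-200")
    latest_name = os.readlink(work / "model.ckpt.pt")
    latest_bytes = (work / "model.ckpt.pt").read_bytes()
    recorded = (work / "checkpoint").read_text().strip()
```

Because the stable name always points at the newest numbered file, a later step can open `model.ckpt.pt` even if training was cancelled early.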

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
…eling#3195)

Fix
https://github.com/deepmodeling/deepmd-kit/security/code-scanning/2096

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
per discussion.

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Co-authored-by: Han Wang <92130845+wanghan-iapcm@users.noreply.github.com>
```
- source
  - tests
     - common
     - tf
     - pt
```

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
Fix deepmodeling#3121.

There are TODOs:
(1) PyTorch-backend specific features and arguments;
(2) Python interface installation. Currently, the TensorFlow backend is
always installed, and I am considering rewriting the logic;
(3) Unsupported features - write docs when implemented.

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
The default one from PyPI is for CU12.

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
…pmodeling#3201)

If so, throw the following error:
```
-- PyTorch CXX11 ABI: 0
CMake Error at CMakeLists.txt:162 (message):
  PyTorch CXX11 ABI mismatch TensorFlow: 0 != 1
```
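
For illustration only, the same comparison can be written in Python. The actual check lives in `CMakeLists.txt`; the function name `check_cxx11_abi` is hypothetical.

```python
def check_cxx11_abi(pytorch_abi: int, tensorflow_abi: int) -> None:
    """Raise if the two libraries were built with different CXX11 ABI flags."""
    # In a real environment the flags might be read from
    # torch.compiled_with_cxx11_abi() and tf.sysconfig.CXX11_ABI_FLAG;
    # here they are passed in so the sketch has no heavy dependencies.
    if int(pytorch_abi) != int(tensorflow_abi):
        raise RuntimeError(
            f"PyTorch CXX11 ABI mismatch TensorFlow: "
            f"{int(pytorch_abi)} != {int(tensorflow_abi)}"
        )
```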

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
…deling#3200)

Fix deepmodeling#3120.

One can disable building the TensorFlow backend during `pip install` by
setting `DP_ENABLE_TENSORFLOW=0`.
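
A build script could read the switch roughly as sketched below. The helper `tensorflow_enabled` is hypothetical, and treating an unset variable as "enabled" is an assumption based on the TensorFlow backend being built by default.

```python
import os
from typing import Mapping, Optional


def tensorflow_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Hypothetical helper: build the TF backend unless DP_ENABLE_TENSORFLOW=0."""
    env = os.environ if env is None else env
    # Assumption: an unset variable means the backend stays enabled.
    return env.get("DP_ENABLE_TENSORFLOW", "1") != "0"
```

Passing the environment as an argument keeps the helper easy to test without touching the real process environment.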

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
…g net. (deepmodeling#3199)

- add dp model format (backend independent definition) for the fitting
- refactor torch support, compatible with dp model format
- fix mlp issue: the idt should only be used when a skip connection is
available.
- add tools `to_numpy_array` and `to_torch_tensor`.
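
The `idt` fix can be pictured with a small NumPy sketch. It is not the layer from this PR: the name `layer_forward`, the `tanh` activation, and the rule that a skip connection exists when the output width equals or doubles the input width are assumptions made for illustration.

```python
import numpy as np


def layer_forward(x, w, b, idt=None, resnet=True):
    """One dense layer; `idt` scales the output only when a skip connection applies."""
    y = np.tanh(x @ w + b)
    n_in, n_out = w.shape
    if resnet and n_out == n_in:
        skip = x
    elif resnet and n_out == 2 * n_in:
        skip = np.concatenate([x, x], axis=-1)
    else:
        skip = None
    if skip is None:
        return y  # no skip connection available, so idt is ignored
    if idt is not None:
        y = y * idt
    return y + skip
```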

---------

Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
This PR fixes the GPU UTs.
Delete `PREPROCESS_DEVICE` in the torch data preprocessing and use the training
`DEVICE` instead, which will be removed after the dataset is reformatted.

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Co-authored-by: Han Wang <92130845+wanghan-iapcm@users.noreply.github.com>
Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
Today [GitHub introduced the new M1
runners](https://github.blog/changelog/2024-01-30-github-actions-introducing-the-new-m1-macos-runner-available-to-open-source/),
making it possible to build macos-arm64 wheels without cross-building.

Remove the old hacks for cross-building.
Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Migrated from this
[PR](dptech-corp/deepmd-pytorch#174). This
reimplements the PairTab model in PyTorch.

Notes:

1. Unlike the TensorFlow version, the PyTorch version abstracts
away all the post-energy conversion operations (force, virial).
2. Added extrapolation when `rcut` > `rmax`. The PyTorch version
overwrites the energy beyond the extrapolation endpoint with `0`. These features
are not available in the TensorFlow version. The extrapolation uses a
cubic spline form; the first-order derivative at the starting point is
estimated from the last two rows of the user-defined table. See the example
below:


![img_v3_027k_b50c690d-dc2d-4803-bd2c-2e73aa3c73fg](https://github.com/deepmodeling/deepmd-kit/assets/137014849/f3efa4d3-795e-4ff8-acdc-642227f0e19c)


![img_v3_027k_8de38597-ef4e-4e5b-989e-dbd13cc93fag](https://github.com/deepmodeling/deepmd-kit/assets/137014849/493da26d-f01d-4dd0-8520-ea2d84e7b548)


![img_v3_027k_f8268564-3f5d-49e6-91d6-169a61d9347g](https://github.com/deepmodeling/deepmd-kit/assets/137014849/b8ad4d4d-a4a4-40f0-94d1-810006e7175b)


![img_v3_027k_3966ef67-dd5e-4f48-992e-c2763311451g](https://github.com/deepmodeling/deepmd-kit/assets/137014849/27f31e79-13c8-4ce8-9911-b4cc0ac8188c)
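
One way to picture the extrapolation is a cubic Hermite segment from the last table point down to zero at `rcut`. This is a sketch under stated assumptions, not the code from this PR: the function name `extrapolate_tail` is hypothetical, and the zero value and zero slope at `rcut` are assumed boundary conditions.

```python
import numpy as np


def extrapolate_tail(r_table, u_table, rcut, r):
    """Cubic Hermite tail from the last table point to zero energy at `rcut`.

    Intended for r >= r_table[-1]; the table itself covers smaller r.
    """
    r = np.asarray(r, dtype=float)
    r_max, u_max = r_table[-1], u_table[-1]
    # Slope at the starting point, estimated from the last two table rows.
    slope = (u_table[-1] - u_table[-2]) / (r_table[-1] - r_table[-2])
    span = rcut - r_max
    t = np.clip((r - r_max) / span, 0.0, 1.0)
    # Hermite basis terms; the end value and end slope are both zero (assumed),
    # so only the start-value and start-slope terms remain.
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    u = h00 * u_max + h10 * span * slope
    # Energy beyond the extrapolation endpoint is overwritten with 0.
    return np.where(r >= rcut, 0.0, u)
```

With this form the energy and its first derivative are continuous at `rmax`, and the energy reaches zero smoothly at `rcut`.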

---------

Co-authored-by: Anyang Peng <aisi_ap@Anyangs-Laptop.local>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
…ng#3208)

Features:
- abstract base classes for atomic model, fitting and descriptor.
- dp model format for atomic models
- dp model format for models.
- torch support for atomic model format. 
- torch support `fparam` and `aparam`.

This PR also introduces the following updates:
- support region and nlist in numpy code.
- class decorator like `fitting_check_output` gives human readable class
names.
- support int types in precision dict.
- fix descriptor interfaces.
- refactor the torch atomic model implementation; this introduces dirty hacks to be fixed.
- provide `format_nlist` that formats the nlist in the `forward_lower` method.
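
As a rough picture of what formatting a neighbor list can mean, the NumPy sketch below pads or truncates the list to a fixed number of columns. The name `format_nlist_sketch`, the use of `-1` for empty slots, and the keep-the-nearest truncation rule are assumptions for illustration; the `format_nlist` added in this PR may differ.

```python
import numpy as np


def format_nlist_sketch(coord, nlist, nsel):
    """Pad or truncate `nlist` (nloc x nnei, -1 = empty) to exactly `nsel` columns.

    `coord` has shape (nall, 3); its first nloc rows are the local atoms.
    """
    nloc, nnei = nlist.shape
    if nnei < nsel:
        # Too few columns: pad with empty slots.
        pad = -np.ones((nloc, nsel - nnei), dtype=nlist.dtype)
        return np.concatenate([nlist, pad], axis=1)
    if nnei == nsel:
        return nlist
    # Too many columns: keep the nsel closest neighbors; empty slots sort last.
    center = coord[:nloc, None, :]
    neigh = coord[np.where(nlist < 0, 0, nlist)]
    dist = np.linalg.norm(neigh - center, axis=-1)
    dist = np.where(nlist < 0, np.inf, dist)
    order = np.argsort(dist, axis=1, kind="stable")[:, :nsel]
    return np.take_along_axis(nlist, order, axis=1)
```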

Known limitations:
- torch atomic model has dirty hacks
- interfaces for descriptor, fitting and model statistics were not
considered; they should be fixed in future PRs.

Will be fixed
-  [x] dp model module path is a mess and needs to be refactored.
-  [x] nlist consistency should be checked; if it is inconsistent, format the nlist.
-  [x] doc strings.
-  [x] `fparam` and `aparam` support.

---------

Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
Fix deepmodeling#3214.

In the gmx patch file, `${TENSORFLOW_ROOT}` is used rather than
`${TensorFlow_LIBRARY_PATH}` or `${TENSORFLOW_INCLUDE_DIRS}`, so the
fastest workaround is to set `${TENSORFLOW_ROOT}`.


https://github.com/deepmodeling/deepmd-kit/blob/eb9b2efedf4efc946894800a0d7abf5056f4bb7a/source/gmx/patches/2020.2/CMakeLists.txt.patch.in#L14-L18

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Fix deepmodeling#3214.

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Fix deepmodeling#3045. All memory leaks have been fixed!

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
This PR provides a backend-independent implementation of PairTabModel
in `numpy`. Cross-framework `serialization` and
`deserialization` are also added.
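
The idea behind cross-framework serialization can be sketched as a round trip through a plain dictionary. The class `TabulatedPair` and the dictionary keys are hypothetical, not the PairTabModel interface from this PR.

```python
import numpy as np


class TabulatedPair:
    """Hypothetical stand-in for a backend-independent model."""

    def __init__(self, table, rcut):
        self.table = np.asarray(table, dtype=float)
        self.rcut = float(rcut)

    def serialize(self) -> dict:
        # Only plain Python and NumPy objects, so any backend can rebuild from it.
        return {"type": "TabulatedPair", "rcut": self.rcut, "table": self.table.copy()}

    @classmethod
    def deserialize(cls, data: dict) -> "TabulatedPair":
        data = dict(data)  # do not mutate the caller's dictionary
        data.pop("type", None)
        return cls(**data)


original = TabulatedPair([[0.1, 1.0], [0.2, 0.5]], rcut=6.0)
restored = TabulatedPair.deserialize(original.serialize())
```

A PyTorch or TensorFlow implementation would expose the same pair of methods and convert the arrays to its own tensor type on load.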

---------

Co-authored-by: Anyang Peng <aisi_ap@Anyangs-Laptop.local>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Add shape hints to the docstrings.

Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
…ling#3226)

To be consistent with TF, as discussed in
deepmodeling#3213 (comment).
Old PT models are expected to be incompatible.

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
We can change the format of the global logger in the future if the
additional information is helpful (e.g., time, path, etc).
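
For reference, a richer format with time and path could be expressed with the standard `logging` module as sketched below. The logger name `deepmd` and the exact format string are assumptions, not the format chosen in this PR.

```python
import logging

# Assumed format: timestamp, logger name, level, source location, message.
formatter = logging.Formatter(
    "[%(asctime)s] %(name)s %(levelname)s %(pathname)s:%(lineno)d %(message)s"
)

# Build a record by hand so the formatted line can be inspected directly.
record = logging.LogRecord(
    name="deepmd",
    level=logging.INFO,
    pathname="train.py",
    lineno=10,
    msg="step %d",
    args=(100,),
    exc_info=None,
)
line = formatter.format(record)
```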

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Avoid installing tensorflow as a build requirement when
`DP_ENABLE_TENSORFLOW` is `0`.

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
- move all dpmodel-related UTs to a separate directory
- split the large test_model_format_utils.py into tests for different
modules.

---------

Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
wanghan-iapcm and others added 14 commits March 27, 2024 12:50
Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
deepmodeling#3578
deepmodeling#3579

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
deepmodeling#3475

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Only conduct file renaming in this PR.

---------

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
The wrong type hint appeared in
f5c67af, very old...

Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
…ng#3626)

- also add missing UT for execute_all

---------

Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
- add UT for it.
- at the moment, only energy is supported in the `base_atomic_model`.
Handling of statistics for multiple outputs will be implemented in a future PR.

---------

Co-authored-by: Han Wang <wang_han@iapcm.ac.cn>
Signed-off-by: Jinzhe Zeng <jinzhe.zeng@rutgers.edu>
10 participants