Error while dp train input.json #4126

Open
umang4002 opened this issue Sep 15, 2024 · 6 comments
@umang4002

Summary

I constructed the data folders with type.raw, type_map.raw, and set.000 containing box.npy, force.npy, energy.npy, and coord.npy. This is the input.json I am using:

{
  "model": {
    "type_map": ["W", "Fe", "Ni", "Co"],
    "descriptor": {
      "type": "se_e2_a",
      "rcut": 6.0,
      "rcut_smth": 0.5,
      "sel": [40, 40, 40, 40],
      "neuron": [10, 20, 40],
      "resnet_dt": false,
      "axis_neuron": 4,
      "seed": 1,
      "_comment": "that's all"
    },
    "fitting_net": {
      "neuron": [100, 100, 100],
      "resnet_dt": true,
      "seed": 1,
      "_comment": "that's all"
    },
    "_comment": "that's all"
  },
  "learning_rate": {
    "type": "exp",
    "decay_steps": 5000,
    "start_lr": 0.001,
    "stop_lr": 3.51e-08,
    "_comment": "that's all"
  },
  "loss": {
    "type": "ener",
    "start_pref_e": 0.02,
    "limit_pref_e": 1,
    "start_pref_f": 1000,
    "limit_pref_f": 1,
    "start_pref_v": 0,
    "limit_pref_v": 0,
    "_comment": "that's all"
  },
  "training": {
    "training_data": {
      "systems": [
        "train_data/set_1/",
        "train_data/set_2/",
        "train_data/set_3/"
      ],
      "batch_size": "auto",
      "_comment": "that's all"
    },
    "validation_data": {
      "systems": [
        "test_data/set_1/",
        "test_data/set_2/",
        "test_data/set_3/"
      ],
      "batch_size": "auto",
      "numb_btch": 1,
      "_comment": "that's all"
    },
    "numb_steps": 100000,
    "seed": 10,
    "disp_file": "lcurve.out",
    "disp_freq": 1000,
    "save_freq": 10000
  }
}

DeePMD-kit Version

DeePMD-kit v2.2.9

Backend and its version

Tensorflow 2.9.0

Python Version, CUDA Version, GCC Version, LAMMPS Version, etc

No response

Details

WARNING:tensorflow:From /home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
DEEPMD INFO Calculate neighbor statistics... (add --skip-neighbor-stat to skip this step)
Traceback (most recent call last):
File "/home/user/anaconda3/envs/deepmd/bin/dp", line 10, in <module>
sys.exit(main())
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd_utils/main.py", line 656, in main
deepmd_main(args)
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/main.py", line 74, in main
train_dp(**dict_args)
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 149, in train
jdata = update_sel(jdata)
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 512, in update_sel
jdata_cpy["model"] = Model.update_sel(jdata, jdata["model"])
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/model/model.py", line 566, in update_sel
return cls.update_sel(global_jdata, local_jdata)
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/model/model.py", line 723, in update_sel
local_jdata_cpy["descriptor"] = Descriptor.update_sel(
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/descriptor/descriptor.py", line 511, in update_sel
return cls.update_sel(global_jdata, local_jdata)
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/descriptor/se.py", line 162, in update_sel
return update_one_sel(global_jdata, local_jdata_cpy, False)
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 479, in update_one_sel
tmp_sel = get_sel(
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 440, in get_sel
_, max_nbor_size = get_nbor_stat(jdata, rcut, one_type=one_type)
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 390, in get_nbor_stat
train_data = get_data(
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 323, in get_data
data = DeepmdDataSystem(
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd_utils/utils/data_system.py", line 100, in __init__
DeepmdData(
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd_utils/utils/data.py", line 76, in __init__
self.atom_type = self._load_type(root)
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd_utils/utils/data.py", line 583, in _load_type
atom_type = (sys_path / "type.raw").load_txt(ndmin=1).astype(np.int32)
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/site-packages/deepmd_utils/utils/path.py", line 160, in load_txt
return np.loadtxt(str(self.path), **kwargs)
File "/home/user/.local/lib/python3.10/site-packages/numpy/lib/npyio.py", line 1373, in loadtxt
arr = _read(fname, dtype=dtype, comment=comment, delimiter=delimiter,
File "/home/user/.local/lib/python3.10/site-packages/numpy/lib/npyio.py", line 1016, in _read
arr = _load_from_filelike(
File "/home/user/anaconda3/envs/deepmd/lib/python3.10/codecs.py", line 322, in decode
(result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x93 in position 0: invalid start byte

@njzjz
Member

njzjz commented Sep 16, 2024

Did you generate the file using an encoding other than UTF-8?
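
One quick check, offered as a sketch rather than a confirmed diagnosis: byte 0x93 at position 0 is also the first byte of NumPy's binary .npy header (`\x93NUMPY`), so it may be worth ruling out that type.raw was written with np.save instead of as plain text. The helper names `looks_like_npy` and `is_utf8_text` are hypothetical, and the path is only taken from the input.json above.

```python
from pathlib import Path

NPY_MAGIC = b"\x93NUMPY"  # first six bytes of every .npy file


def looks_like_npy(path):
    """Return True if the file starts with the NumPy binary magic string."""
    with open(path, "rb") as fh:
        return fh.read(len(NPY_MAGIC)) == NPY_MAGIC


def is_utf8_text(path):
    """Return True if the whole file decodes as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    # Assumed location; point this at the system directory that fails.
    target = Path("train_data/set_1/type.raw")
    if target.exists():
        print("binary .npy header:", looks_like_npy(target))
        print("valid UTF-8 text:", is_utf8_text(target))
```

If the first check prints True, the file is binary and would need to be rewritten as plain text (one integer per line) before np.loadtxt can read it.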

@umang4002
Author

Is the number format in coord.npy (73.57 versus 7.357e+01) causing this error? I am using dump files from LAMMPS to create coord.npy, force.npy, box.npy, and energy.npy for the corresponding frames.

@umang4002
Author

With dpdata, the generated coord file shows values in the format 7.357e+01, but when I create the same files manually the values appear in the former format (73.57). Can this be the source of the error?
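
As a side note on the notation question: a .npy file stores binary floating-point values, so 73.57 and 7.357e+01 are two printed forms of the same number and the notation itself is not saved. A minimal sketch, assuming NumPy is installed and using a temporary, hypothetical coord.npy:

```python
import os
import tempfile

import numpy as np

# The two spellings parse to the identical float64 value.
a = float("73.57")
b = float("7.357e+01")
same_value = a == b

# Round-trip through a binary .npy file: the printed notation is not stored.
coords = np.array([[73.57, 7.357e+01]])
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "coord.npy")  # hypothetical file name
    np.save(path, coords)
    loaded = np.load(path)

round_trip_equal = np.array_equal(coords, loaded)
print(same_value, round_trip_equal)
```

How the values print depends on the NumPy print options and the magnitude range within the array, not on how the file was created.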

@umang4002
Author

My dump file is in the format id Type x y z fx fy fz.
However, running

import dpdata

dsys = dpdata.System("/content/dump.w_ni_fe_1900.lammpstrj", fmt="lammps/dump")
dsys.to("deepmd/npy", "deepmd_data", set_size=dsys.get_nframes())

only creates coord.npy, type.raw, type_map.raw, and box.npy.

Even though the dump file contains fx fy fz, the command above does not read the forces, so no force.npy file is created.

To work around this, I have to extract the corresponding data manually, and that is what results in the error. The shape of the data is the same after using update and after manually extracting the data from the frames.

I think the cause is that https://github.com/deepmodeling/dpdata/tree/master/dpdata/lammps/dump.py does not contain any code for reading forces when using the lammps/dump format.
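
If the installed dpdata version does not pick up the force columns, they can be read directly from the dump text. Below is a minimal sketch, assuming an orthogonal box and an `ITEM: ATOMS` header that names the columns id, type, x, y, z, fx, fy, fz. The function `read_dump_frames` is a hypothetical helper, not part of dpdata, and the two-atom sample is made up for illustration.

```python
import numpy as np


def read_dump_frames(lines):
    """Parse LAMMPS text dump lines into per-frame arrays.

    Assumes an orthogonal box and ATOMS columns id, type, x, y, z, fx, fy, fz
    (any order). Returns (types, coords, forces, boxes); atoms sorted by id.
    """
    types, coords, forces, boxes = None, [], [], []
    natoms = 0
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        if line.startswith("ITEM: NUMBER OF ATOMS"):
            natoms = int(lines[i + 1])
            i += 2
        elif line.startswith("ITEM: BOX BOUNDS"):
            # Three "lo hi" lines for an orthogonal box.
            bounds = np.array(
                [lines[i + k].split()[:2] for k in (1, 2, 3)], dtype=float
            )
            lengths = bounds[:, 1] - bounds[:, 0]
            boxes.append(np.diag(lengths).reshape(9))
            i += 4
        elif line.startswith("ITEM: ATOMS"):
            cols = [c.lower() for c in line.split()[2:]]
            rows = np.array(
                [lines[i + 1 + k].split() for k in range(natoms)], dtype=float
            )
            rows = rows[np.argsort(rows[:, cols.index("id")])]

            def pick(names):
                return rows[:, [cols.index(n) for n in names]]

            coords.append(pick(["x", "y", "z"]).reshape(-1))
            forces.append(pick(["fx", "fy", "fz"]).reshape(-1))
            if types is None:
                # LAMMPS types start at 1; DeePMD-kit type indices start at 0.
                types = rows[:, cols.index("type")].astype(int) - 1
            i += 1 + natoms
        else:
            i += 1
    return types, np.array(coords), np.array(forces), np.array(boxes)


# Made-up two-atom frame, only to exercise the parser.
sample = """ITEM: TIMESTEP
0
ITEM: NUMBER OF ATOMS
2
ITEM: BOX BOUNDS pp pp pp
0.0 10.0
0.0 10.0
0.0 10.0
ITEM: ATOMS id type x y z fx fy fz
2 2 1.0 2.0 3.0 0.1 0.2 0.3
1 1 4.0 5.0 6.0 -0.1 -0.2 -0.3
""".splitlines()

types, coords, forces, boxes = read_dump_frames(sample)
print(types, coords.shape, forces.shape, boxes.shape)
```

Energies are not part of this dump format, so they still have to come from another source such as the thermo output.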

@umang4002
Author

This is the data format I am using for training:

energy = [-245065.86684099, -245043.93657998, -245078.53004939,
-245036.34437463, -245108.69610125, -245173.16775741,
-245215.20799936, -245135.88491991, -245208.40294702,........,]

force =
array([[ 0.722155 , 1.66346 , 0.182122 , ..., -0.116272 , 0.485847 ,
-1.33518 ],
[-0.83064 , -1.31154 , 0.387978 , ..., 6.23634 , 6.10007 ,
-1.11442 ],
[-0.142134 , 1.41269 , -2.4242 , ..., -4.30655 , 0.709387 ,
-1.23845 ],
...,
[ 0.0186577, 0.150581 , -0.727243 , ..., -0.502336 , -0.660103 ,
-1.29336 ],
[ 1.28235 , -0.375996 , 0.200156 , ..., 1.17807 , 1.61571 ,
0.646577 ],
[ 1.18955 , 1.91557 , 0.920073 , ..., -1.23146 , 1.72745 ,
3.02438 ]])

coord = ([[69.1717 , 59.716 , 4.91938, ..., 57.8245 , 48.3722 , 14.392 ],
[69.5761 , 60.1495 , 74.4595 , ..., 57.8314 , 48.1864 , 14.312 ],
[69.5075 , 60.0246 , 5.08948, ..., 57.8827 , 48.7315 , 14.1056 ],
...,
[68.4911 , 59.1742 , 73.6108 , ..., 56.6714 , 46.084 , 12.5388 ],
[68.1387 , 59.1014 , 73.3706 , ..., 56.4182 , 46.2359 , 12.5394 ],
[68.2556 , 58.8982 , 73.3789 , ..., 56.6145 , 45.388 , 12.4219 ]])

box = array([[69.60792272, 0. , 0. , 0. , 69.60792272,
0. , 0. , 0. , 69.60792272],...............................]])
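
For reference, a minimal sketch of saving arrays like these in the deepmd/npy layout. The detail that matters for the traceback above is that type.raw and type_map.raw are plain-text files, while the files inside set.000 are binary .npy files. The function name `write_system` and the toy values are hypothetical.

```python
import os
import tempfile

import numpy as np


def write_system(root, types, type_map, coord, box, energy, force):
    """Write one system in the deepmd/npy layout.

    types: (natoms,) integer type indices starting at 0
    coord, force: (nframes, natoms * 3); box: (nframes, 9); energy: (nframes,)
    """
    nframes, natoms = coord.shape[0], len(types)
    assert coord.shape == (nframes, natoms * 3)
    assert force.shape == coord.shape
    assert box.shape == (nframes, 9)
    assert energy.shape == (nframes,)

    set_dir = os.path.join(root, "set.000")
    os.makedirs(set_dir, exist_ok=True)
    # Plain-text files: one integer (or one element name) per line.
    np.savetxt(os.path.join(root, "type.raw"), types, fmt="%d")
    with open(os.path.join(root, "type_map.raw"), "w", encoding="utf-8") as fh:
        fh.write("\n".join(type_map) + "\n")
    # Binary NumPy files for the per-frame data.
    for name, arr in (
        ("coord", coord),
        ("box", box),
        ("energy", energy),
        ("force", force),
    ):
        np.save(os.path.join(set_dir, name + ".npy"), arr)


# Toy example: 2 frames, 2 atoms, written to a temporary directory.
tmp = tempfile.mkdtemp()
write_system(
    tmp,
    types=np.array([0, 1]),
    type_map=["W", "Fe", "Ni", "Co"],
    coord=np.zeros((2, 6)),
    box=np.tile(np.diag([10.0, 10.0, 10.0]).reshape(9), (2, 1)),
    energy=np.zeros(2),
    force=np.zeros((2, 6)),
)
# Same kind of read that appears in the traceback: np.loadtxt on type.raw.
reloaded = np.loadtxt(os.path.join(tmp, "type.raw"), ndmin=1).astype(np.int32)
print(reloaded)
```

The shape assertions at the top of the function fail early if a manually extracted array does not match the number of atoms in type.raw.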

@umang4002
Author

umang4002 commented Sep 17, 2024

However, I ran another training with different data, and it ran without this error.

This is the data that runs properly:

energy = [-245065.86684099, -245043.93657998, -245078.53004939,
-245036.34437463, -245108.69610125, -245173.16775741,
-245215.20799936, -245135.88491991, -245208.40294702,........,]

force =
array([[ 3.6391 , 2.20297 , 3.04376 , ..., 0.165536 , 0.121503 ,
-0.939756 ],
[-0.955997 , -1.82867 , -0.320071 , ..., 5.49932 , -0.213522 ,
2.20919 ],
[ 1.86836 , 2.75882 , -0.457042 , ..., 4.37476 , -1.67104 ,
-0.68138 ],
...,
[-0.111075 , -0.0626644, -0.277481 , ..., -0.744622 , 0.755648 ,
-0.780645 ],
[-0.399858 , -1.18326 , -1.75817 , ..., -1.61044 , -0.189367 ,
1.03649 ],
[-0.48448 , -3.57668 , -0.724496 , ..., 4.93316 , 1.73451 ,
-0.469619 ]])

coord = array([[6.70427000e-02, 2.46912000e-01, 2.05648000e-02, ...,
6.99589000e+01, 6.99504000e+01, 6.97164000e+01],
[2.63638545e-01, 4.02347945e-01, 7.08109239e+01, ...,
7.02872239e+01, 7.04596239e+01, 6.95565239e+01],
[2.59688357e-01, 6.82876357e-01, 7.07703334e+01, ...,
7.02759334e+01, 6.91589334e+01, 7.01310334e+01],
...,
[6.93155176e+01, 6.87564176e+01, 6.91942176e+01, ...,
6.66071176e+01, 6.81565176e+01, 7.05844176e+01],
[6.93936865e+01, 6.92003865e+01, 6.94609865e+01, ...,
6.68730865e+01, 6.78987865e+01, 7.03048865e+01],
[6.92384739e+01, 6.84930739e+01, 6.88023739e+01, ...,
6.60239739e+01, 6.76421739e+01, 3.05772869e-01]])

box = array([[71.375 , 0. , 0. , 0. , 71.375 ,
0. , 0. , 0. , 71.375 ],,...............................]])
