DPA-2.2.0 Q2 : RuntimeError: Unexpected result type <class 'dict'> in zero-shot #4033
BianTieyuan asked this question in Q&A (answered by BianTieyuan)
Hi developers,
(base) [polyucmp@localhost DPA-2-2024Q2]$ dp --pt change-bias OpenLAM_2.2.0_27heads_beta3.pt -s GST_GAP_22 --model-branch Domains_SemiCond
To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, DP_INTRA_OP_PARALLELISM_THREADS, and DP_INTER_OP_PARALLELISM_THREADS. See https://deepmd.rtfd.io/parallelism/ for more information.
[2024-07-30 11:17:30,839] DEEPMD INFO DeePMD version: 3.0.0b3
[2024-07-30 11:17:32,824] DEEPMD INFO Changing out bias for model Domains_SemiCond.
[2024-07-30 11:17:34,832] DEEPMD INFO Packing data for statistics from 89 systems
[2024-07-30 11:17:38,850] DEEPMD INFO If you encounter the error 'an illegal memory access was encountered', this may be due to a TensorFlow issue. To avoid this, set the environment variable DP_INFER_BATCH_SIZE to a smaller value than the last adjusted batch size. The environment variable DP_INFER_BATCH_SIZE controls the inference batch size (nframes * natoms).
[2024-07-30 11:17:39,318] DEEPMD INFO Adjust batch size from 1024 to 2048
[2024-07-30 11:17:39,960] DEEPMD INFO Adjust batch size from 2048 to 4096
[2024-07-30 11:17:42,621] DEEPMD INFO Adjust batch size from 4096 to 8192
[2024-07-30 11:17:44,142] DEEPMD INFO Adjust batch size from 8192 to 16384
[2024-07-30 11:17:53,832] DEEPMD INFO Adjust batch size from 16384 to 8192
Traceback (most recent call last):
File "/run/media/polyucmp/hdd1/BIAN_Tieyuan/software/deepmd-v3.0.0b3/build/bin/dp", line 10, in <module>
sys.exit(main())
^^^^^^
File "/run/media/polyucmp/hdd1/BIAN_Tieyuan/software/deepmd-v3.0.0b3/build/lib/python3.11/site-packages/deepmd/main.py", line 923, in main
deepmd_main(args)
File "/run/media/polyucmp/hdd1/BIAN_Tieyuan/software/deepmd-v3.0.0b3/build/lib/python3.11/site-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 346, in wrapper
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/run/media/polyucmp/hdd1/BIAN_Tieyuan/software/deepmd-v3.0.0b3/build/lib/python3.11/site-packages/deepmd/pt/entrypoints/main.py", line 575, in main
change_bias(FLAGS)
File "/run/media/polyucmp/hdd1/BIAN_Tieyuan/software/deepmd-v3.0.0b3/build/lib/python3.11/site-packages/deepmd/pt/entrypoints/main.py", line 511, in change_bias
updated_model = training.model_change_out_bias(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/run/media/polyucmp/hdd1/BIAN_Tieyuan/software/deepmd-v3.0.0b3/build/lib/python3.11/site-packages/deepmd/pt/train/training.py", line 1277, in model_change_out_bias
_model.change_out_bias(
File "/run/media/polyucmp/hdd1/BIAN_Tieyuan/software/deepmd-v3.0.0b3/build/lib/python3.11/site-packages/deepmd/pt/model/model/make_model.py", line 203, in change_out_bias
self.atomic_model.change_out_bias(
File "/run/media/polyucmp/hdd1/BIAN_Tieyuan/software/deepmd-v3.0.0b3/build/lib/python3.11/site-packages/deepmd/pt/model/atomic_model/base_atomic_model.py", line 453, in change_out_bias
delta_bias, out_std = compute_output_stats(
^^^^^^^^^^^^^^^^^^^^^
File "/run/media/polyucmp/hdd1/BIAN_Tieyuan/software/deepmd-v3.0.0b3/build/lib/python3.11/site-packages/deepmd/pt/utils/stat.py", line 282, in compute_output_stats
model_pred = _compute_model_predict(sampled, keys, model_forward)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/run/media/polyucmp/hdd1/BIAN_Tieyuan/software/deepmd-v3.0.0b3/build/lib/python3.11/site-packages/deepmd/pt/utils/stat.py", line 173, in _compute_model_predict
sample_predict = model_forward_auto_batch_size(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/run/media/polyucmp/hdd1/BIAN_Tieyuan/software/deepmd-v3.0.0b3/build/lib/python3.11/site-packages/deepmd/pt/utils/stat.py", line 165, in model_forward_auto_batch_size
return auto_batch_size.execute_all(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/run/media/polyucmp/hdd1/BIAN_Tieyuan/software/deepmd-v3.0.0b3/build/lib/python3.11/site-packages/deepmd/pt/utils/auto_batch_size.py", line 153, in execute_all
r_list = [concate_result(r) for r in zip(*results)]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/run/media/polyucmp/hdd1/BIAN_Tieyuan/software/deepmd-v3.0.0b3/build/lib/python3.11/site-packages/deepmd/pt/utils/auto_batch_size.py", line 153, in <listcomp>
r_list = [concate_result(r) for r in zip(*results)]
^^^^^^^^^^^^^^^^^
File "/run/media/polyucmp/hdd1/BIAN_Tieyuan/software/deepmd-v3.0.0b3/build/lib/python3.11/site-packages/deepmd/pt/utils/auto_batch_size.py", line 149, in concate_result
raise RuntimeError(f"Unexpected result type {type(r[0])}")
RuntimeError: Unexpected result type <class 'dict'>
Does this version of deepmd-kit have bugs?
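For readers following the traceback: the last frames show a helper that concatenates the per-batch pieces of each output (`concate_result(r) for r in zip(*results)`) and raises when the first piece has a type it does not recognize. The sketch below reproduces only that pattern. It is not DeePMD-kit's code: the function bodies are hypothetical and plain Python lists stand in for tensors.

```python
def concat_result(parts):
    """Join the per-batch pieces of one output (lists stand in for tensors)."""
    first = parts[0]
    if isinstance(first, list):
        return [x for part in parts for x in part]
    # Any other type, such as a dict, is rejected, as in the traceback above.
    raise RuntimeError(f"Unexpected result type {type(first)}")


def execute_all(batch_results):
    """Regroup per-batch tuples by output position, then concatenate each group."""
    return [concat_result(r) for r in zip(*batch_results)]


# Two batches, each returning a tuple of two list outputs: concatenation works.
ok = execute_all([([1, 2], [10, 20]), ([3], [30])])

# Two batches whose single output is a dict: the helper raises.
try:
    execute_all([({"energy": [1, 2]},), ({"energy": [3]},)])
    message = None
except RuntimeError as err:
    message = str(err)
```

In this sketch the failure depends only on the type of the per-batch return value, not on the data itself.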
Answered by BianTieyuan on Jul 31, 2024
Replies: 1 comment, 1 reply
I can reproduce it using a model under the example directory.
1 reply
Thanks for your reply. I fixed the inference batch size using
export DP_INFER_BATCH_SIZE=8192
and the error disappeared.
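If the command is launched from a script rather than an interactive shell, the same workaround can be applied by passing the variable in the child process environment. This is a minimal sketch: the helper name `change_bias_invocation` is made up for illustration, while the command line and the value 8192 are copied from this thread. A suitable value may differ on other hardware.

```python
import os
import shutil
import subprocess


def change_bias_invocation(batch_size):
    """Return (command, environment) for the change-bias call with a fixed batch size."""
    env = dict(os.environ)
    # Environment variable named in the DeePMD log output above.
    env["DP_INFER_BATCH_SIZE"] = str(batch_size)
    cmd = [
        "dp", "--pt", "change-bias", "OpenLAM_2.2.0_27heads_beta3.pt",
        "-s", "GST_GAP_22",
        "--model-branch", "Domains_SemiCond",
    ]
    return cmd, env


if __name__ == "__main__":
    cmd, env = change_bias_invocation(8192)
    if shutil.which("dp") is None:
        # dp is not installed here, so only show what would be executed.
        print("dp not found on PATH; would run:", " ".join(cmd))
    else:
        subprocess.run(cmd, env=env, check=True)
```

Setting the variable only for the child process keeps the rest of the session unaffected, unlike `export` in the shell.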