
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half' 华为910 命令行推理报错 #4622

apachemycat opened this issue Jun 30, 2024 · 1 comment
Labels
npu This problem is related to NPU devices pending This problem is yet to be addressed

Comments

@apachemycat

Reminder

  • I have read the README and searched the existing issues.

System Info

llamafactory-cli env

  • llamafactory version: 0.8.3.dev0
  • Platform: Linux-4.19.36-vhulk1907.1.0.h1438.eulerosv2r8.aarch64-aarch64-with-glibc2.34
  • Python version: 3.9.9
  • PyTorch version: 2.1.0 (NPU)
  • Transformers version: 4.42.3
  • Datasets version: 2.20.0
  • Accelerate version: 0.31.0
  • PEFT version: 0.11.1
  • TRL version: 0.9.4
  • NPU type: Ascend910B
  • CANN version: 8.0.RC1

Reproduction

File "/usr/local/lib/python3.9/site-packages/transformers/models/llama/modeling_llama.py", line 326, in forward
query_states = self.q_proj(hidden_states)
File "/usr/local/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/usr/local/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib64/python3.9/site-packages/torch/nn/modules/linear.py", line 114, in forward
return F.linear(input, self.weight, self.bias)
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'

Expected behavior

Inference should complete normally.

Others

Assistant: Exception in thread Thread-9:
Traceback (most recent call last):
File "/usr/lib64/python3.9/threading.py", line 973, in _bootstrap_inner
self.run()
File "/usr/lib64/python3.9/threading.py", line 910, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib64/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/usr/local/lib/python3.9/site-packages/transformers/generation/utils.py", line 1914, in generate
result = self._sample(
File "/usr/local/lib/python3.9/site-packages/transformers/generation/utils.py", line 2651, in _sample
outputs = self(
File "/usr/local/lib64/python3.9/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)

@github-actions github-actions bot added the pending This problem is yet to be addressed label Jun 30, 2024
@hiyouga hiyouga added the npu This problem is related to NPU devices label Jun 30, 2024
@MengqingCao
Contributor

Did you set ASCEND_RT_VISIBLE_DEVICES to specify the device? From this error, the model appears to be running on the CPU, and the CPU does not seem to support fp16.

See THUDM/ChatGLM3#177
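
A minimal sketch of the suggested check, assuming the Ascend torch_npu adapter is installed alongside CANN (the device index "0" is just an example):

import os
# Make at least one NPU visible before torch_npu initializes;
# without this, the model can silently fall back to the CPU.
os.environ.setdefault("ASCEND_RT_VISIBLE_DEVICES", "0")

import torch
import torch_npu  # Ascend adapter that registers the torch.npu backend

print(torch.npu.is_available())   # should print True; False means CPU fallback
print(torch.npu.device_count())   # number of NPUs visible to this process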
