Model trained with ch_PP-OCRv4_rec_svtr_large.yml and exported fails with (InvalidArgument) Broadcast dimension mismatch when predict_rec.py predicts wide images #12440

Open
LMR2018 opened this issue May 25, 2024 · 5 comments

LMR2018 commented May 25, 2024

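I trained an OCR recognition model with ch_PP-OCRv4_rec_svtr_large.yml, and training ran normally.
Evaluation with python tools/eval.py -c configs/rec/PP-OCRv4/ch_PP-OCRv4_rec_svtr_large.yml also works.
Exporting with python tools/export_model.py -c configs/rec/PP-OCRv4/ch_PP-OCRv4_rec_svtr_large.yml succeeds as well.
With the exported model, python tools/infer/predict_rec.py correctly predicts single-line images that are not too wide.
However, when predict_rec.py predicts wider images, it fails with: (InvalidArgument) Broadcast dimension mismatch.
The same error occurs even for the wide images used for training and validation. How can this be solved?
Training config: image_shape: [3, 48, 320], max_text_length: &max_text_length 50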

`[2024/05/25 14:48:36] ppocr INFO: Traceback (most recent call last):
File "tools/infer/predict_rec.py", line 728, in main
rec_res, _ = text_recognizer(img_list)
File "tools/infer/predict_rec.py", line 675, in __call__
self.predictor.run()
ValueError: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 2148, 192] and the shape of Y = [1, 960, 192]. Received [2148] in X is not equal to [960] in Y at i:1.
[Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at ../paddle/phi/kernels/funcs/common_shape.h:86)
[operator < elementwise_add > error]

[2024/05/25 14:48:36] ppocr INFO: (InvalidArgument) Broadcast dimension mismatch. Operands could not be broadcast together with the shape of X = [1, 2148, 192] and the shape of Y = [1, 960, 192]. Received [2148] in X is not equal to [960] in Y at i:1.
[Hint: Expected x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1 == true, but received x_dims_array[i] == y_dims_array[i] || x_dims_array[i] <= 1 || y_dims_array[i] <= 1:0 != true:1.] (at ../paddle/phi/kernels/funcs/common_shape.h:86)
[operator < elementwise_add > error]`
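
The two shapes in the error can be related with a little arithmetic. The sketch below assumes the backbone's patch embedding downsamples height and width by a factor of 4 each (an assumption, not something stated in this thread), so an H x W image becomes (H / 4) * (W / 4) tokens. Under that assumption, Y = 960 matches the training image_shape [3, 48, 320], while X = 2148 corresponds to an input 716 pixels wide. That would suggest a tensor sized for a fixed 48 x 320 input is being added to features from a wider image. `PATCH_STRIDE`, `num_tokens`, and `width_from_tokens` are illustrative names, not PaddleOCR APIs.

```python
# Assumption: the patch embedding reduces height and width by 4 each.
PATCH_STRIDE = 4


def num_tokens(height: int, width: int, stride: int = PATCH_STRIDE) -> int:
    """Number of tokens produced for an image of the given size."""
    return (height // stride) * (width // stride)


def width_from_tokens(tokens: int, height: int, stride: int = PATCH_STRIDE) -> int:
    """Invert num_tokens for a known height to recover the input width."""
    rows = height // stride
    if tokens % rows != 0:
        raise ValueError("token count is not consistent with this height/stride")
    return (tokens // rows) * stride


if __name__ == "__main__":
    # Shape Y in the log: tokens for the training shape [3, 48, 320].
    print(num_tokens(48, 320))
    # Shape X in the log: width implied by 2148 tokens at height 48.
    print(width_from_tokens(2148, 48))
```

If the assumption holds, any input wider than 320 pixels at height 48 would produce more than 960 tokens and trigger the same mismatch.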

LMR2018 (Author) commented May 25, 2024

[Screenshots: 0, Dingtalk_20240525170529]

GreatV added the bug label May 25, 2024
GreatV self-assigned this May 25, 2024
GreatV (Collaborator) commented May 25, 2024

Does inference work if you resize the image directly to [3, 48, 320]?
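
As a rough illustration of what "resize to [3, 48, 320]" can mean in practice, the sketch below computes the geometry for an aspect-preserving resize to height 48, with the width capped at 320 and the remainder padded on the right. It is only a sketch under stated assumptions: `fixed_shape_geometry` is a hypothetical helper, not a PaddleOCR function, and the thread does not confirm that the inference script preprocesses images this way.

```python
def fixed_shape_geometry(img_h: int, img_w: int,
                         target_h: int = 48, target_w: int = 320):
    """Return (resized_w, pad_right) for a fixed target_h x target_w input.

    The image is scaled to target_h while keeping its aspect ratio; the
    resulting width is capped at target_w, and narrower results are padded
    on the right so the final width is always target_w.
    """
    if img_h <= 0 or img_w <= 0:
        raise ValueError("image dimensions must be positive")
    # Integer ceiling of target_h * img_w / img_h (avoids float rounding).
    scaled_w = -(-target_h * img_w // img_h)
    resized_w = min(target_w, max(1, scaled_w))
    return resized_w, target_w - resized_w


if __name__ == "__main__":
    # Hypothetical image sizes given as (height, width).
    for h, w in [(48, 100), (100, 200), (40, 1000)]:
        print((h, w), fixed_shape_geometry(h, w))
```

With this scheme the network always sees a 48 x 320 input regardless of the original width; very wide lines are squeezed horizontally, which may affect accuracy.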

LMR2018 (Author) commented May 25, 2024

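> Does inference work if you resize the image directly to [3, 48, 320]?

I tried resizing directly to (320, 48) at the very start, and it does not work. The official code applies further processing afterwards, so the final image is not (320, 48, 3).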
[Screenshots: Dingtalk_20240525174942, Dingtalk_20240525175020]

LMR2018 (Author) commented May 25, 2024

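I also found another issue: the original images 28032 and 904118 above fail, but 58587, 579102, 583271, and 359190 work fine, so I suspect the problem is not that the width is too large. A model trained with the ch_PP-OCRv4_rec.yml config does not have this problem.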

nissansz commented Jun 2, 2024

ch_PP-OCRv4_rec.yml

Why can't I train with the ch_PP-OCRv4_rec.yml config? It runs out of GPU memory even with batch size = 1.
The ch_PP-OCRv4_rec.yml config is fine and can be trained.
