
Unable to run model.generate() for MoD model #4063

Open
1 task done
Zkli-hub opened this issue Jun 4, 2024 · 3 comments
Labels
pending (This problem is yet to be addressed)

Comments

@Zkli-hub

Zkli-hub commented Jun 4, 2024

Reminder

  • I have read the README and searched the existing issues.

System Info

I find that I cannot run the generate() function to run inference with the converted model. Can you help me?

Here is the error:

Reproduction

from transformers import AutoTokenizer, LlamaForCausalLM  # LlamaForCausalLM is imported but never used

# NOTE: LlamaMoDForCausalLM is never imported in this snippet; it is not a
# transformers class and presumably has to come from the MoD package.
model = LlamaMoDForCausalLM.from_pretrained("LLaMA-Factory/saves/llama2-7b-mod/full/sft_full_0")
tokenizer = AutoTokenizer.from_pretrained("LLaMA-Factory/saves/llama2-7b-mod/full/sft_full_0")

prompt = "Hey, are you conscious? Can you talk to me?"
inputs = tokenizer(prompt, return_tensors="pt")

# generate() is the call that fails for the MoD model
generate_ids = model.generate(inputs.input_ids)
tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
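For reference, LLaMA-Factory loads MoD checkpoints through the optional MoD package (its "mod" extra, i.e. mixture-of-depth) rather than through plain transformers classes, so a loading path along these lines may be closer to the supported one. The AutoMoDModelForCausalLM name and the import are assumptions based on that dependency, not verified API:

```python
# Hedged sketch: assumes the MoD package (pip install mixture-of-depth) exposes
# an AutoMoDModelForCausalLM entry point; both names are assumptions here.
from transformers import AutoTokenizer
from MoD import AutoMoDModelForCausalLM

path = "LLaMA-Factory/saves/llama2-7b-mod/full/sft_full_0"
model = AutoMoDModelForCausalLM.from_pretrained(path)
tokenizer = AutoTokenizer.from_pretrained(path)

inputs = tokenizer("Hey, are you conscious? Can you talk to me?", return_tensors="pt")
generate_ids = model.generate(inputs.input_ids, max_new_tokens=64)
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True)[0])
```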

Expected behavior

Run inference with the trained MoD model (SFT based on llama2_mod).
[screenshot of the error attached in the original issue]

Others

No response

@hiyouga added the pending (This problem is yet to be addressed) label on Jun 5, 2024
@hiyouga
Owner

hiyouga commented Jun 5, 2024

cc: @mlinmg

@mlinmg
Contributor

mlinmg commented Jun 10, 2024

I'll try to reproduce it in the coming days; please send your transformers and MoD package versions.
Also, what model did you start with? You said llama2_mod, but I can't find it on HF.
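A quick way to collect those versions (the mixture-of-depth distribution name is an assumption; substitute whatever name the MoD package was installed under):

```python
import importlib.metadata
import transformers

print("transformers:", transformers.__version__)
# "mixture-of-depth" is an assumed distribution name; adjust if installed differently.
print("MoD:", importlib.metadata.version("mixture-of-depth"))
```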

@PhoebusSi

On an NPU (Ascend 910), DeepSpeed and the MoD method do not seem to be compatible. DeepSpeed with a non-MoD model runs, and a small MoD model without DeepSpeed also runs (so larger MoD models OOM because DeepSpeed cannot be used), but DeepSpeed plus a large MoD model does not run: it hangs at the first iteration and after a while times out with a broken pipe.

@mlinmg @hiyouga could you advise? Alternatively, is there another parallelization approach that is known to work with MoD?
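One generic thing to try while debugging the hang is a minimal ZeRO-2 DeepSpeed config, since stage-3 parameter partitioning is more likely to interact badly with MoD's dynamic token routing. This is only a sketch of a config one might pass to the HF Trainer, not a confirmed fix for the NPU issue:

```python
# Generic sketch, not a confirmed fix: a minimal ZeRO-2 config (stage 2 keeps
# full parameters on every rank, avoiding the stage-3 gather/partition hooks
# that dynamic-routing models are more likely to trip over).
ds_config = {
    "train_micro_batch_size_per_gpu": "auto",
    "gradient_accumulation_steps": "auto",
    "zero_optimization": {
        "stage": 2,
        "overlap_comm": True,
        "contiguous_gradients": True,
    },
    "bf16": {"enabled": "auto"},
}
```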
