
deepseek-llm-7b-chat fine-tuning error #171

Open
lzh123415 opened this issue Jun 18, 2024 · 1 comment

Comments

@lzh123415

transformers version: 4.41.2
Python version: 3.12
System: windows

    peft_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
        inference_mode=False,  # training mode
        r=8,  # LoRA rank
        lora_alpha=32,  # LoRA alpha; see the LoRA paper for how it scales the update
        lora_dropout=0.1  # dropout rate
    )
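For context, a config like this is typically passed to `peft.get_peft_model`. A minimal sketch of the expected wiring, assuming the base model is loaded with `AutoModelForCausalLM` (the loading call below is illustrative, not taken from the issue's script):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

# Same config as above; the base model must carry the causal-LM head
# so that forward(labels=...) returns a loss.
peft_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    inference_mode=False,
    r=8,
    lora_alpha=32,
    lora_dropout=0.1,
)

model = AutoModelForCausalLM.from_pretrained("deepseek-ai/deepseek-llm-7b-chat")
model = get_peft_model(model, peft_config)  # wraps target modules with LoRA adapters
model.print_trainable_parameters()
```
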

class ModifiedTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        # 7B
        # print(model)
        # print(inputs)
        return model(
            input_ids=inputs["input_ids"],
            labels=inputs["labels"]
        ).loss

    def save_model(self, output_dir=None, _internal_call=False):
        from transformers.trainer import TRAINING_ARGS_NAME

        os.makedirs(output_dir, exist_ok=True)
        torch.save(self.args, os.path.join(output_dir, TRAINING_ARGS_NAME))
        saved_params = {
            k: v.to("cuda:0") for k, v in self.model.named_parameters() if v.requires_grad
        }
        torch.save(saved_params, os.path.join(output_dir, "adapter_model.bin"))
    
    
    
The full error output follows:

2024-06-18 11:58:33-train-INFO: Loaded dataset from dataset/huanhuan.json successfully

log_name: log.log
2024-06-18 11:58:33-train-INFO: Starting LoRA training
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:08<00:00, 4.03s/it]
2024-06-18 11:58:41-train-INFO: Loaded model from deepseek-ai/deepseek-llm-7b-chat successfully
2024-06-18 11:59:20-train-INFO: Loaded LoRA parameters successfully
Found cached dataset generator (C:/Users/admin/.cache/huggingface/datasets/generator/default-d2f54e55ff33160c/0.0.0)
2024-06-18 11:59:20-train-INFO: Loaded dataset from dataset/huanhuan.json successfully
2024-06-18 11:59:20-train-INFO: Trainer loaded successfully
0%| | 0/1401 [00:00<?, ?it/s]Traceback (most recent call last):
File "D:\Pythonprojects\chat_test\deepseek_test2.py", line 376, in
main()
File "D:\Pythonprojects\chat_test\deepseek_test2.py", line 368, in main
trainer.train()
File "D:\Pythonprojects\chat_test\venv\Lib\site-packages\transformers\trainer.py", line 1885, in train
return inner_training_loop(
^^^^^^^^^^^^^^^^^^^^
File "D:\Pythonprojects\chat_test\venv\Lib\site-packages\transformers\trainer.py", line 2216, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Pythonprojects\chat_test\venv\Lib\site-packages\transformers\trainer.py", line 3238, in training_step
loss = self.compute_loss(model, inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Pythonprojects\chat_test\deepseek_test2.py", line 156, in compute_loss
return model(
^^^^^^
File "D:\Pythonprojects\chat_test\venv\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Pythonprojects\chat_test\venv\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Pythonprojects\chat_test\venv\Lib\site-packages\peft\peft_model.py", line 922, in forward
return self.base_model(
^^^^^^^^^^^^^^^^
File "D:\Pythonprojects\chat_test\venv\Lib\site-packages\torch\nn\modules\module.py", line 1532, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "D:\Pythonprojects\chat_test\venv\Lib\site-packages\torch\nn\modules\module.py", line 1541, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: LlamaModel.forward() got an unexpected keyword argument 'labels'
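This `TypeError` suggests the model being trained is a bare `LlamaModel` (the backbone, e.g. loaded via `AutoModel.from_pretrained`) rather than `LlamaForCausalLM` (via `AutoModelForCausalLM.from_pretrained`): only the causal-LM wrapper's `forward()` accepts a `labels` argument and computes a loss. A minimal pure-Python sketch of this failure mode, with hypothetical `Backbone`/`CausalLM` classes standing in for the transformers classes:

```python
class Backbone:
    """Stand-in for LlamaModel: forward() has no `labels` parameter."""
    def forward(self, input_ids):
        return input_ids  # pretend these are hidden states

class CausalLM:
    """Stand-in for LlamaForCausalLM: wraps the backbone and accepts labels."""
    def __init__(self):
        self.model = Backbone()

    def forward(self, input_ids, labels=None):
        hidden = self.model.forward(input_ids)
        # A real model would compute the LM loss from `labels` here.
        return hidden

# Passing labels to the bare backbone reproduces the reported TypeError.
try:
    Backbone().forward(input_ids=[1, 2], labels=[1, 2])
except TypeError as e:
    print(e)  # ... got an unexpected keyword argument 'labels'

# The causal-LM wrapper accepts the same call without error.
CausalLM().forward(input_ids=[1, 2], labels=[1, 2])
```

If the training script loads the model with `AutoModel`, switching to `AutoModelForCausalLM` before calling `get_peft_model` should resolve this error.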

@KMnO4-zx
Contributor

Does this code run on Linux or on the AutoDL platform? Since your environment is Windows, there may be some instability, so we cannot pinpoint the problem for certain.
You could try using the same environment as the tutorial~
