fix intermediate_size for qwen model loader.
guocuimi committed Dec 27, 2023
1 parent 793e664 commit 358bfbb
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion src/models/huggingface/qwen.h
@@ -28,7 +28,9 @@ class QWenMLPImpl : public torch::nn::Module {
 GCHECK(act_ != nullptr);
 
 const int64_t hidden_size = args.hidden_size();
-const int64_t intermediate_size = args.intermediate_size();
+// the intermediate size is half of the size from the config
+// ref: https://huggingface.co/Qwen/Qwen-7B/blob/main/modeling_qwen.py#L562
+const int64_t intermediate_size = args.intermediate_size() / 2;
 
 // register the weight parameter
 w1_w2_proj_ = register_module("gate_up_proj",
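
The change above can be read as a shape rule: each of the two input projections of the MLP gets half of the configured intermediate size. The following is a minimal sketch of that rule, not the code from qwen.h. `ModelArgsSketch`, `QWenMlpShapes`, and `qwen_mlp_shapes` are hypothetical names introduced for illustration, and the fused output width is an assumption based on the `gate_up_proj` module name in the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the loader's model arguments; the real class in
// qwen.h exposes hidden_size() and intermediate_size() accessors instead.
struct ModelArgsSketch {
  int64_t hidden_size;
  int64_t intermediate_size;  // value as written in the checkpoint config
};

// Layer widths implied by the fix (illustrative names, not from the repo).
struct QWenMlpShapes {
  int64_t per_proj_out;  // output width of each of w1 and w2
  int64_t fused_out;     // output width if w1 and w2 share one fused matmul (assumption)
  int64_t down_proj_in;  // input width of the down projection
};

inline QWenMlpShapes qwen_mlp_shapes(const ModelArgsSketch& args) {
  // Reject values that cannot be split evenly between the two projections.
  if (args.intermediate_size <= 0 || args.intermediate_size % 2 != 0) {
    throw std::invalid_argument("intermediate_size must be positive and even");
  }
  // Same halving as the commit: config value / 2 per projection.
  const int64_t half = args.intermediate_size / 2;
  assert(2 * half == args.intermediate_size);
  return QWenMlpShapes{half, 2 * half, half};
}
```

With illustrative values `hidden_size = 4096` and `intermediate_size = 22016`, the helper yields 11008 per projection. Using the config value without halving would make each projection twice as wide as the checkpoint weights, which is the mismatch this commit addresses.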
