Allow passing batch_size via CLI #1025

Closed
Muennighoff opened this issue Jul 1, 2024 · 5 comments · Fixed by #1030


@Muennighoff
Contributor

I think it'd make sense to allow passing the batch_size kwarg when using the CLI (`mteb run ...`). Some missing results are due to OOMs (#1014)
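
(For illustration, the proposal amounts to an invocation along the lines of `mteb run -m <model> -t <task> --batch-size 16`; the flag name here is hypothetical, since no such option exists yet.)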

@isaac-chung
Collaborator

Yeah. And within the tasks / evaluators, batch_size could be passed along as a kwarg?

@KennethEnevoldsen
Contributor

KennethEnevoldsen commented Jul 2, 2024

That has been the approach so far. I would actually probably prefer having encode_kwargs as an argument, i.e. `MTEB(model, encode_kwargs={"batch_size": 16})`. batch_size seems very specific. This would also allow passing e.g. length normalization.

We can still add the batch_size argument to the CLI.
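
For concreteness, a minimal sketch of the proposed interface, assuming a SentenceTransformer-style model; `encode_kwargs` does not exist yet, so the exact signature and placement (constructor vs. `run`) are assumptions:

```python
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
evaluation = MTEB(tasks=["Banking77Classification"])

# Proposed: everything in encode_kwargs would be forwarded verbatim to
# the model's encode() calls, so batch size, normalization flags, etc.
# all travel through one dict instead of dedicated arguments.
results = evaluation.run(model, encode_kwargs={"batch_size": 16})
```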

@Muennighoff
Contributor Author

> Yeah. And within the tasks / evaluators, batch_size could be passed along as a kwarg?

As @KennethEnevoldsen said, you can already pass it to evaluation.run, e.g. here: https://github.com/ContextualAI/gritlm/blob/0cc9aeab83b90f2e22bcdd2b084d51507c624d95/evaluation/eval_mteb.py#L1206

Maybe it would help to have that in the docs / one of the README examples so people know 🤔
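
For reference, a minimal sketch of what the linked example does today, assuming a SentenceTransformer-style model (the task choice is arbitrary):

```python
from mteb import MTEB
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
evaluation = MTEB(tasks=["Banking77Classification"])

# batch_size is passed as a plain kwarg and forwarded through the
# task evaluators to the model's encode() calls.
evaluation.run(model, batch_size=16)
```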

@KennethEnevoldsen
Contributor

Yeah. I was actually hoping to transition away from the batch_size argument (essentially it is a model argument, but it is passed on to the task). I also don't think it is consistently implemented across all task types.

@KennethEnevoldsen
Contributor

I am more than happy to implement encode_kwargs if people think it is a good idea.
