Commit

Use real_batch_size
kdamaszk committed Dec 12, 2024
1 parent 6d428f0 commit c957f3b
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion vllm/worker/hpu_enc_dec_model_runner.py
@@ -336,7 +336,7 @@ def _prepare_encoder_model_input_tensors(
         real_batch_size = len(seq_group_metadata_list)
         batch_size_padded = self.bucketing_ctx.get_padded_batch_size(
             real_batch_size, is_prompt)
-        batch_size_padding = batch_size_padded - len(encoder_seq_lens)
+        batch_size_padding = batch_size_padded - real_batch_size
         if batch_size_padding > 0:
             encoder_seq_lens.extend(encoder_seq_lens[0]
                                     for _ in range(batch_size_padding))
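
The change derives the padding count from real_batch_size, the same value that was passed to get_padded_batch_size, so the two quantities stay consistent. Below is a minimal standalone sketch of that padding step, not the actual vLLM code. The helpers round_up_to_bucket and pad_encoder_seq_lens and the bucket list are hypothetical stand-ins introduced for illustration; the real bucketing logic lives in self.bucketing_ctx and may behave differently.

```python
from typing import List


def round_up_to_bucket(batch_size: int, buckets: List[int]) -> int:
    """Hypothetical stand-in for bucketing_ctx.get_padded_batch_size.

    Returns the smallest bucket that can hold batch_size.
    """
    for bucket in sorted(buckets):
        if bucket >= batch_size:
            return bucket
    raise ValueError(f"no bucket large enough for batch size {batch_size}")


def pad_encoder_seq_lens(encoder_seq_lens: List[int],
                         real_batch_size: int,
                         batch_size_padded: int) -> List[int]:
    """Pad by repeating the first entry, as the diff does.

    The padding count comes from real_batch_size (the post-commit
    behaviour), not from len(encoder_seq_lens).
    """
    batch_size_padding = batch_size_padded - real_batch_size
    padded = list(encoder_seq_lens)  # copy so the caller's list is untouched
    if batch_size_padding > 0:
        padded.extend(padded[0] for _ in range(batch_size_padding))
    return padded


# Assumed bucket sizes, chosen only for this example.
buckets = [1, 2, 4, 8]
encoder_seq_lens = [5, 7, 9]
real_batch_size = 3

batch_size_padded = round_up_to_bucket(real_batch_size, buckets)
padded_lens = pad_encoder_seq_lens(encoder_seq_lens, real_batch_size,
                                   batch_size_padded)
print(batch_size_padded, padded_lens)
```

In this sketch a real batch of three sequences is rounded up to the assumed bucket of four, and one copy of the first encoder length is appended as filler.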