tokenizers.apply_chat_template with continue_final_message=True with trailing spaces in input
#35433
Open
System Info
As the title says, tokenizers.apply_chat_template fails for Llama-3.1-Instruct when the input has trailing spaces. If the last assistant message has a trailing space, such as
{'role': 'assistant', 'content': 'some text '}
and continue_final_message is True, it throws "ValueError: substring not found".
This is because the apply_chat_template function contains the line
rendered_chat = rendered_chat[: rendered_chat.rindex(final_message) + len(final_message)].rstrip()
but rendered_chat ends with "some text<|eot_id|>" while final_message still has the trailing space: "some text ". The rindex lookup therefore cannot find final_message in rendered_chat.
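The failure can be illustrated with plain strings, without loading a tokenizer or model. This is only a sketch: the rendered string below is a hand-written stand-in, and it assumes the chat template trims the message content before appending <|eot_id|>, as the description above suggests.

```python
# Plain-string stand-ins; no tokenizer or model is involved.
final_message = "some text "  # assistant content with a trailing space

# Assumption: the template trims the content before adding <|eot_id|>,
# so the rendered chat no longer contains the trailing space.
rendered_chat = (
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
    + final_message.strip()
    + "<|eot_id|>"
)

try:
    # Same expression as the line quoted from apply_chat_template.
    rendered_chat[: rendered_chat.rindex(final_message) + len(final_message)].rstrip()
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
```

str.rindex raises ValueError when the substring is absent, which matches the error reported here.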
Who can help?
@ArthurZucker @itazap
Reproduction
See above
Expected behavior
I expect apply_chat_template to be able to continue the final message after the trailing space instead of raising an error.
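One possible way to get that behavior is sketched below. The helper name truncate_to_final_message is hypothetical and this is not the library's actual fix; it only shows the idea of searching for the stripped message and then restoring the caller's trailing whitespace, under the same assumption that the template trims message content.

```python
def truncate_to_final_message(rendered_chat: str, final_message: str) -> str:
    """Hypothetical helper: cut rendered_chat right after the final message,
    tolerating templates that trim whitespace from the message content."""
    stripped = final_message.strip()
    # Still raises ValueError if the message text is genuinely absent.
    idx = rendered_chat.rindex(stripped)
    prefix = rendered_chat[: idx + len(stripped)]
    # Re-append the whitespace the caller put at the end of the message.
    trailing = final_message[len(final_message.rstrip()):]
    return prefix + trailing


# Hand-written stand-in for a rendered Llama-3.1 style chat.
rendered = "<|start_header_id|>assistant<|end_header_id|>\n\nsome text<|eot_id|>"
result = truncate_to_final_message(rendered, "some text ")
print(repr(result))
```

With this approach the returned prompt ends in "some text " and generation would continue after the trailing space.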