How to get the incomplete/truncated response in json_mode, instead of (or in addition to) LengthFinishReasonError, when the max_tokens output limit is reached?
#28932
Checked other resources
I added a very descriptive title to this question.
I searched the LangChain documentation with the integrated search.
I used the GitHub search to find a similar question and didn't find it.
Commit to Help
I commit to help with one of those options 👆
Example Code
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o", max_tokens=10)

prompt = [
    ("system", "You are a helpful assistant designed to output JSON. Respond to all question a json of the format {{winner: ..., info: ..., description: ...}}"),
    ("user", "{input}"),
]
prompt = ChatPromptTemplate.from_messages(prompt)

llm = llm.bind(response_format={"type": "json_object"})

agent = prompt | llm
agent.invoke({"input": "Who won the world series in 2020?"})
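Running this raises LengthFinishReasonError as soon as the 10-token limit is hit. As a stopgap, here is a minimal sketch of how the partial output might be recovered by catching the error, assuming the installed openai SDK exports LengthFinishReasonError at the top level and attaches the raw (truncated) completion to it as .completion (true for recent 1.x versions; verify against yours):

from openai import LengthFinishReasonError

try:
    result = agent.invoke({"input": "Who won the world series in 2020?"})
except LengthFinishReasonError as e:
    # assumption: the SDK stores the raw truncated ChatCompletion on the error
    partial_text = e.completion.choices[0].message.content
    print("partial output:", partial_text)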
Description
I am trying to retrieve the truncated response from the completion when the output token limit is reached in json_mode.
I expect to receive the part that was generated before the limit was reached; instead I get a LengthFinishReasonError and no response at all.
In older versions of langchain and langchain-openai the behavior was different: instead of an error, the truncated response was returned with finish_reason: 'length' in the response metadata.
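For comparison, this is roughly the check the old behavior allowed; response_metadata is the standard field on the returned AIMessage, and the finish_reason key is what older versions populated, as described above:

result = agent.invoke({"input": "Who won the world series in 2020?"})
if result.response_metadata.get("finish_reason") == "length":
    # output was cut off at max_tokens; result.content holds the truncated JSON
    print("truncated:", result.content)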