End event contains wrong data when streaming structured output #7114

Open
5 tasks done
Stadly opened this issue Oct 30, 2024 · 4 comments · May be fixed by #7299
Labels
auto:bug Related to a bug, vulnerability, unexpected error with an existing feature

Comments

@Stadly
Contributor

Stadly commented Oct 30, 2024

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain.js documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain.js rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

import { ChatOpenAI } from "@langchain/openai";

const model = new ChatOpenAI({
  modelName: "gpt-4o",
  streaming: true,
  streamUsage: true,
})
  .withStructuredOutput(
    {
      title: "Joke",
      description: "Joke to tell user.",
      type: "object",
      properties: {
        setup: {
          type: "string",
          description: "The setup for the joke",
        },
        punchline: {
          type: "string",
          description: "The joke's punchline",
        },
      },
      required: ["setup", "punchline"],
      additionalProperties: false,
    },
    {
      strict: true,
      method: "jsonSchema",
    },
  )
  .withConfig({ runName: "joke" });

const eventStream = model.streamEvents(
  "Tell me a joke about cats",
  { version: "v2" },
  { includeNames: ["joke"] },
);
for await (const event of eventStream) {
  console.log(event);
}

Error Message and Stack Trace (if applicable)

{
  "event": "on_chain_end",
  "data": {
    "output": {
      "setup": "WhyWhy wasWhy was theWhy was the catWhy was the cat sittingWhy was the cat sitting onWhy was the cat sitting on theWhy was the cat sitting on the computerWhy was the cat sitting on the computer?Why was the cat sitting on the computer?Why was the cat sitting on the computer?Why was the cat sitting on the computer?Why was the cat sitting on the computer?Why was the cat sitting on the computer?Why was the cat sitting on the computer?Why was the cat sitting on the computer?Why was the cat sitting on the computer?Why was the cat sitting on the computer?Why was the cat sitting on the computer?Why was the cat sitting on the computer?Why was the cat sitting on the computer?",
      "punchline": "BecauseBecause itBecause it wantedBecause it wanted toBecause it wanted to keepBecause it wanted to keep anBecause it wanted to keep an eyeBecause it wanted to keep an eye onBecause it wanted to keep an eye on theBecause it wanted to keep an eye on the mouseBecause it wanted to keep an eye on the mouse!"
    }
  },
  "run_id": "6b7b38ba-d191-4278-a407-ce2287683205",
  "name": "joke",
  "tags": [],
  "metadata": {}
}

Description

The output in the end event is a concatenation of all the intermediate streamed values, not just the final result. If, for example, the streamed tokens are "Why", " was", " the", the end event contains "WhyWhy wasWhy was the" instead of "Why was the".
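
To make the symptom concrete, here is a minimal sketch that reproduces the duplicated string without calling LangChain at all. It assumes (this is not confirmed in the issue) that the structured-output stream yields cumulative snapshots of each field rather than independent deltas; the `partials` array is made up for illustration.

```typescript
// Assumption: each streamed value is a cumulative snapshot of the field so far,
// not an independent delta. These values are illustrative only.
const partials: string[] = ["Why", "Why was", "Why was the"];

// Treating snapshots as deltas and concatenating them reproduces the bug.
const concatenated: string = partials.reduce((acc, chunk) => acc + chunk, "");

// Keeping only the most recent snapshot gives the expected final value.
const lastSnapshot: string = partials[partials.length - 1];

console.log(concatenated); // "WhyWhy wasWhy was the"
console.log(lastSnapshot); // "Why was the"
```

The first value matches the shape of the "setup" field in the end event above; the second is what the end event would be expected to contain.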

System Info

  • node: v20.18.0
  • @langchain/community: 0.3.11
  • @langchain/core: 0.3.16
  • @langchain/openai: 0.3.11
@dosubot dosubot bot added the auto:bug Related to a bug, vulnerability, unexpected error with an existing feature label Oct 30, 2024

dosubot bot commented Oct 30, 2024

The issue you're experiencing with the on_chain_end event containing concatenated intermediary tokens instead of the final assembled output seems to be related to how the streaming and event handling are implemented in LangChain.js. The on_chain_end event is supposed to capture only the final output after all intermediary processing steps are completed [1][2].

To address this, you might want to ensure that the event handling logic correctly accumulates and processes the streamed tokens before emitting the on_chain_end event. This involves using a mechanism to buffer the tokens and assemble them into the final output before triggering the end event. The current behavior suggests that the tokens are being concatenated without proper handling of the final assembly.

If this is indeed a bug in the library, it might require a fix in the event handling logic to ensure that only the final, correctly assembled output is included in the on_chain_end event.
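
Until the library is fixed, one possible caller-side workaround is to ignore the output of the end event and keep the last streamed chunk instead. The sketch below is self-contained and does not import LangChain: `StreamEventLike`, `lastStreamedChunk`, and `fakeEvents` are hypothetical names, and the event shape (an `on_chain_stream` event carrying `data.chunk`) is an assumption about what `streamEvents` with `version: "v2"` emits for the "joke" run. It also assumes each chunk is a cumulative snapshot of the parsed object.

```typescript
// Minimal local stand-in for a streamed event (assumed shape, not the library type).
type StreamEventLike = {
  event: string;
  name: string;
  data: { chunk?: unknown; output?: unknown };
};

// Hypothetical helper: returns the last chunk streamed by the named run,
// ignoring whatever the end event reports as its output.
async function lastStreamedChunk(
  events: AsyncIterable<StreamEventLike>,
  runName: string,
): Promise<unknown> {
  let last: unknown;
  for await (const ev of events) {
    if (ev.event === "on_chain_stream" && ev.name === runName) {
      last = ev.data.chunk;
    }
  }
  return last;
}

// Fake event source standing in for model.streamEvents(...); values are illustrative.
async function* fakeEvents(): AsyncGenerator<StreamEventLike> {
  yield { event: "on_chain_stream", name: "joke", data: { chunk: { setup: "Why" } } };
  yield { event: "on_chain_stream", name: "joke", data: { chunk: { setup: "Why was the" } } };
  yield { event: "on_chain_end", name: "joke", data: { output: { setup: "WhyWhy was the" } } };
}

lastStreamedChunk(fakeEvents(), "joke").then((result) => {
  console.log(result);
});
```

In real use, the `eventStream` from the example code would be passed in place of `fakeEvents()`, provided the actual events match the assumed shape.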


@NoahCristino

Hello, we are a group of students at the University of Toronto looking into fixing this issue.

@Stadly
Contributor Author

Stadly commented Nov 23, 2024

Amazing, thanks!

@NoahCristino

I have traced this to the _streamIterator method of the RunnableSequence class in /src/runnables/base.ts. It looks like concatSupported is hardcoded to true and only the true case is implemented. I am almost finished adding support for concatSupported = false, which is what the ChatOpenAI model should be run with.
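
As a rough illustration of the two cases described above, the sketch below shows how a `concatSupported` flag could change the value that ends up in the end event. This is not the code in base.ts: `Joke`, `concatChunks`, and `finalOutput` are hypothetical names, and the chunks are assumed to be cumulative snapshots.

```typescript
type Joke = { setup?: string; punchline?: string };

// Hypothetical merge: concatenates string fields, as one would for delta chunks.
function concatChunks(a: Joke, b: Joke): Joke {
  return {
    setup: (a.setup ?? "") + (b.setup ?? ""),
    punchline: (a.punchline ?? "") + (b.punchline ?? ""),
  };
}

// Hypothetical aggregation of streamed chunks into the final output.
// concatSupported = true:  chunks are treated as deltas and merged.
// concatSupported = false: chunks are treated as snapshots and the last one wins.
function finalOutput(chunks: Joke[], concatSupported: boolean): Joke | undefined {
  let final: Joke | undefined;
  for (const chunk of chunks) {
    if (final === undefined || !concatSupported) {
      final = chunk;
    } else {
      final = concatChunks(final, chunk);
    }
  }
  return final;
}

// Illustrative cumulative snapshots.
const chunks: Joke[] = [
  { setup: "Why" },
  { setup: "Why was" },
  { setup: "Why was", punchline: "Because" },
];

console.log(finalOutput(chunks, true));  // setup is "WhyWhy wasWhy was" (the reported symptom)
console.log(finalOutput(chunks, false)); // the last snapshot, unchanged
```

Note that the snapshot repeated after the setup is complete is concatenated again in the `true` case, which matches the repeated full question at the end of the "setup" field in the reported output.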
