
Streaming output and structured output cannot be supported at the same time #1569

Open
balala8 opened this issue Dec 14, 2024 · 5 comments

@balala8

balala8 commented Dec 14, 2024

When structured output is required, and the raw string is returned directly without parsing, would it be possible to support streaming output as well?

@manthanguptaa
Contributor

@balala8 can you share your code so we can see what exactly you are trying to do?

@ysolanky
Contributor

Hello @balala8! Can you please also share your use case for streaming structured output? Disabling streaming for structured output is a conscious decision we made, so we're curious to know your intended use case.

@balala8
Author

balala8 commented Dec 23, 2024

@balala8 can you share your code so we can see what exactly you are trying to do?

Hello @balala8! Can you please also share your use case for streaming structured output? Disabling streaming for structured output is a conscious decision we made, so we're curious to know your intended use case.

Of course. I'm following OpenAI's example of supporting both structured output and streaming simultaneously. Here's my code, which is almost identical to OpenAI's implementation:

from typing import List
from pydantic import BaseModel
from openai import OpenAI

class EntitiesModel(BaseModel):
    attributes: List[str]
    colors: List[str]
    animals: List[str]

client = OpenAI()

with client.beta.chat.completions.stream(
    model="gpt-4o-2024-08-06",  # structured outputs require a gpt-4o model
    messages=[
        {"role": "system", "content": "Extract entities from the input text"},
        {
            "role": "user",
            "content": "The quick brown fox jumps over the lazy dog with piercing blue eyes",
        },
    ],
    response_format=EntitiesModel,
) as stream:
    for event in stream:
        if event.type == "content.delta":
            if event.parsed is not None:
                # Print the parsed data as it accumulates
                print("content.delta parsed:", event.parsed)
        elif event.type == "content.done":
            print("content.done")
        elif event.type == "error":
            print("Error in stream:", event.error)

    # Retrieve the final completion while the stream is still open
    final_completion = stream.get_final_completion()
    print("Final completion:", final_completion)

In my use case, I need the model to output a complex table. Since generating the full output takes a relatively long time, I want users to see the first row of content quickly rather than watching a loading progress bar until all tokens are generated.
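To illustrate the kind of progressive rendering I have in mind, here is a rough, self-contained sketch that parses partial JSON as chunks arrive by closing any unfinished brackets. The chunks are simulated, and try_parse_partial is a hypothetical helper written for this example, not an SDK function:

```python
import json

def try_parse_partial(buf: str):
    """Best-effort parse of incomplete JSON by appending the missing
    closing brackets. Deliberately naive: it assumes no braces or
    brackets appear inside string values."""
    trimmed = buf.rstrip()
    if trimmed.endswith(","):
        trimmed = trimmed[:-1]  # drop a dangling comma
    closers = ""
    for ch in trimmed:
        if ch == "{":
            closers = "}" + closers
        elif ch == "[":
            closers = "]" + closers
        elif ch in "}]":
            closers = closers[1:]  # a delimiter was closed in the input
    try:
        return json.loads(trimmed + closers)
    except json.JSONDecodeError:
        # e.g. the buffer ends in the middle of a string value
        return None

# Simulated content deltas; a real stream would yield these from the API.
chunks = ['{"colors": ["red"', ', "blue"', ']}']
buf = ""
for chunk in chunks:
    buf += chunk
    partial = try_parse_partial(buf)
    if partial is not None:
        print("partial:", partial)
```

With a table-shaped schema, the same idea would let completed rows render as soon as they close, which is exactly what disabling streaming prevents today.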

After reviewing phidata's code, I noticed that combining streaming with structured output is intentionally disabled, which prompted me to raise this issue. I also noticed that phidata's implementation of structured output doesn't use the OpenAI SDK's response_format parameter. I'm curious about two things:

  1. Does this affect the performance of the model?
  2. Does it increase the likelihood of failures compared to using OpenAI's SDK directly?
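To make the comparison behind question 2 concrete, here is a minimal sketch of the prompt-injection pattern that frameworks use when they skip response_format: embed the Pydantic model's JSON schema in the system prompt, then validate the reply afterwards. This assumes pydantic v2, and a hard-coded string stands in for the model's reply; it is an illustration of the general pattern, not phidata's actual implementation:

```python
import json
from typing import List
from pydantic import BaseModel, ValidationError

class EntitiesModel(BaseModel):
    attributes: List[str]
    colors: List[str]
    animals: List[str]

def build_system_prompt(model_cls: type) -> str:
    # Ask the model to emit JSON matching the schema derived from the
    # Pydantic class; with response_format the API enforces this instead.
    schema = json.dumps(model_cls.model_json_schema(), indent=2)
    return ("Extract entities from the input text. "
            "Reply with JSON matching this schema, and nothing else:\n" + schema)

# A hard-coded reply stands in for a real chat-completion call.
reply = '{"attributes": ["quick", "lazy"], "colors": ["brown"], "animals": ["fox", "dog"]}'
try:
    entities = EntitiesModel.model_validate_json(reply)
    print(entities.animals)  # ['fox', 'dog']
except ValidationError as exc:
    # Prompt-based structured output can fail here; constrained decoding
    # via response_format makes this branch far less likely.
    print("model returned invalid JSON:", exc)
```

The failure branch is the crux of my second question: with prompt injection the model can still emit malformed or schema-violating JSON, whereas response_format constrains generation itself.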

@manthanguptaa
Contributor

@balala8 can you share the agent config you built using Phidata? That would be more helpful for us

@balala8
Author

balala8 commented Dec 26, 2024

@balala8 can you share the agent config you built using Phidata? That would be more helpful for us

Here's my code using phidata. I believe the main advantage of streaming output is that when my output is very long, users can quickly see the first token's output. Meanwhile, I also need structured output. Therefore, streaming output and structured output should be supported simultaneously.

from typing import List
from pydantic import BaseModel
from phi.model.openai import OpenAIChat
from phi.agent import Agent, RunResponse

class EntitiesModel(BaseModel):
    attributes: List[str]
    colors: List[str]
    animals: List[str]

if __name__ == "__main__":
    agent = Agent(
        model=OpenAIChat(id='gpt-4o-mini'),
        response_model=EntitiesModel,
        structured_outputs=True,
        debug_mode=True,
    )
    response: RunResponse = agent.run("What are the colors of the rainbow?", stream=True)
    print(response)

Additionally, I noticed that phidata's implementation of structured output doesn't directly use OpenAI's response_format parameter. What are the differences between these two approaches, and would they affect performance? I would greatly appreciate your insights on this matter.
