流式响应

设置 stream: true 启用 Server-Sent Events (SSE) 流式输出,实现打字机效果。

Python 流式示例

stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "讲个故事"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Node.js 流式示例

const stream = await client.chat.completions.create({
    model: 'gpt-4o',
    messages: [{ role: 'user', content: '讲个故事' }],
    stream: true,
});

for await (const chunk of stream) {
    const content = chunk.choices[0]?.delta?.content;
    if (content) process.stdout.write(content);
}

curl 流式测试

curl https://api.bufferapi.com/v1/chat/completions \
  -H "Authorization: Bearer sk-your-key" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4o","messages":[{"role":"user","content":"Hi"}],"stream":true}'

SSE 数据格式

data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"你"},"index":0}]}
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"好"},"index":0}]}
data: {"id":"chatcmpl-xxx","choices":[{"delta":{"content":"!"},"index":0}]}
data: {"id":"chatcmpl-xxx","choices":[{"delta":{},"finish_reason":"stop","index":0}]}
data: [DONE]
提示:流式模式下 Token 计费与非流式完全一致。流式仅影响数据传输方式,不影响模型推理。