[playground][dogfooding] initial playground dogfooding check list #5373

Parker-Stafford · 2024-11-15T01:09:39Z

Datasets

(Tony + Xander)

test on large datasets
streaming
non streaming (ui side)
[stretch] test behind proxy with some timeout

Tools

(Roger + Parker)

tool switching for calls and schemas between providers

Tool calling

Does the playground allow you to simulate an actual tool call (user message with tool, assistant message with tool calls, tool message with results, proper response from llm)

tool calling for anthropic
tool calling for openai

Tool use

Does the playground allow you to add tools that can be appropriately picked out by an llm (add a tool with a relevant message)

tool use for anthropic
tool use for openai

Template application

mustache (datasets / normal)
fstring (datasets / normal)

Span replay

(Xander + Parker)

Structured output

Parker + Roger

Known issues / WIP

Playground spans

UI

Server

New Issues

Lower prio

[playground][instrumentation] strip out internal graphql / playground param from input.value in playground spans #5393

github-project-automation bot added this to phoenix Nov 15, 2024

Parker-Stafford mentioned this issue Nov 15, 2024

🗺 prompt playground #3435

Open

github-project-automation bot moved this to 📘 Todo in phoenix Nov 15, 2024

Parker-Stafford assigned axiomofjoy, RogerHYang, cephalization, anticorrelator and Parker-Stafford Nov 15, 2024

RogerHYang moved this from 📘 Todo to 👨‍💻 In progress in phoenix Nov 15, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[playground][dogfooding] initial playground dogfooding check list #5373

[playground][dogfooding] initial playground dogfooding check list #5373

Parker-Stafford commented Nov 15, 2024 •

edited

Loading

[playground][dogfooding] initial playground dogfooding check list #5373

[playground][dogfooding] initial playground dogfooding check list #5373

Comments

Parker-Stafford commented Nov 15, 2024 • edited Loading

Datasets

Tools

Tool calling

Tool use

Template application

Span replay

Structured output

Known issues / WIP

Playground spans

UI

Server

New Issues

Lower prio

Parker-Stafford commented Nov 15, 2024 •

edited

Loading