WIP - OpenAI Assistants Agent #4131

lspinheiro · 2024-11-11T06:53:16Z

Why are these changes needed?

Related issue number

Checks

I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
I've added tests (if relevant) corresponding to the changes introduced in this PR.
I've made sure all auto checks have passed.

lspinheiro · 2024-11-11T07:22:30Z

@ekzhu @jackgerrits, this is a very early draft. I have some questions before proceeding further.

What should we do about the model client? The chat completion client abstraction doesn't seem to fit well: its interface makes assumptions about message handling, whereas the Assistants API handles messages very differently, through threads (I have spent a lot of time trying to adapt it without success). I'm also not sure we can define a general interface for agent-like APIs. Should I even create a specific one in autogen_ext to abstract away the openai SDK? I'm not sure what the value would be, but it also feels like I'm adding an implementation without a proper standard/abstraction.
How do we want to handle file search, especially ingestion? It also seems like something we don't have a strong abstraction for. I'm not sure whether it should fit into how we will integrate RAG. I also don't know whether we want file ingestion to be part of the agent interaction API through on_messages, or just a separate method used in the agent setup before running the chat.
The next step for me is to map the tool calling to the autogen core framework. It looks like the Azure OpenAI API has an integration with Logic Apps to actually call functions as tools. Should this be a future functionality in autogen_ext?

ekzhu · 2024-11-11T20:17:50Z

Thanks. I think we can follow the design in the Core cookbook for the OpenAI assistant agent: https://microsoft.github.io/autogen/dev/user-guide/core-user-guide/cookbook/openai-assistant-agent.html. The API should be simple, without introducing additional abstractions on our side.

class OpenAIAssistantAgent:
  name: str
  description: str
  client: openai.AsyncClient
  assistant_id: str
  thread_id: str
  tools: List[Tools] | None = None
  code_interpreter: ... | None = None  # configuration class from OpenAI client
  file_search: ... | None = None  # configuration class from OpenAI client

We don't need to introduce additional abstractions because OpenAI Assistant is specific to the OpenAI and Azure OpenAI services, so we should stick with the official clients they provide. Furthermore, we shouldn't expect the agent to be the only interface to assistant features such as file search, as the application may also perform other functions such as file upload and thread management.
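To make the "no extra abstraction" point concrete, here is a minimal sketch of how such an agent could drive one assistant on one thread with the official client. The class name `AssistantAgentSketch` and the method `on_message` are hypothetical, not part of this PR. The client is duck-typed as `Any` so the sketch can be exercised with a stub; in practice it is assumed to be an `openai.AsyncClient`, and the `beta.threads` calls assume a recent 1.x release of the openai Python SDK.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class AssistantAgentSketch:
    """Hypothetical agent bound to one assistant and one thread."""

    name: str
    description: str
    client: Any  # assumed to be an openai.AsyncClient; duck-typed for this sketch
    assistant_id: str
    thread_id: str

    async def on_message(self, text: str) -> str:
        # 1. Append the user message to the existing thread.
        await self.client.beta.threads.messages.create(
            thread_id=self.thread_id, role="user", content=text
        )
        # 2. Run the assistant on the thread and wait for a terminal state.
        run = await self.client.beta.threads.runs.create_and_poll(
            thread_id=self.thread_id, assistant_id=self.assistant_id
        )
        if run.status != "completed":
            raise RuntimeError(f"run ended with status {run.status!r}")
        # 3. Read back the newest message, which is the assistant's reply.
        page = await self.client.beta.threads.messages.list(
            thread_id=self.thread_id, order="desc", limit=1
        )
        # Assumes the first content part of the reply is text.
        return page.data[0].content[0].text.value
```

Because the agent only holds a client, an assistant id and a thread id, the application stays free to use the same client directly for thread management or file upload.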

What should we do about the model client? The chat completion client abstraction doesn't seem to fit well: its interface makes assumptions about message handling, whereas the Assistants API handles messages very differently, through threads (I have spent a lot of time trying to adapt it without success). I'm also not sure we can define a general interface for agent-like APIs. Should I even create a specific one in autogen_ext to abstract away the openai SDK? I'm not sure what the value would be, but it also feels like I'm adding an implementation without a proper standard/abstraction.

Use the official openai client and do not introduce new abstractions on our side besides the new agent class.

How do we want to handle file search, especially ingestion? It also seems like something we don't have a strong abstraction for. I'm not sure whether it should fit into how we will integrate RAG. I also don't know whether we want file ingestion to be part of the agent interaction API through on_messages, or just a separate method used in the agent setup before running the chat.

This should mostly be done with the official openai client in the user's application. We can potentially add new assistant tools that use the client.
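As an illustration of ingestion living in the application rather than in the agent, the sketch below uploads a file with the official client and attaches it to a new thread message for file search. The helper name `attach_file_to_thread` is hypothetical. The client is again duck-typed and assumed to be an `openai.AsyncClient`, and the `attachments` shape assumes the Assistants v2 message format.

```python
from pathlib import Path
from typing import Any


async def attach_file_to_thread(
    client: Any, thread_id: str, path: Path, question: str
) -> str:
    """Upload `path`, attach it to a new user message, and return the file id."""
    # Upload through the official client; purpose="assistants" makes the
    # file available to assistant tools.
    uploaded = await client.files.create(file=path, purpose="assistants")
    # Attach the file to the message so the file_search tool can use it.
    await client.beta.threads.messages.create(
        thread_id=thread_id,
        role="user",
        content=question,
        attachments=[{"file_id": uploaded.id, "tools": [{"type": "file_search"}]}],
    )
    return uploaded.id
```

This only has an effect if the assistant itself was created with the file search tool enabled, which is another setup step the application would own.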

The next step for me is to map the tool calling to the autogen core framework. It looks like the Azure OpenAI API has an integration with Logic Apps to actually call functions as tools. Should this be a future functionality in autogen_ext?

We should make sure we can use our Tool class tools in this new agent.
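One way to picture the tool mapping: when a run stops with status `requires_action`, the Assistants API hands back tool calls (an id, a function name, and JSON-encoded arguments), and the agent must return one output per call. The sketch below shows only that dispatch step. Plain Python callables stand in for the Tool class, and `run_tool_calls` is a hypothetical helper, not an autogen or openai API.

```python
import json
from types import SimpleNamespace
from typing import Any, Callable, Dict, List


def run_tool_calls(
    tool_calls: List[Any], tools: Dict[str, Callable[..., Any]]
) -> List[Dict[str, str]]:
    """Execute each requested call and build the tool_outputs payload."""
    outputs = []
    for call in tool_calls:
        func = tools.get(call.function.name)
        if func is None:
            # Report unknown tools back to the model instead of raising.
            result = f"Error: unknown tool {call.function.name!r}"
        else:
            # Arguments arrive as a JSON string.
            args = json.loads(call.function.arguments or "{}")
            result = func(**args)
        outputs.append({"tool_call_id": call.id, "output": str(result)})
    return outputs


def add(a: int, b: int) -> int:
    return a + b


# A stand-in for one entry of run.required_action.submit_tool_outputs.tool_calls.
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="add", arguments='{"a": 2, "b": 3}'),
)
print(run_tool_calls([fake_call], {"add": add}))
# prints [{'tool_call_id': 'call_1', 'output': '5'}]
```

In a real agent the resulting list would be passed to the client's submit-tool-outputs call for the run, and the callables would be replaced by whatever execution method the Tool class exposes.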

Overall, the goal is to bring OpenAI assistant agents into our ecosystem, not to build a new wrapper around the Assistants API.

lpinheiroms and others added 4 commits October 29, 2024 18:59

initial assistant client draft

e35b407

expose assistants client

29ddbc1

initial openai assistant agentchat draft

c75cb31

Merge branch 'main' into lpinheiro/feat/add-openai-assistants-agent

8798a75

update file search

41219af
