"L2 Agent" #37
It seems that L4 requires the bot to write a custom plugin at runtime to handle a novel task. This could be really interesting (and feasible) with automated CI checking. I suspect running CI on every commit might be quite slow, but it would be incredible to see it build, save, and install a new plugin for future runs, which the L2 described in this specification would then be able to invoke. If we could pull off L4 I'm sure we would go viral/trend in programmer news. Most of the infrastructure is in place, but making robust end-to-end CI tests seems like a month-long project.
I previously experimented with building an L2 agent using V1.
I agree: both very interesting and definitely feasible. A "simple" V1 is possible if we map safe commands and safe direct actions that will fire the intended plugins.

Concerns and questions for V2:
- Implementation strategy
- Operational flow
- Challenges
- Potential development paths
Intuitively, I believe that providing all the context and doing direct invocations (rather than writing out the slash command) is the best approach. However, this could get expensive because it would require the larger model and we would be using a lot of context. In that case we would probably need to rely on tagging the bot, which is not as interesting. I was under the impression that we have standardized payload interfaces for all of the plugins, and that we just need to understand the help menu of each plugin.
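To illustrate what I mean by direct invocation, here is a rough sketch; the interface and field names are assumptions, not the actual kernel types:

```typescript
// Hypothetical shape of a standardized plugin invocation payload; the real
// kernel interfaces may differ. The idea is that the LLM fills in `plugin`
// and `parameters` directly instead of emitting a slash-command string.
interface PluginInvocation {
  plugin: string;                      // e.g. "command-wallet"
  parameters: Record<string, unknown>; // checked against the plugin's declared schema
  requestedBy: string;                 // GitHub login of the commenter
}

// The kernel would validate the payload against the target plugin's manifest
// before dispatching, so a hallucinated parameter fails fast instead of firing.
function hasOnlyKnownParameters(invocation: PluginInvocation, schemaKeys: string[]): boolean {
  return Object.keys(invocation.parameters).every((key) => schemaKeys.includes(key));
}
```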
/start
! This task does not reflect a business priority at the moment. You may start tasks with one of the following labels: Priority: 3 (High), Priority: 4 (Urgent), Priority: 5 (Emergency)
/start |
Tip
/stop
! Adding a label to issue failed!
I was reading my friend's blog post and was inspired to think about AI systems in a more structured way. They have these AI "level" designations.
It would be interesting to make an L2 agent according to the definition in the blog post:
According to the blog, this is a stepping stone to L3, because L3 coordinates L2 and below.
We can make this a command interface where we can tag the bot and make requests in plain language:
@ubiquity-os give me the wallet address of @0x4007
In the above example, we should pass the entire help menu to ChatGPT and it can invoke the correct plugin based on the command description.
I think this should be quite straightforward to implement, and is a useful stepping stone towards a more advanced AI powered system.
We can use GPT-4o mini because this seems simple enough: the model just needs to read the help menu.
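To make the idea concrete, here is a minimal sketch of the routing step, assuming a `CommandHelp` shape and a `routeRequest` helper that do not exist yet (the real plugin manifest format may differ). The kernel would dispatch the returned command as if the user had typed it:

```typescript
import OpenAI from "openai";

// Hypothetical shape of a plugin's help entry; the real manifest format may differ.
interface CommandHelp {
  command: string;     // e.g. "/wallet"
  description: string; // e.g. "Look up a user's registered wallet address"
  example: string;     // e.g. "/wallet @0x4007"
}

const openai = new OpenAI();

// Pass the aggregated help menu plus the user's plain-language request to the model
// and get back the slash command to invoke, or null if nothing fits.
async function routeRequest(helpMenu: CommandHelp[], userComment: string): Promise<string | null> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content:
          "You map user requests to one of the available commands. " +
          "Reply with the exact command invocation, or NONE if no command fits.\n\n" +
          helpMenu.map((c) => `${c.command}: ${c.description} (e.g. ${c.example})`).join("\n"),
      },
      { role: "user", content: userComment },
    ],
  });

  const reply = completion.choices[0].message.content?.trim() ?? "NONE";
  return reply === "NONE" ? null : reply;
}
```

So the example above might come back as something like `/wallet @0x4007` (the command name here is illustrative), which then goes through the normal command path with all of its usual checks.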
Advanced Version
As a more advanced version of this plugin, we can listen for every comment (no bot tag required) and the bot can jump in to help whenever it thinks it can. For example, if somebody asks to be assigned to a task, perhaps the bot can invoke /start on behalf of that user (which inherits all of the existing checks, such as whether they are already assigned to too many other open tasks).
This makes the bot's presence much more pronounced, and it will truly feel like a helpful and proactive member of the team instead of "a tool" that must be specifically called upon for help.
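A rough sketch of that proactive listener, reusing the `routeRequest` helper from the earlier sketch; `loadAggregatedHelpMenu` and `dispatchCommand` are hypothetical helpers, not existing kernel APIs:

```typescript
// Assumed helpers (hypothetical names, not existing kernel APIs).
declare function loadAggregatedHelpMenu(): Promise<{ command: string; description: string; example: string }[]>;
declare function routeRequest(helpMenu: { command: string; description: string; example: string }[], text: string): Promise<string | null>;
declare function dispatchCommand(command: string, options: { onBehalfOf: string }): Promise<void>;

// Runs on every issue_comment event, with no bot tag required.
async function onIssueComment(event: { comment: { body: string; user: { login: string } } }): Promise<void> {
  const body = event.comment.body;

  // Explicit slash commands and direct bot mentions already go through the normal command path.
  if (body.startsWith("/") || body.includes("@ubiquity-os")) return;

  const helpMenu = await loadAggregatedHelpMenu();
  const command = await routeRequest(helpMenu, body);

  // Only act when the model maps the comment to a concrete command, e.g. turning
  // "can I work on this?" into "/start". All existing checks (assignment limits,
  // priority labels, etc.) still run inside the invoked plugin.
  if (command) {
    await dispatchCommand(command, { onBehalfOf: event.comment.user.login });
  }
}
```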
Remark
I suppose if it calls other plugins that themselves use LLMs (like conversation rewards, somehow), then technically this would be considered an L3-class system.