Update SWE Readme (#267)
### **User description**
Update to new flows
___

### **PR Type**
Documentation
___

### **Description**
- Updated the README to provide a comprehensive overview of the
Composio-swe framework.
- Revised the dependencies and installation steps for clarity.
- Added a new "Getting started" section to guide users on scaffolding a
new agent.
- Included instructions for adding new local tools and shell tools.
kaavee315 authored Jul 9, 2024
1 parent 90a3175 commit 9e8deeb
Showing 3 changed files with 132 additions and 35 deletions.
44 changes: 44 additions & 0 deletions python/swe/DEVELOPMENT.md
@@ -0,0 +1,44 @@
# Extending Functionality of the SWE Agent

This guide outlines the process for enhancing the SWE agent's capabilities by adding new tools or extending existing ones.

> **Important**: Always run the SWE agent with `COMPOSIO_DEV_MODE=1` when adding new tools to ensure changes are reflected within the Docker container.

## Adding a New Tool

To incorporate a new local tool into the agent's toolkit:

1. Consult the [Local Tool documentation](https://docs.composio.dev/sdk/python/local_tools) for detailed instructions.
2. Follow the guidelines to integrate your tool seamlessly with the existing framework.

## Implementing a New Shell Tool

Shell tools are crucial for executing commands within the agent's environment. Here's how to add a new shell tool:

### Key Features of Shell Sessions

The agent supports multiple shell sessions, enabling:

1. Dynamic creation of shell sessions
2. Automatic use of the most recent active session
3. Persistence of session-specific environments
4. Seamless switching between sessions
5. Efficient multi-tasking and context management
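
The session behaviour listed above can be pictured with a small toy model. This is only a sketch under stated assumptions: `ShellSession` and `SessionRegistry` are hypothetical names invented for illustration and are not part of the Composio API. It shows sessions being created on demand, each keeping its own environment, with the most recently used session picked when no id is given.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ShellSession:
    """Toy stand-in for a shell session: an id plus its own environment."""

    session_id: str
    env: Dict[str, str] = field(default_factory=dict)


class SessionRegistry:
    """Hypothetical registry for illustration; not the Composio implementation."""

    def __init__(self) -> None:
        self._sessions: Dict[str, ShellSession] = {}
        self._recent: Optional[str] = None
        self._counter = itertools.count(1)

    def create(self) -> ShellSession:
        """Create a new session and mark it as the most recent one."""
        session = ShellSession(session_id=f"shell-{next(self._counter)}")
        self._sessions[session.session_id] = session
        self._recent = session.session_id
        return session

    def get(self, session_id: Optional[str] = None) -> ShellSession:
        """Return the named session, or the most recent one if no id is given."""
        if session_id is None:
            if self._recent is None:
                return self.create()
            session_id = self._recent
        session = self._sessions[session_id]
        self._recent = session_id  # switching makes this session the recent one
        return session
```

In this model, calling `get()` with no argument after switching to an older session returns that older session, and anything stored in its `env` is still there.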

### Implementation Steps

For tools that need to execute in the active shell session (e.g., `git` commands, bash commands):

1. Implement the following classes:

- `ShellRequest`
- `ShellExecResponse`
- `BaseExecCommand`

2. Utilize the `exec_cmd` function to execute commands within the shell environment.
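
The overall shape of these steps can be sketched as follows. The classes and the `exec_cmd` function below are self-contained stand-ins that only borrow the names from this guide; their fields, method names, and signatures are assumptions and will differ from the real Composio definitions, so check the linked example before copying anything. In particular, this stand-in `exec_cmd` runs each command in a fresh subprocess, whereas the real helper executes within the agent's shell environment.

```python
import subprocess
from dataclasses import dataclass
from typing import Optional


@dataclass
class ShellRequest:
    """Stand-in request: which session to run in (None = most recent)."""

    shell_id: Optional[str] = None


@dataclass
class ShellExecResponse:
    """Stand-in response carrying the command's captured output."""

    stdout: str = ""
    stderr: str = ""
    exit_code: int = 0


def exec_cmd(cmd: str) -> ShellExecResponse:
    """Stand-in for exec_cmd: run the command in a fresh subprocess."""
    done = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return ShellExecResponse(done.stdout, done.stderr, done.returncode)


class BaseExecCommand:
    """Stand-in base class: subclasses build a command, the base runs it."""

    def build_command(self, request: ShellRequest) -> str:
        raise NotImplementedError

    def execute(self, request: ShellRequest) -> ShellExecResponse:
        return exec_cmd(self.build_command(request))


class EchoGreeting(BaseExecCommand):
    """Hypothetical tool that prints a fixed greeting."""

    def build_command(self, request: ShellRequest) -> str:
        return "echo hello"
```

Calling `EchoGreeting().execute(ShellRequest())` returns a response object whose `stdout` holds the command output, which is the pattern a real shell tool follows: a request type, a response type, and an action that delegates to `exec_cmd`.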

### Example Implementation

For a practical example of implementing a shell tool, refer to the [Git Patch Tool](https://github.com/ComposioHQ/composio/blob/master/python/composio/tools/local/shelltool/git_cmds/actions/get_patch.py). This example demonstrates how to structure your tool and integrate it with the agent's shell capabilities.

By following these guidelines, you can effectively extend the SWE agent's functionality to suit your specific development needs.
111 changes: 77 additions & 34 deletions python/swe/README.md
@@ -1,50 +1,93 @@
# README for swe.py
# SWE Development Kit

## Table of Contents

- [SWE Development Kit](#swe-development-kit)
- [Table of Contents](#table-of-contents)
- [Overview](#overview)
- [Dependencies](#dependencies)
- [Getting Started](#getting-started)
- [Creating a new agent](#creating-a-new-agent)
- [Docker Environment](#docker-environment)
- [Running the Benchmark](#running-the-benchmark)

## Overview

The `swe.py` script is part of the Composio software engineering (SWE) agent framework.
It is designed to automate tasks related to software development, including issue resolution, code reviews, and patch submissions using AI-driven agents.
`Composio SWE` is a framework for building SWE agents on top of the Composio tooling ecosystem. Composio-SWE allows you to:

- Scaffold agents that work out of the box with the agentic framework of your choice (`crewai`, `llamaindex`, etc.)
- Add or optimise your agent's abilities with tools
- Benchmark your agents against `SWE-bench`

## Dependencies

1. Docker Desktop should be installed.
2. Get the Github Access Token.
3. Install the dependencies using `pip install -r requirements.txt`.
4. Add the LLM configuration via `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` or (`AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_ENDPOINT`) environment variables.
5. Add the environment variable `export GITHUB_ACCESS_TOKEN = <git_access_token>`.
6. If you want to use Helicone, add the environment variable `export HELICONE_API_KEY = <helicone_api_key>`.
Before getting started, ensure you have the following set up:

1. **Installation**:

```
pip install composio-swe composio-core
```

2. **Install the agentic framework of your choice and its Composio plugin**:
This example uses `crewai`:

```
pip install crewai composio-crewai
```

3. **GitHub Access Token**:

The agent requires a GitHub access token to work with your repositories. You can create one at https://github.com/settings/tokens with the necessary permissions and export it as an environment variable using `export GITHUB_ACCESS_TOKEN=<your_token>`.

4. **LLM Configuration**:
You also need to set up an API key for the LLM provider you plan to use. By default, agents scaffolded by `composio-swe` use the `openai` client, so export `OPENAI_API_KEY` before running your agent.
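
A quick preflight check can catch a missing variable before the agent starts. The snippet below is a minimal sketch: `missing_env_vars` and `REQUIRED_VARS` are hypothetical names introduced here, and the list assumes the default `openai` client. Adjust it if you use a different provider.

```python
import os
from typing import Iterable, List, Mapping

# Variables named in the steps above; assumes the default openai client.
REQUIRED_VARS = ("GITHUB_ACCESS_TOKEN", "OPENAI_API_KEY")


def missing_env_vars(
    required: Iterable[str] = REQUIRED_VARS,
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in required if not env.get(name, "").strip()]
```

Calling `missing_env_vars()` with no arguments inspects the current process environment; an empty list means both variables are set.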

## Getting Started

### Creating a new agent

1. Scaffold your agent using:

```
composio-swe scaffold crewai -o <path>
```

This creates a new agent in `<path>/agent` with four key files:

- `main.py`: Entry point to run the agent on your issue
- `agent.py`: Agent definition (edit this to customise behaviour)
- `prompts.py`: Agent prompts
- `benchmark.py`: SWE-Bench benchmark runner

## Usage
2. Run the agent:
```
cd agent
python main.py
```
You'll be prompted for the repository name and issue.

To change the script quickly:
### Docker Environment

1. Change the issue_config in swe_run.py
2. Run the script with `python swe_run.py`
The SWE-agent runs in Docker by default for security and isolation. This sandboxes the agent's operations, protecting against unintended consequences of arbitrary code execution.

To modify the agent and improve the agent's performance:
To run locally instead, modify `workspace_env` in `agent/agent.py`. Use caution, as this bypasses Docker's protective layer.

1. Modify the agent's code in swe.py
2. Run the script with `python swe_run.py`
### Running the Benchmark

## Implementing your own SWE-Agent
[SWE-Bench](https://www.swebench.com/) is a comprehensive benchmark designed to evaluate the performance of software engineering agents. It comprises a diverse collection of real-world issues from popular Python open-source projects, providing a robust testing environment.

1. Create a new class that inherits from `BaseSWEAgent`.
2. Implement the `__init__` method to initialize any dependencies that your agent requires and set the tools that your agent requires.
3. Implement the `solve_issue` method to define the logic for solving the issue. This involves the agentic logic to solve the issue.
4. For example, refer `crewai_agent.py` and `llama_agent.py` for implementing the agents.
5. For implementing the tools, refer `composio/local_tools/local_workspace/workspace/tool.py` for implementing the tools.
To run the benchmark:

## Running the benchmark
1. Ensure Docker is installed and running on your system.
2. Execute the following command:
```
cd agent
python benchmark.py --test-split=<test_split>
```
- By default, `python benchmark.py` runs only 1 test instance.
- Specify a test split ratio to run more tests, e.g., `--test-split=1:300` runs 300 tests.
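
If you want to validate a `--test-split` value before starting a long run, a small helper along these lines may be useful. It is a sketch, not part of `composio-swe`: `parse_test_split` and `MAX_TESTS` are hypothetical names, the cap of 300 is taken from the flag's help text and applied to the end bound as an assumption, and the helper deliberately makes no claim about whether the end bound is inclusive, since that is decided by `run_and_get_scores`.

```python
from typing import Tuple

# Upper bound quoted in the --test-split help text; applying it to the
# end bound is an assumption made for this sketch.
MAX_TESTS = 300


def parse_test_split(value: str) -> Tuple[int, int]:
    """Parse 'start:end' into two integers and reject malformed ranges."""
    start_text, sep, end_text = value.partition(":")
    if not sep:
        raise ValueError(f"expected 'start:end', got {value!r}")
    try:
        start, end = int(start_text), int(end_text)
    except ValueError:
        raise ValueError(f"bounds must be integers, got {value!r}") from None
    if start < 0 or end <= start:
        raise ValueError(f"need 0 <= start < end, got {value!r}")
    if end > MAX_TESTS:
        raise ValueError(f"end bound exceeds {MAX_TESTS}, got {value!r}")
    return start, end
```

The helper raises `ValueError` for inputs such as `5:5`, `abc`, or `1:301`, so a typo fails fast instead of surfacing midway through the benchmark.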

1. Find the benchmark at `python/swe/benchmark`.
2. To run the benchmark, run `python run_evaluation.py`.
3. This will run the SWE-Bench (https://www.swebench.com/) benchmark for the agent. You need to init your agent inside the run_evaluation.py file.
4. Flags:
1. `--test_split`: The test split range (e.g., 1:10).
2. `--print_only`: Print the issues only.
3. `--include_hints`: Include hints in the issue description.
**Note**: We utilize [SWE-Bench-Docker](https://github.com/aorwall/SWE-bench-docker) to ensure each test instance runs in an isolated container with its specific environment and Python version.

### Run Evaluation for the benchmark changes
1. cd ~/composio/python/swe/benchmark
2. ./complete_eval_workflow.sh <logs-path> princeton-nlp/SWE-bench_Lite
"logs-path" = ~/.composio_coder/logs
To extend the functionality of the SWE agent by adding new tools or extending existing ones, refer to the [Development Guide](DEVELOPMENT.md).
@@ -1,6 +1,7 @@
from agent import composio_toolset, crew
from composio_swe.benchmark.run_evaluation import run_and_get_scores
from composio_swe.config.store import IssueConfig
import argparse


def bench(workspace_id: str, issue_config: IssueConfig) -> str:
@@ -19,4 +20,13 @@ def bench(workspace_id: str, issue_config: IssueConfig) -> str:


if __name__ == "__main__":
    run_and_get_scores(bench, test_split="21:22")
    parser = argparse.ArgumentParser(description="Run benchmark on the agent.")
    parser.add_argument(
        "--test-split",
        type=str,
        default="1:2",
        help="Test split ratio (e.g. 1:2, 1:300) Maximum 300 tests per project.",
    )
    args = parser.parse_args()

    run_and_get_scores(bench, test_split=args.test_split)
