Update SWE Readme (#267)

### **User description**
Update to new flows

___

### **PR Type**
Documentation

___

### **Description**
- Updated the README to provide a comprehensive overview of the Composio-swe framework.
- Revised the dependencies and installation steps for clarity.
- Added a new "Getting started" section to guide users on scaffolding a new agent.
- Included instructions for adding new local tools and shell tools.

ComposioHQ · Jul 9, 2024 · 9e8deeb · 9e8deeb
1 parent 90a3175
commit 9e8deeb
Showing 3 changed files with 132 additions and 35 deletions.
diff --git a/python/swe/DEVELOPMENT.md b/python/swe/DEVELOPMENT.md
@@ -0,0 +1,44 @@
+# Extending Functionality of the SWE Agent
+
+This guide outlines the process for enhancing the SWE agent's capabilities by adding new tools or extending existing ones.
+
+> **Important**: Always run the SWE agent with `COMPOSIO_DEV_MODE=1` when adding new tools to ensure changes are reflected within the Docker container.
+
+## Adding a New Tool
+
+To incorporate a new local tool into the agent's toolkit:
+
+1. Consult the [Local Tool documentation](https://docs.composio.dev/sdk/python/local_tools) for detailed instructions.
+2. Follow the guidelines to integrate your tool seamlessly with the existing framework.
+
+## Implementing a New Shell Tool
+
+Shell tools are crucial for executing commands within the agent's environment. Here's how to add a new shell tool:
+
+### Key Features of Shell Sessions
+
+The agent supports multiple shell sessions, enabling:
+
+1. Dynamic creation of shell sessions
+2. Automatic use of the most recent active session
+3. Persistence of session-specific environments
+4. Seamless switching between sessions
+5. Efficient multi-tasking and context management
+
+### Implementation Steps
+
+For tools that need to execute in the active shell session (e.g., `git` commands, bash commands):
+
+1. Implement the following classes:
+
+   - `ShellRequest`
+   - `ShellExecResponse`
+   - `BaseExecCommand`
+
+2. Utilize the `exec_cmd` function to execute commands within the shell environment.
+
+### Example Implementation
+
+For a practical example of implementing a shell tool, refer to the [Git Patch Tool](https://github.com/ComposioHQ/composio/blob/master/python/composio/tools/local/shelltool/git_cmds/actions/get_patch.py). This example demonstrates how to structure your tool and integrate it with the agent's shell capabilities.
+
+By following these guidelines, you can effectively extend the SWE agent's functionality to suit your specific development needs.
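
The session behaviour listed under "Key Features of Shell Sessions" (dynamic creation, falling back to the most recent session, per-session environments) can be pictured with a small standalone model. This is only a sketch under stated assumptions: `ShellSession`, `SessionRegistry`, and `run` are hypothetical names invented for illustration, not the Composio `ShellRequest`, `ShellExecResponse`, `BaseExecCommand`, or `exec_cmd` APIs, whose signatures this diff does not show. It also assumes a POSIX `sh` is available.

```python
import os
import subprocess
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ShellSession:
    """Hypothetical session: an id plus its own environment overrides."""

    session_id: str
    env: Dict[str, str] = field(default_factory=dict)
    cwd: Optional[str] = None


class SessionRegistry:
    """Hypothetical registry that remembers the most recently used session."""

    def __init__(self) -> None:
        self._sessions: Dict[str, ShellSession] = {}
        self._recent: Optional[str] = None

    def create(self, session_id: str) -> ShellSession:
        session = ShellSession(session_id)
        self._sessions[session_id] = session
        self._recent = session_id  # a newly created session becomes the active one
        return session

    def get(self, session_id: Optional[str] = None) -> ShellSession:
        # Fall back to the most recently used session when no id is given.
        key = session_id if session_id is not None else self._recent
        if key is None or key not in self._sessions:
            raise KeyError(f"no such shell session: {key!r}")
        self._recent = key  # selecting a session makes it the active one
        return self._sessions[key]

    def run(self, cmd: str, session_id: Optional[str] = None) -> str:
        """Run `cmd` with the chosen session's environment and return stdout."""
        session = self.get(session_id)
        result = subprocess.run(
            ["sh", "-c", cmd],
            env={**os.environ, **session.env},
            cwd=session.cwd,
            capture_output=True,
            text=True,
            check=False,
        )
        return result.stdout
```

In this model, a call such as `registry.run("git status")` with no session id goes to whichever session was created or selected last, while passing an explicit id switches the active session. Each command runs in a fresh process, so only values stored in `session.env` carry over between commands; a real shell session would also keep shell state such as the current directory.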
diff --git a/python/swe/README.md b/python/swe/README.md
@@ -1,50 +1,93 @@
-# README for swe.py
+# SWE Development Kit
+
+## Table of Contents
+
+- [SWE Development Kit](#swe-development-kit)
+  - [Table of Contents](#table-of-contents)
+  - [Overview](#overview)
+  - [Dependencies](#dependencies)
+  - [Getting Started](#getting-started)
+    - [Creating a new agent](#creating-a-new-agent)
+    - [Docker Environment](#docker-environment)
+    - [Running the Benchmark](#running-the-benchmark)
 
 ## Overview
 
-The `swe.py` script is part of the Composio software engineering (SWE) agent framework.
-It is designed to automate tasks related to software development, including issue resolution, code reviews, and patch submissions using AI-driven agents.
+`Composio SWE` is a framework for building SWE agents on top of the Composio tooling ecosystem. Composio-SWE allows you to:
+
+- Scaffold agents that work out of the box with the agentic framework of your choice (`crewai`, `llamaindex`, etc.)
+- Add tools that extend or optimise your agent's abilities
+- Benchmark your agents against `SWE-bench`
 
 ## Dependencies
 
-1. Docker Desktop should be installed.
-2. Get the Github Access Token.
-3. Install the dependencies using `pip install -r requirements.txt`.
-4. Add the LLM configuration via `OPENAI_API_KEY` or `ANTHROPIC_API_KEY` or (`AZURE_OPENAI_API_KEY` and `AZURE_OPENAI_ENDPOINT`) environment variables.
-5. Add the environment variable `export GITHUB_ACCESS_TOKEN = <git_access_token>`.
-6. If you want to use Helicone, add the environment variable `export HELICONE_API_KEY = <helicone_api_key>`.
+Before getting started, ensure you have the following set up:
+
+1. **Installation**:
+
+   ```
+   pip install composio-swe composio-core
+   ```
+
+2. **Install the agentic framework of your choice and its Composio plugin**:
+   This example uses `crewai`:
+
+   ```
+   pip install crewai composio-crewai
+   ```
+
+3. **GitHub Access Token**:
+
+   The agent requires a GitHub access token to work with your repositories. You can create one at https://github.com/settings/tokens with the necessary permissions and export it as an environment variable using `export GITHUB_ACCESS_TOKEN=<your_token>`.
+
+4. **LLM Configuration**:
+   You also need to set up an API key for the LLM provider you plan to use. By default, agents scaffolded by `composio-swe` use the `openai` client, so export `OPENAI_API_KEY` before running your agent.
+
+## Getting Started
+
+### Creating a new agent
+
+1. Scaffold your agent using:
+
+   ```
+   composio-swe scaffold crewai -o <path>
+   ```
+
+   This creates a new agent in `<path>/agent` with four key files:
+
+   - `main.py`: Entry point to run the agent on your issue
+   - `agent.py`: Agent definition (edit this to customise behaviour)
+   - `prompts.py`: Agent prompts
+   - `benchmark.py`: SWE-Bench benchmark runner
 
-## Usage
+2. Run the agent:
+   ```
+   cd agent
+   python main.py
+   ```
+   You'll be prompted for the repository name and issue.
 
-To change the script quickly:
+### Docker Environment
 
-1. Change the issue_config in swe_run.py
-2. Run the script with `python swe_run.py`
+The SWE-agent runs in Docker by default for security and isolation. This sandboxes the agent's operations, protecting against unintended consequences of arbitrary code execution.
 
-To modify the agent and improve the agent's performance:
+To run locally instead, modify `workspace_env` in `agent/agent.py`. Use caution, as this bypasses Docker's protective layer.
 
-1. Modify the agent's code in swe.py
-2. Run the script with `python swe_run.py`
+### Running the Benchmark
 
-## Implementing your own SWE-Agent
+[SWE-Bench](https://www.swebench.com/) is a comprehensive benchmark designed to evaluate the performance of software engineering agents. It comprises a diverse collection of real-world issues from popular Python open-source projects, providing a robust testing environment.
 
-1. Create a new class that inherits from `BaseSWEAgent`.
-2. Implement the `__init__` method to initialize any dependencies that your agent requires and set the tools that your agent requires.
-3. Implement the `solve_issue` method to define the logic for solving the issue. This involves the agentic logic to solve the issue.
-4. For example, refer `crewai_agent.py` and `llama_agent.py` for implementing the agents.
-5. For implementing the tools, refer `composio/local_tools/local_workspace/workspace/tool.py` for implementing the tools.
+To run the benchmark:
 
-## Running the benchmark
+1. Ensure Docker is installed and running on your system.
+2. Execute the following command:
+   ```
+   cd agent
+   python benchmark.py --test-split=<test_split>
+   ```
+   - By default, `python benchmark.py` runs only 1 test instance.
+   - Specify a test split range to run more tests, e.g., `--test-split=1:300` runs 300 tests.
 
-1. Find the benchmark at `python/swe/benchmark`.
-2. To run the benchmark, run `python run_evaluation.py`.
-3. This will run the SWE-Bench (https://www.swebench.com/) benchmark for the agent. You need to init your agent inside the run_evaluation.py file.
-4. Flags:
-   1. `--test_split`: The test split range (e.g., 1:10).
-   2. `--print_only`: Print the issues only.
-   3. `--include_hints`: Include hints in the issue description.
+**Note**: We utilize [SWE-Bench-Docker](https://github.com/aorwall/SWE-bench-docker) to ensure each test instance runs in an isolated container with its specific environment and Python version.
 
-### Run Evaluation for the benchmark changes
-1. cd ~/composio/python/swe/benchmark
-2. ./complete_eval_workflow.sh <logs-path> princeton-nlp/SWE-bench_Lite
-"logs-path" = ~/.composio_coder/logs
+To extend the functionality of the SWE agent by adding new tools or extending existing ones, refer to the [Development Guide](DEVELOPMENT.md).
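
The Dependencies section of the new README asks for `GITHUB_ACCESS_TOKEN` and, with the default `openai` client, `OPENAI_API_KEY`. A small preflight check run before `python main.py` can catch a missing variable early. The sketch below is not part of `composio-swe`; `missing_env_vars` and `REQUIRED_VARS` are hypothetical names, and the required list assumes the default OpenAI setup.

```python
import os
from typing import List, Mapping, Optional, Sequence

# Assumption: the default scaffold (OpenAI client) is in use.
REQUIRED_VARS = ("GITHUB_ACCESS_TOKEN", "OPENAI_API_KEY")


def missing_env_vars(
    required: Sequence[str] = REQUIRED_VARS,
    environ: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names of required variables that are unset or blank."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("Environment looks ready.")
```

If a different LLM provider is configured in `agent.py`, the list passed as `required` would need to name that provider's key instead.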
diff --git a/python/swe/composio_swe/scaffold/templates/crewai/benchmark.template b/python/swe/composio_swe/scaffold/templates/crewai/benchmark.template
@@ -1,6 +1,7 @@
 from agent import composio_toolset, crew
 from composio_swe.benchmark.run_evaluation import run_and_get_scores
 from composio_swe.config.store import IssueConfig
+import argparse
 
 
 def bench(workspace_id: str, issue_config: IssueConfig) -> str:
@@ -19,4 +20,13 @@ def bench(workspace_id: str, issue_config: IssueConfig) -> str:
 
 
 if __name__ == "__main__":
-    run_and_get_scores(bench, test_split="21:22")
+    parser = argparse.ArgumentParser(description="Run benchmark on the agent.")
+    parser.add_argument(
+        "--test-split",
+        type=str,
+        default="1:2",
+        help="Test split range (e.g. 1:2, 1:300). Maximum 300 tests per project.",
+    )
+    args = parser.parse_args()
+
+    run_and_get_scores(bench, test_split=args.test_split)
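
The template above passes `--test-split` through as a raw string, so a malformed value only fails once `run_and_get_scores` tries to use it. One option is to validate the value at parse time with an argparse `type` callable. The sketch below assumes the value has the form `start:end` with integer bounds and `start < end`; that shape is inferred from the examples `1:2` and `1:300`, and how `run_and_get_scores` actually interprets the bounds is not shown in this diff. `validate_test_split` and `build_parser` are hypothetical helpers.

```python
import argparse


def validate_test_split(value: str) -> str:
    """Check that value looks like 'start:end' and return it unchanged."""
    start_text, sep, end_text = value.partition(":")
    if not sep:
        raise argparse.ArgumentTypeError(f"expected 'start:end', got {value!r}")
    try:
        start, end = int(start_text), int(end_text)
    except ValueError:
        raise argparse.ArgumentTypeError(
            f"bounds must be integers, got {value!r}"
        ) from None
    # Assumption: bounds are non-negative and strictly increasing.
    if start < 0 or end <= start:
        raise argparse.ArgumentTypeError(f"need 0 <= start < end, got {value!r}")
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Run benchmark on the agent.")
    parser.add_argument(
        "--test-split",
        type=validate_test_split,  # argparse reports a usage error on bad input
        default="1:2",
        help="Test split range (e.g. 1:2, 1:300).",
    )
    return parser
```

Because the validator returns the original string, `args.test_split` can still be handed to `run_and_get_scores(bench, test_split=args.test_split)` exactly as in the template.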