Official Repository of "GraphTeam: Facilitating Large Language Model-based Graph Analysis via Multi-Agent Collaboration".
- Paper Link: [GraphTeam: Facilitating Large Language Model-based Graph Analysis via Multi-Agent Collaboration (arXiv:2410.18032)](https://arxiv.org/abs/2410.18032)
Graphs are widely used for modeling relational data in real-world scenarios, such as social networks and urban computing. While large language models (LLMs) have achieved strong performance in many areas, existing LLM-based graph analysis approaches either integrate graph neural networks (GNNs) for specific machine learning tasks (e.g., node classification), limiting their transferability, or rely solely on LLMs’ internal reasoning ability, resulting in suboptimal performance.
To address these limitations, we leverage recent advances in LLM-based agents, which have demonstrated the capability to utilize external knowledge or tools for problem-solving. By simulating human problem-solving strategies such as analogy and collaboration, we propose a multi-agent system based on LLMs named GraphTeam for graph analysis.
GraphTeam consists of five LLM-based agents organized into three modules, where agents with different specialties collaborate to address complex problems. Specifically:
- Input-Output Normalization Module:
  - The Question Agent extracts and refines four key arguments (e.g., graph type and output format) from the original question to facilitate problem understanding.
  - The Answer Agent organizes the results to meet the output requirements.
- External Knowledge Retrieval Module:
  - We build a knowledge base consisting of relevant documentation and experience information.
  - The Search Agent retrieves the most relevant entries from the knowledge base for each question.
- Problem-Solving Module:
  - Given the retrieved information from the Search Agent, the Coding Agent uses established algorithms via programming to generate solutions.
  - If the Coding Agent fails, the Reasoning Agent directly computes the results without programming.
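The Problem-Solving module's two-stage strategy can be sketched as follows. This is a minimal illustration, not the actual GraphTeam implementation: the agent functions and the BFS shortest-path task are stand-ins showing how a programmatic attempt runs first, with direct reasoning as a fallback when code execution fails.

```python
from collections import deque

def coding_agent(edges, source, target):
    """Programmatic solution: BFS shortest-path length on an undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    queue, seen = deque([(source, 0)]), {source}
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in adj.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError("no path found")

def reasoning_agent(edges, source, target):
    """Placeholder for an LLM answering directly, without code execution."""
    return "answer produced by direct LLM reasoning"

def solve(edges, source, target):
    # Try the Coding Agent first; fall back to the Reasoning Agent on failure.
    try:
        return coding_agent(edges, source, target)
    except Exception:
        return reasoning_agent(edges, source, target)

edges = [(0, 1), (1, 2), (2, 3)]
print(solve(edges, 0, 3))  # BFS succeeds here and returns 3
```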
Extensive experiments on six graph analysis benchmarks demonstrate that GraphTeam achieves state-of-the-art performance with an average 25.85% improvement over the best baseline in terms of accuracy.
The overall pipeline of our multi-agent system GraphTeam (left), and the comparison between GraphTeam and state-of-the-art baseline on six benchmarks (right).
The overall framework of GraphTeam, which includes five agents from three functional groups.
Performance with respect to different task categories.
Performance with respect to different output formats.
Hyper-parameter analysis of four hyper-parameters in the proposed GraphTeam.
Performance comparison on six graph analysis benchmarks in terms of accuracy (%).
- Operating System: Compatible with Windows, Linux, and macOS. Note: AutoGL only supports x86 platforms, so Macs with M-series chips cannot run GNN_benchmark.
- Conda: Installed
- Docker: Installed and running
First, create a Conda virtual environment with a specified Python version.
conda create -n myenv python=3.10.14
Activate the virtual environment:
conda activate myenv
With the virtual environment activated, run the following command to install the project dependencies:
pip install -r requirements.txt
Docker is used to execute code after it is generated. Follow these steps:
docker pull chuqizhi72/execute_agent_environment:latest
docker create --name test chuqizhi72/execute_agent_environment:latest
Ensure that the Conda virtual environment is activated. If not, run:
conda activate myenv
Ensure the Docker container is started. If not, run:
docker start test
docker exec -it test /bin/bash
Within the activated virtual environment, navigate to the project directory, ensure your current working directory is set to `multi-agents-4-graph-analysis`, and set your OpenAI API key in `run.py`. Then, run `run.py`:
cd multi-agents-4-graph-analysis
Setting the OpenAI API Key:
- Open `run.py` located at `multi-agents-4-graph-analysis/GraphTeam/run.py` in your preferred text editor.
- Locate the line where the OpenAI API key is set. It should look like this:
  `os.environ['OPENAI_API_KEY'] = 'your-api-key-here'`
- Replace `'your-api-key-here'` with your actual OpenAI API key:
  `os.environ['OPENAI_API_KEY'] = 'sk-your-openai-api-key'`
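As an alternative to hard-coding the key, you can export `OPENAI_API_KEY` in your shell and have the script read it from the environment. This is a common pattern, not part of the shipped `run.py`; the helper below is an illustrative sketch you would adapt yourself.

```python
import os

def load_api_key() -> str:
    """Read the OpenAI API key from the environment instead of hard-coding it.

    Illustrative helper; the shipped run.py sets the key inline instead.
    """
    key = os.environ.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError("Set OPENAI_API_KEY before running GraphTeam")
    return key
```

This keeps the key out of version control and lets each user supply their own credentials.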
Running the Script:
After setting the API key, execute the script from the `multi-agents-4-graph-analysis` directory:
python GraphTeam/run.py
Note for Running the NLGraph Benchmark:
The project includes an `answer_format_dict` that specifies the required output format for different problem types. To ensure consistency and accuracy in the results when running the NLGraph benchmark, you need to modify the `run_threaded` function in the `run.py` file.
- Open `run.py` located at `multi-agents-4-graph-analysis/GraphTeam/run.py` in your preferred text editor.
- Locate the `run_threaded` function and `answer_format_dict`.
- Find the following commented lines within the function:
  `# if is NLGraph, the question should add output format`
  `# question = question + answer_format_dict[category_data['type'][i]]`
- Uncomment these lines by removing the `#` symbols:
  `# if is NLGraph, the question should add output format`
  `question = question + answer_format_dict[category_data['type'][i]]`
This modification ensures that each question includes the appropriate output format directive, guiding the system to format the output correctly and enhancing the reliability of the results.
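The effect of this change can be sketched as follows. The dictionary keys and format strings below are hypothetical placeholders; the real ones are defined in `run.py`.

```python
# Hypothetical illustration of how answer_format_dict augments each question;
# the actual keys and directives are defined in run.py.
answer_format_dict = {
    "shortest_path": " Give the answer as a single integer.",
    "connectivity": " Answer with 'yes' or 'no'.",
}

def add_output_format(question: str, problem_type: str) -> str:
    """Append the required output-format directive for an NLGraph question."""
    return question + answer_format_dict[problem_type]

print(add_output_format("Is node 0 connected to node 3?", "connectivity"))
# → Is node 0 connected to node 3? Answer with 'yes' or 'no'.
```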
Solution: Ensure all dependencies are correctly installed and that both the Conda environment and Docker are activated. Check the paths and configurations in `run.py` to ensure they are correct. Additionally, verify that you have set your OpenAI API key correctly in `run.py`.
Solution: When running the project, ensure that your current working directory is set to `multi-agents-4-graph-analysis`. This ensures that all relative paths and configurations function correctly.
Solution: The relevant documentation is located in the `data` directory of the project, and the experience information is located in the `memory` directory. Ensure that all relevant files are present and properly formatted.
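A quick way to verify those directories exist before a run is a small helper like the one below. The directory names come from the text above; the helper itself is not part of the project.

```python
from pathlib import Path

def check_knowledge_dirs(root: str = ".") -> dict[str, bool]:
    """Report whether the knowledge-base directories exist under the repo root."""
    return {name: Path(root, name).is_dir() for name in ("data", "memory")}

# Example: run from the repository root and inspect the result
print(check_knowledge_dirs())
```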
Solution: Follow the steps in "Note for Running the NLGraph Benchmark" above: open `multi-agents-4-graph-analysis/GraphTeam/run.py`, locate the `run_threaded` function, and uncomment the `answer_format_dict` lines so that each question includes the appropriate output format directive.
We would like to acknowledge the following contributors for their valuable support and contributions to the GraphTeam project:
- Yubin Chen (1shuimo)
- Zekai Yu (yuzekai1234)
- Yang Liu (AckerlyLau)
- Yaoqi Liu (dddg617)
Their dedication and expertise have been instrumental in the development and success of this project.