Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AFLOW: Version 1.0 #1510

Open
wants to merge 96 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 95 commits
Commits
Show all changes
96 commits
Select commit Hold shift + click to select a range
8f1cf58
Update AGS
didiforgithub Jun 29, 2024
fd432fa
Update
didiforgithub Jun 29, 2024
f1ce133
Update he
didiforgithub Jul 1, 2024
4d37664
update ensemble example
didiforgithub Jul 1, 2024
aeac3fe
Update AGS
didiforgithub Jul 4, 2024
86033a1
Update
didiforgithub Jul 9, 2024
4af2315
Update humaneval
didiforgithub Jul 10, 2024
eb97b54
Update
didiforgithub Jul 11, 2024
7fa68d5
Update operator.py
didiforgithub Jul 11, 2024
8a24105
Update
didiforgithub Jul 14, 2024
e0955c5
Update Sota baseline
didiforgithub Jul 16, 2024
89b0c4c
Update
didiforgithub Jul 17, 2024
ca1c8f8
Update
didiforgithub Jul 22, 2024
772d2ae
Update
didiforgithub Jul 25, 2024
eac4b6c
Update GitNore
didiforgithub Jul 26, 2024
3fc3d21
Update
didiforgithub Jul 29, 2024
bdfa6eb
Update
didiforgithub Aug 1, 2024
47470fb
Update
didiforgithub Aug 1, 2024
008c5f0
Update
didiforgithub Aug 6, 2024
c9989c0
Update llm.py
MoshiQAQ Aug 15, 2024
2937d9a
Update 粗糙版本 优化
didiforgithub Aug 23, 2024
02c7c4e
Update 混乱版本
didiforgithub Aug 23, 2024
a3ff254
Update ConText Fill
didiforgithub Aug 23, 2024
1593e98
Update Multi LLM Config & Basic Evaluator
didiforgithub Aug 25, 2024
6a01a67
test for multi llm
didiforgithub Aug 25, 2024
7c2501e
Update
didiforgithub Aug 26, 2024
d97f90f
Update
didiforgithub Aug 26, 2024
c390341
Update Operator Optimize Method.
didiforgithub Sep 2, 2024
ca560a8
合入Eval与Optimize
didiforgithub Sep 8, 2024
4e0a896
提交baseline例子;修改context-fill 格式识别方式
didiforgithub Sep 9, 2024
7ffe68b
重构了Evaluator
didiforgithub Sep 9, 2024
62ffa73
update humaneval baseline & hotpotqa baseline
didiforgithub Sep 10, 2024
c7c34cd
Update human Eval
didiforgithub Sep 10, 2024
0b0a49d
更新 hotpotqa Baseline
didiforgithub Sep 10, 2024
4ce18d7
Update baselines
MoshiQAQ Sep 10, 2024
68e87da
Update Hotpotqa
MoshiQAQ Sep 10, 2024
257b994
Update drop.py
MoshiQAQ Sep 10, 2024
ab11246
添加了cost计算
didiforgithub Sep 10, 2024
445a2e6
Update QA
MoshiQAQ Sep 10, 2024
7f45ef6
Merge branch 'main' of https://github.com/didiforgithub/MetaGPT-MathAI
didiforgithub Sep 10, 2024
bdf865e
Merge branch 'main' of https://github.com/didiforgithub/MetaGPT-MathAI
didiforgithub Sep 10, 2024
1de0653
Update token_counter.py
didiforgithub Sep 10, 2024
f691c5f
Update QA
MoshiQAQ Sep 10, 2024
b9a2d94
更新了xml-compile方法,更新了剩余Baseline
didiforgithub Sep 11, 2024
b805da0
更新了eval BUG,同时更新了新的baseline
didiforgithub Sep 11, 2024
0704f34
更新了eval索引的入口
didiforgithub Sep 11, 2024
53890a5
更新了HotpotQA BenchMark 代码与对应的Self Consistency 实现
didiforgithub Sep 13, 2024
63f3f88
Update for fengwei
didiforgithub Sep 16, 2024
22e8f9d
Update baseline and benchmark; update evaluator
didiforgithub Sep 22, 2024
e3bcedc
Update Benchmark's data
didiforgithub Sep 22, 2024
99a9f7b
update humaneval data path and add baseline data
didiforgithub Sep 22, 2024
c7f44e9
Update
didiforgithub Sep 22, 2024
6a84a9d
Update HumanEval Eval
didiforgithub Sep 24, 2024
8dfe2de
Update
didiforgithub Sep 25, 2024
f14830b
Update optimizer.py
didiforgithub Sep 25, 2024
e8f6186
update
didiforgithub Sep 26, 2024
040a732
Update test_curve.py
didiforgithub Sep 26, 2024
eae3514
Update AFlow
didiforgithub Oct 16, 2024
bb229f2
Update AFlolw
didiforgithub Oct 16, 2024
eea9486
Update Eval
didiforgithub Oct 16, 2024
eab9b84
Change HotpotQA
MoshiQAQ Oct 16, 2024
390b65f
Update HotpotQA
MoshiQAQ Oct 16, 2024
859ee3d
fix test()
MoshiQAQ Oct 16, 2024
cea3473
Update evaluator.py
MoshiQAQ Oct 16, 2024
6aedc4a
Update AFlow
didiforgithub Oct 17, 2024
7c6edce
Update config2.yaml
didiforgithub Oct 17, 2024
d99054a
Merge branch 'main' into main
better629 Oct 17, 2024
2b788b2
Update Annotation to English, And Update Operator.json
didiforgithub Oct 18, 2024
ebcacdd
Update print error
didiforgithub Oct 18, 2024
5d6fa7a
Update readme.md
didiforgithub Oct 18, 2024
17f3cd4
Refactor Evaluator
didiforgithub Oct 18, 2024
6ebf3c4
Update drop.py
MoshiQAQ Oct 19, 2024
478589e
Update action_node.py
MoshiQAQ Oct 19, 2024
ade1068
Update Operator's code
didiforgithub Oct 21, 2024
efa00f8
Update
didiforgithub Oct 21, 2024
c194415
Update mbpp & math's eval
didiforgithub Oct 21, 2024
fe3fca5
Create download_data.py
MoshiQAQ Oct 21, 2024
2d1d7ca
Update Operator & Benchmark
didiforgithub Oct 21, 2024
23eec00
Update operator.py
didiforgithub Oct 21, 2024
d8c7174
Update HotpotQA's init round
didiforgithub Oct 21, 2024
35acb98
Update optimizer.py
didiforgithub Oct 21, 2024
828d187
change context_fill to xml_fill
didiforgithub Oct 22, 2024
66b5239
Update llm.py & handle exception
didiforgithub Oct 22, 2024
0b69ffe
Update download data.py and rm json files
didiforgithub Oct 22, 2024
fcc5e19
mv aflow from example to ext
didiforgithub Oct 22, 2024
5aa62b7
Update
didiforgithub Oct 22, 2024
e575b62
Resolve comment and modify readme
didiforgithub Oct 22, 2024
5775b20
Update Readme
didiforgithub Oct 22, 2024
2ccee33
Update Readme
didiforgithub Oct 22, 2024
462b7d9
Update README.md
didiforgithub Oct 22, 2024
56d0af1
pre-commit modify
didiforgithub Oct 22, 2024
27e942c
update
didiforgithub Oct 22, 2024
8c7cde5
Transform print into logger.info & mv code sanitize to utils.py
didiforgithub Oct 22, 2024
344d87d
Update
didiforgithub Oct 22, 2024
4564b70
Update mbpp.py
didiforgithub Oct 22, 2024
d2f90db
Update readme and better optimizer
didiforgithub Oct 23, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -188,3 +188,4 @@ cov.xml
*-structure.json
*.dot
.python-version
*.csv
55 changes: 55 additions & 0 deletions examples/aflow/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
# AFlow: Automating Agentic Workflow Generation

AFlow is a framework for automatically generating and optimizing Agentic Workflows. It uses Monte Carlo tree search in a code-represented workflow space to find effective workflows, replacing manual development with machine effort. Our approach shows potential to outperform handcrafted workflows on various tasks.

[Read our paper on arXiv](https://arxiv.org/abs/2410.10762)

## Framework Components

- **Node**: Basic unit of LLM invocation. See `metagpt/actions/action_node.py` for a flexible interface to control LLM, temperature, format, and prompt.
- **Operator**: Predefined combinations of Nodes to enhance search efficiency. Encapsulates common operations like Generate, Format, Review, Revise, Ensemble, Test, and Programmer. See `metagpt/ext/aflow/operator.py` for details. You can customize your own Operator by referencing the implementations in this code.
- **Workflow**: A sequence of LLM-invoking nodes connected by edges. Can be represented as graphs, neural networks, or code to express various execution structures. See `metagpt/ext/aflow/workflow.py` for our implementation.
- **Optimizer**: Uses LLMs within a Monte Carlo Tree Search variant to explore and refine workflows. Iteratively selects, expands, evaluates, and updates workflows based on performance. See `metagpt/ext/aflow/scripts/optimizer.py` for details.
- **Evaluator**: Assesses workflow performance on given tasks. Provides feedback to guide the optimization process towards more effective workflows. See `metagpt/ext/aflow/scripts/evaluator.py` for details.

## Datasets

### Experimental Datasets
We conducted experiments on six datasets (HumanEval, MBPP, GSM8K, MATH, HotpotQA, DROP) and provide their evaluation code. The data can be found in this [datasets](https://drive.google.com/uc?export=download&id=1DNoegtZiUhWtvkd2xoIuElmIi4ah7k8e) link, or you can download them using `metagpt/ext/aflow/data/download_data.py`

### Custom Datasets
For custom tasks, you can reference the code in the metagpt/ext/aflow/benchmark folder. Inherit the `BaseBenchmark` class and implement `evaluate_problem`, `calculate_score`, and `get_result_columns` to add your custom dataset benchmark. Then, add your benchmark name in `metagpt/ext/aflow/scripts/evaluator.py` and `metagpt/ext/aflow/scripts/optimizer.py` to find effective workflows for your custom dataset.

## Quick Start

1. Configure your search in `optimize.py`:
- Open `examples/aflow/optimize.py`
- Set the following parameters:
```python
dataset = "HumanEval" # Choose from: "HumanEval", "MBPP", "GSM8K", "MATH", "HotpotQA", "DROP" or your custom dataset name
question_type = "code" # Choose from: "math", "code", "qa"
sample = 4 # Number of samples to use for optimization
check_convergence = True # Whether to check for convergence
optimized_path = "path/to/optimized/workflows" # Path to save optimized workflows, defaults to metagpt/ext/aflow/scripts/optimized
initial_round = 1 # Starting round number
max_rounds = 20 # Maximum number of optimization rounds
```
- Adjust these parameters according to your specific requirements and dataset
2. Set up parameters in `config/config2.yaml` (see `examples/aflow/config2.example.yaml` for reference)
3. Set the operator you want to use in `optimize.py` and in `optimized_path/template/operator.py`, `optimized_path/template/operator.json`. You can reference our implementation to add operators for specific datasets
4. When you first run, you can download the datasets and initial rounds by setting `download(["datasets", "initial_rounds"])` in `examples/aflow/optimize.py`
5. (Optional) Add your custom dataset and corresponding evaluation function following the [Custom Datasets](#custom-datasets) section
6. Run `python examples/aflow/optimize.py` to start the optimization process!

## Citation

If you use AFlow in your research, please cite our paper:

```
@article{zhang2024aflow,
title={AFlow: Automating Agentic Workflow Generation},
author={Zhang, Jiayi and Xiang, Jinyu and Yu, Zhaoyang and Teng, Fengwei and Chen, Xionghui and Chen, Jiaqi and Zhuge, Mingchen and Cheng, Xin and Hong, Sirui and Wang, Jinlin and others},
journal={arXiv preprint arXiv:2410.10762},
year={2024}
}
```
12 changes: 12 additions & 0 deletions examples/aflow/config2.example.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
models:
  "<model_name>": # model: "gpt-4-turbo" # or gpt-3.5-turbo
api_type: "openai" # or azure / ollama / groq etc.
base_url: "<your base url>"
api_key: "<your api key>"
temperature: 0
"<model_name>":
api_type: "openai"
base_url: "<your base url>"
api_key: "<your api key>"
temperature: 0
CALC_USAGE: True
60 changes: 60 additions & 0 deletions examples/aflow/optimize.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
# -*- coding: utf-8 -*-
# @Date : 8/23/2024 20:00 PM
# @Author : didi
# @Desc : Entrance of AFlow.


from metagpt.configs.models_config import ModelsConfig
from metagpt.ext.aflow.data.download_data import download
from metagpt.ext.aflow.scripts.optimizer import DatasetType, Optimizer, QuestionType

# DatasetType, QuestionType, and OptimizerType definitions
# DatasetType = Literal["HumanEval", "MBPP", "GSM8K", "MATH", "HotpotQA", "DROP"]
# QuestionType = Literal["math", "code", "qa"]
# OptimizerType = Literal["Graph", "Test"]

# When you fisrt use, please download the datasets and initial rounds; If you want to get a look of the results, please download the results.
download(["datasets", "initial_rounds"])

# Crucial Parameters
dataset: DatasetType = "GSM8K" # Ensure the type is consistent with DatasetType
sample: int = 4 # Sample Count, which means how many workflows will be resampled from generated workflows
question_type: QuestionType = "code" # Ensure the type is consistent with QuestionType
optimized_path: str = "metagpt/ext/aflow/scripts/optimized" # Optimized Result Save Path
initial_round: int = 1 # Corrected the case from Initial_round to initial_round
max_rounds: int = 20
check_convergence: bool = True

# Config llm model, you can modify `config/config2.yaml` to use more llms.
mini_llm_config = ModelsConfig.default().get("gpt-4o-mini")
claude_llm_config = ModelsConfig.default().get("claude-3-5-sonnet-20240620")

# Config operators.
operators = [
"Custom", # It's basic unit of a fixed node. optimizer can modify its prompt to get vairous nodes.
# "AnswerGenerate" # It's for qa
# "CustomCodeGenerate", # It's for code
"ScEnsemble", # It's for code, math and qa
# "Test", # It's for code
"Programmer", # It's for math
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

添加注释,详细说明如何复现实验结果

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

为每一个数据添加一个直接可以跑的入口

]

# Create an optimizer instance
optimizer = Optimizer(
dataset=dataset, # Config dataset
question_type=question_type, # Config Question Type
opt_llm_config=claude_llm_config, # Config Optimizer LLM
exec_llm_config=mini_llm_config, # Config Execution LLM
check_convergence=check_convergence, # Whether Early Stop
operators=operators, # Config Operators you want to use
optimized_path=optimized_path, # Config Optimized workflow's file path
sample=sample, # Only Top(sample) rounds will be selected.
initial_round=initial_round, # Optimize from initial round
max_rounds=max_rounds, # The max iteration of AFLOW.
)

if __name__ == "__main__":
# Optimize workflow via setting the optimizer's mode to 'Graph'
optimizer.optimize("Graph")
# Test workflow via setting the optimizer's mode to 'Test'
# optimizer.optimize("Test")
134 changes: 134 additions & 0 deletions metagpt/actions/action_node.py
Original file line number Diff line number Diff line change
Expand Up @@ -9,6 +9,7 @@
we can use typing to extract the type of the node, but we cannot use built-in list to extract.
"""
import ast
import json
import re
import typing
from enum import Enum
from typing import Any, Dict, List, Optional, Tuple, Type, Union
Expand All @@ -18,6 +19,7 @@

from metagpt.actions.action_outcls_registry import register_action_outcls
from metagpt.const import USE_CONFIG_TIMEOUT
from metagpt.ext.aflow.scripts.utils import sanitize
from metagpt.llm import BaseLLM
from metagpt.logs import logger
from metagpt.provider.postprocess.llm_output_postprocess import llm_output_postprocess
Expand All @@ -38,9 +40,17 @@ class ReviseMode(Enum):

TAG = "CONTENT"


class FillMode(Enum):
CODE_FILL = "code_fill"
XML_FILL = "xml_fill"
SINGLE_FILL = "single_fill"


LANGUAGE_CONSTRAINT = "Language: Please use the same language as Human INPUT."
FORMAT_CONSTRAINT = f"Format: output wrapped inside [{TAG}][/{TAG}] like format example, nothing else."


SIMPLE_TEMPLATE = """
## context
{context}
Expand Down Expand Up @@ -471,6 +481,113 @@ async def simple_fill(

return self

def get_field_name(self):
"""
Get the field name from the Pydantic model associated with this ActionNode.
"""
model_class = self.create_class()
fields = model_class.model_fields

# Assuming there's only one field in the model
if len(fields) == 1:
return next(iter(fields))

# If there are multiple fields, we might want to use self.key to find the right one
return self.key

def get_field_names(self):
"""
获取与此ActionNode关联的Pydantic模型的字段名称。
"""
model_class = self.create_class()
return model_class.model_fields.keys()

def get_field_types(self):
"""
获取与此ActionNode关联的Pydantic模型的字段类型。
"""
model_class = self.create_class()
return {field_name: field.annotation for field_name, field in model_class.model_fields.items()}

def xml_compile(self, context):
# TODO 再来一版

field_names = self.get_field_names()
# Construct the example using the field names
examples = []
for field_name in field_names:
examples.append(f"<{field_name}>content</{field_name}>")

# Join all examples into a single string
example_str = "\n".join(examples)
# Add the example to the context
context += f"""
### Response format (must be strictly followed): All content must be enclosed in the given XML tags, ensuring each opening <tag> has a corresponding closing </tag>, with no incomplete or self-closing tags allowed.\n
{example_str}
"""
return context

async def code_fill(self, context, function_name=None, timeout=USE_CONFIG_TIMEOUT):
"""
Fill CodeBlock Using ``` ```
"""
field_name = self.get_field_name()
prompt = context
content = await self.llm.aask(prompt, timeout=timeout)
extracted_code = sanitize(code=content, entrypoint=function_name)
result = {field_name: extracted_code}
return result

async def single_fill(self, context):
field_name = self.get_field_name()
prompt = context
content = await self.llm.aask(prompt)
result = {field_name: content}
return result

async def xml_fill(self, context):
"""
使用XML标签填充上下文并根据字段类型进行转换,包括字符串、整数、布尔值、列表和字典类型
"""
field_names = self.get_field_names()
field_types = self.get_field_types()

extracted_data = {}
content = await self.llm.aask(context)

for field_name in field_names:
pattern = rf"<{field_name}>(.*?)</{field_name}>"
match = re.search(pattern, content, re.DOTALL)
if match:
raw_value = match.group(1).strip()
field_type = field_types.get(field_name)

if field_type == str:
extracted_data[field_name] = raw_value
elif field_type == int:
try:
extracted_data[field_name] = int(raw_value)
except ValueError:
extracted_data[field_name] = 0 # 或者其他默认值
elif field_type == bool:
extracted_data[field_name] = raw_value.lower() in ("true", "yes", "1", "on", "True")
elif field_type == list:
try:
extracted_data[field_name] = eval(raw_value)
if not isinstance(extracted_data[field_name], list):
raise ValueError
except:
extracted_data[field_name] = [] # 默认空列表
elif field_type == dict:
try:
extracted_data[field_name] = eval(raw_value)
if not isinstance(extracted_data[field_name], dict):
raise ValueError
except:
extracted_data[field_name] = {} # 默认空字典

return extracted_data

async def fill(
self,
context,
Expand All @@ -481,6 +598,7 @@ async def fill(
images: Optional[Union[str, list[str]]] = None,
timeout=USE_CONFIG_TIMEOUT,
exclude=[],
function_name: str = None,
):
"""Fill the node(s) with mode.

Expand All @@ -507,6 +625,22 @@ async def fill(
if self.schema:
schema = self.schema

if mode == FillMode.CODE_FILL.value:
result = await self.code_fill(context, function_name, timeout)
self.instruct_content = self.create_class()(**result)
return self

elif mode == FillMode.XML_FILL.value:
context = self.xml_compile(context=self.context)
result = await self.xml_fill(context)
self.instruct_content = self.create_class()(**result)
return self

elif mode == FillMode.SINGLE_FILL.value:
result = await self.single_fill(context)
self.instruct_content = self.create_class()(**result)
return self

if strgy == "simple":
return await self.simple_fill(schema=schema, mode=mode, images=images, timeout=timeout, exclude=exclude)
elif strgy == "complex":
Expand Down
Loading
Loading