Update MetaGPT X Technical Report.md
stellaHSR authored Aug 21, 2024
1 parent 74ffe66 commit 7b4f466
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions src/blog/swebench/MetaGPT X Technical Report.md
@@ -34,15 +34,15 @@ A multi-agent system excels at decoupling roles and utilizing contextual informa

1. **Reproducer**: The first step in addressing an issue described in natural language or traceback information is to reproduce it by analyzing the issue and using the relevant code. This process is critical for pinpointing the bug, providing contextual information, and observing the runtime behavior to confirm the existence of the problem. During this phase, the reproducer uses search tools to locate relevant code and editing tools to generate a reproduction script, runs the script, and then analyzes the observation to confirm that the issue was reproduced.
2. **Coder**: Based on the upstream reproduction results, the coder performs more granular code searches, including symbol-level identification and navigation, while exploring definitions, signatures, docstrings, and call graphs. The coder can click on a symbol to quickly access relevant code blocks. At the file level, the coder can obtain the hierarchical structure of the opened script, retrieving complete symbol information within a limited observation window to improve navigation efficiency. During real-time editing, the coder utilizes diagnostics, auto-checks formatting, and includes full line numbers at the code block level to reduce indentation and formatting errors. Additionally, the coder reflects on multiple failed edits by revisiting the context of the error-prone sections or generating in-line comments to enhance the success rate of code modifications.
-3. **Verifier**: After the coder generates a patch, the verifier conducts integration tests on both the new patch and existing tests. Notably, the verifier did not have access to the test patch provided by the task, consistent with setups in other frameworks. The verifier performs searches based on patch analysis, extracting symbols from the patch, directly navigating to relevant code blocks, and locating associated test files. It then modifies the code as needed and reflects on runtime debugging. The verifier runs existing regression tests or the reproduction script, or generates new test code to further confirm whether the patch successfully resolves the current issue. If regression tests fail, the execution feedback is propagated back to the verifier to reflect on the feedback and patch.
+3. **Verifier**: After the coder generates a patch, the verifier conducts integration tests on both the new patch and existing tests. Notably, the verifier does not have access to the test patch provided by the task, consistent with setups in other frameworks. The verifier performs searches based on patch analysis, extracting symbols from the patch, directly navigating to relevant code blocks, and locating associated test files. It then modifies the code as needed and reflects on runtime debugging. The verifier runs existing regression tests or the reproduction script, or generates new test code to further confirm whether the patch successfully resolves the current issue. If regression tests fail, the execution feedback is propagated back to the verifier to reflect on the feedback and patch.
4. **Selector**: During the issue-solving process, the selector prompts the coder and verifier to generate multiple patches using different LLMs when the model patch is empty or there is no submission in the trajectory. The selector then applies a selection process based on the given patches and problem statement, shuffling the patches multiple times and using a majority voting mechanism to determine the best solution.
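
The selector's shuffle-and-vote step can be sketched as follows. This is a minimal illustration under stated assumptions, not the MetaGPT X implementation: `select_patch`, `rounds`, and `seed` are hypothetical names, and `judge` stands in for an LLM call that reads the problem statement and returns the position of its preferred patch. The toy `longest_patch_judge` exists only to make the sketch runnable.

```python
import random
from collections import Counter
from typing import Callable, Sequence


def select_patch(
    patches: Sequence[str],
    problem_statement: str,
    judge: Callable[[str, Sequence[str]], int],
    rounds: int = 5,
    seed: int = 0,
) -> str:
    """Pick one patch by majority vote over several shuffled presentations."""
    if not patches:
        raise ValueError("no candidate patches to select from")
    rng = random.Random(seed)
    votes: Counter = Counter()
    for _ in range(rounds):
        # Present the candidates in a fresh random order each round.
        order = list(range(len(patches)))
        rng.shuffle(order)
        shuffled = [patches[i] for i in order]
        # The judge returns a position in the shuffled list (assumed interface).
        pick = judge(problem_statement, shuffled)
        # Map the position back to the original patch index before counting.
        votes[order[pick]] += 1
    best_index, _ = votes.most_common(1)[0]
    return patches[best_index]


def longest_patch_judge(problem_statement: str, candidates: Sequence[str]) -> int:
    """Toy stand-in for an LLM judge: prefers the longest candidate."""
    return max(range(len(candidates)), key=lambda i: len(candidates[i]))


if __name__ == "__main__":
    chosen = select_patch(["a", "bbb", "cc"], "example issue", longest_patch_judge)
    print(chosen)
```

Votes are tallied against original patch indices rather than shuffled positions, so a judge that tends to favor a particular slot spreads its votes across different patches instead of consistently rewarding one.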

## Repo Understanding and Advanced Tools

1. Repo parser: We enhance repository understanding by constructing a repo-graph using tree-sitter to index codebases, capturing symbols, relationships, and structural metadata like symbol names, signatures, and call graphs. Our tool supports multi-granularity searches (e.g., variables, functions, classes) and returns detailed metadata, facilitating code structure-aware context retrieval and navigation based on issue descriptions as well as surrounding context.
2. Search tools: Unlike other file- or keyword-based search tools, we improve information density within limited windows by offering code block previews with line numbers upon match, allowing agents to gather extensive code context. We simulate an IDE-like environment for actively navigated files, providing hierarchical structure information (classes, functions, methods) with line numbers and indentation. Enhanced click-to-navigate functionality for symbols allows agents to directly expand and view details. This improves efficiency in symbol information collection and reduces redundant navigation, particularly in large files and complex dependency hierarchies.
3. Code semantic search: To enhance search quality in codebases rich with symbols, we iteratively chunk repositories at the method level and embed each code block using an embedding model. This enables our agents to locate buggy code by retrieving relevant code spans and associated file information from a code semantic perspective, compensating for the limitations of the previous two search methods.
-4. Enhanced editor: We've upgraded the file editor to include code block contexts with line numbers, significantly reducing syntax errors and indentation issues that frequently occur during code generation.
+4. Enhanced editor: We update the file editor to include code block contexts with line numbers, significantly reducing syntax errors and indentation issues that frequently occur during code generation.
5. Runtime debugging: Utilizing LLMs, we provide tools that give line-by-line code explanations based on runtime tracebacks (e.g., assertion errors) or regression test results. We enable runtime debugging by displaying variable values and generating in-line comments for executed code, significantly improving code generation quality through an iterative debugging process with LLMs.
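
The hierarchical file view described for the search tools (classes, functions, and methods with line numbers and indentation) can be approximated with a short sketch. It assumes Python sources only and uses the standard-library `ast` module as a stand-in for the tree-sitter index mentioned above; `outline` is a hypothetical helper name, not part of the tool.

```python
import ast
from typing import List


def outline(source: str) -> List[str]:
    """Return one line per class or function: line number, indentation, kind, name."""
    tree = ast.parse(source)
    lines: List[str] = []

    def visit(node: ast.AST, depth: int) -> None:
        # Only direct children are inspected, so definitions nested inside
        # if/try blocks are skipped in this simplified sketch.
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
                kind = "class" if isinstance(child, ast.ClassDef) else "def"
                indent = "    " * depth
                lines.append(f"{child.lineno:>4} {indent}{kind} {child.name}")
                # Recurse to pick up methods and nested functions.
                visit(child, depth + 1)

    visit(tree, 0)
    return lines


SAMPLE = "class A:\n    def f(self):\n        pass\n\ndef g():\n    pass\n"

if __name__ == "__main__":
    print("\n".join(outline(SAMPLE)))
```

A compact outline like this lets an agent see every symbol in a large file within a small observation window and then jump to a specific line number, rather than scrolling through the whole file.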

![img](./images/3.png)