The Auto-Code-Rover (ACR) project is structured around a core APR Orchestrator that drives the automated program repair process. This orchestrator, primarily managed by the app.main.main function, initiates and coordinates tasks across specialized Agent Layer components. The Agent Layer comprises various agents (e.g., agent_reproducer, agent_search, agent_write_patch, agent_reviewer, agent_select) that handle distinct phases of bug reproduction, code analysis, patch generation, and selection. These agents heavily rely on the LLM Integration Service (modules like app.model.common, app.model.gpt, app.inference) to interact with different Large Language Models for code generation and analysis tasks. For fault localization and patch validation, the Agent Layer interacts with the Code Analysis & Validation Module, which includes functionalities for SBFL (app.analysis.sbfl) and various validation mechanisms (app.api.validation, app.api.swe_bench_docker_validation). Throughout the entire workflow, a central Configuration & Data Models component (app.config, app.data_structures) provides essential settings, LLM configurations, and defines the data structures used for consistent information exchange across all modules. This modular design ensures clear separation of concerns, facilitating maintainability and extensibility, and enabling a streamlined data flow from task initiation to validated patch generation.
Components
APR Orchestrator
The central control unit managing the entire Automated Program Repair (APR) workflow. It orchestrates the sequence of operations, coordinates interactions between different agents, and drives the overall repair process from bug reproduction to patch selection.
Referenced Source Code
Agent Layer
A collection of specialized agents responsible for distinct phases of the bug repair process. This layer encapsulates the intelligence and operational logic for bug reproduction, code search, patch generation, and patch review/selection. It also includes common utilities shared across agents.
Referenced Source Code
LLM Integration Service
Provides a unified, standardized abstraction layer for interacting with various Large Language Models (LLMs) such as OpenAI, Anthropic, Groq, and Llama. It manages prompt formatting, sends requests to different LLM providers, and processes their generated responses.
Referenced Source Code
Code Analysis & Validation Module
Responsible for analyzing the codebase to identify potential locations where a bug might reside (fault localization) and for rigorously testing and validating generated patches. This includes running tests, evaluating their impact, and ensuring the correctness and effectiveness of the proposed fixes.
Referenced Source Code
Configuration & Data Models
A foundational component that centralizes the management of application settings, LLM configurations, environment variables, and defines the core data structures used throughout the system. This ensures consistency and easy management of system-wide parameters and information representation.