The system turns code repositories into an intelligent, queryable wiki. The API Gateway / Wiki Manager is the primary interface: it handles external requests for wiki management, project metadata, and model configurations. It orchestrates the Data Ingestion & Processing component, which downloads, parses, and embeds repository content and stores the result in a Vector Database. The RAG Core retrieves relevant information from that stored data and uses a Language Model to generate answers to user queries, which are returned through the API Gateway / Wiki Manager. This design keeps concerns separate, with dedicated components for data handling, intelligent retrieval, and external interaction.
Components
Data Ingestion & Processing
Manages the entire lifecycle of data ingestion. This involves downloading repositories, reading and transforming source documents, persisting them into the vector database for efficient retrieval, and indexing generated wiki content and project metadata. It also handles file content retrieval from various source control systems. This component is responsible for the *persistence* and *initial indexing* of data relevant to the wiki.
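The ingestion steps described above (read, transform, embed, persist) can be sketched as follows. This is a minimal illustration, not the project's implementation: `Chunk`, `split_into_chunks`, `toy_embed`, and `ingest` are hypothetical names, the in-memory list stands in for the vector database, and the hashed bag-of-words vector is only a placeholder for a real embedding model.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Chunk:
    """One indexed piece of a source document (hypothetical record shape)."""
    path: str
    text: str
    embedding: list[float]


def split_into_chunks(text: str, max_words: int = 50) -> list[str]:
    """Split a document into consecutive pieces of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text: str, dim: int = 8) -> list[float]:
    """Placeholder embedding: hashed bag-of-words counts.

    A real system would call an embedding model here; crc32 is used only
    because it is deterministic across runs.
    """
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    return vec


def ingest(files: dict[str, str], index: list[Chunk], max_words: int = 50) -> int:
    """Chunk and embed each file, append the results to index, return the count."""
    added = 0
    for path, content in files.items():
        for piece in split_into_chunks(content, max_words):
            index.append(Chunk(path=path, text=piece, embedding=toy_embed(piece)))
            added += 1
    return added
```

Here `files` maps a repository path to its already-downloaded text, so the download step and the choice of source control system are outside the sketch.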
RAG Core
Orchestrates the core Retrieval-Augmented Generation (RAG) process: managing conversational context, initializing the database manager, retrieving relevant information from the vector database, and generating AI-driven answers by combining the retrieved information with LLM capabilities. With respect to storage, this component is primarily responsible for the *retrieval* of vector embeddings and relevant data.
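The retrieve-then-generate flow can be sketched as below. All names (`Conversation`, `build_prompt`, `answer_question`) are hypothetical, and the retriever and the language model are passed in as plain callables, since the source does not specify which vector store or model client is used.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Conversation:
    """Conversational context: the (question, answer) pairs seen so far."""
    turns: list[tuple[str, str]] = field(default_factory=list)

    def add_turn(self, question: str, answer: str) -> None:
        self.turns.append((question, answer))


def build_prompt(question: str, contexts: list[str], conversation: Conversation) -> str:
    """Combine prior turns, retrieved context, and the new question into one prompt."""
    history = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in conversation.turns)
    context_block = "\n---\n".join(contexts)
    return (
        f"Conversation so far:\n{history}\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}"
    )


def answer_question(
    question: str,
    conversation: Conversation,
    retrieve: Callable[[str, int], list[str]],
    generate: Callable[[str], str],
    top_k: int = 3,
) -> str:
    """Retrieve context, prompt the model, and record the turn."""
    contexts = retrieve(question, top_k)   # stands in for a vector database lookup
    prompt = build_prompt(question, contexts, conversation)
    answer = generate(prompt)              # stands in for an LLM call
    conversation.add_turn(question, answer)
    return answer
```

Passing `retrieve` and `generate` as callables keeps the orchestration logic testable without a running database or model.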
API Gateway / Wiki Manager
Serves as the primary interface for external clients to interact with the wiki system. It exposes RESTful endpoints for operations such as exporting wiki content, managing wiki caches (reading, saving, deleting), retrieving generated wiki content, and accessing project metadata. It also lists processed projects and retrieves model configurations. This component handles the *retrieval* of generated wiki content and project metadata, as well as *managing the persistence* of wiki caches.
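The cache operations named above (reading, saving, deleting) and the project listing can be sketched with a small file-backed store that REST handlers might wrap. This is an assumption-laden illustration: the `WikiCache` class, the `(owner, repo)` key, and the `owner_repo.json` file naming are hypothetical and not taken from the project.

```python
import json
from pathlib import Path
from typing import Optional


class WikiCache:
    """File-backed wiki cache keyed by (owner, repo); layout is hypothetical."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, owner: str, repo: str) -> Path:
        # Naive naming: assumes owner and repo contain no path separators.
        return self.root / f"{owner}_{repo}.json"

    def save(self, owner: str, repo: str, wiki: dict) -> None:
        """Persist generated wiki content as JSON."""
        self._path(owner, repo).write_text(json.dumps(wiki), encoding="utf-8")

    def read(self, owner: str, repo: str) -> Optional[dict]:
        """Return the cached wiki, or None if nothing is cached."""
        path = self._path(owner, repo)
        if not path.exists():
            return None
        return json.loads(path.read_text(encoding="utf-8"))

    def delete(self, owner: str, repo: str) -> bool:
        """Remove a cache entry; return True if one existed."""
        path = self._path(owner, repo)
        if path.exists():
            path.unlink()
            return True
        return False

    def list_projects(self) -> list[str]:
        """List cached projects by file stem, sorted for stable output."""
        return sorted(p.stem for p in self.root.glob("*.json"))
```

In a gateway, each method would typically sit behind one endpoint (for example a GET, POST, and DELETE on a cache resource), with a missing entry mapped to a not-found response.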
Vector Database
Stores and manages the vector embeddings and associated metadata of the processed documents. It provides efficient retrieval mechanisms for the RAG Core to fetch relevant information based on semantic similarity.
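Retrieval by semantic similarity is commonly implemented as a nearest-neighbor search over embeddings. The sketch below uses a brute-force cosine similarity scan over an in-memory list; `VectorRecord` and `InMemoryVectorStore` are hypothetical names, and a production vector database would normally use an approximate index rather than a full scan.

```python
import math
from dataclasses import dataclass, field


@dataclass
class VectorRecord:
    """An embedding plus the metadata stored alongside it."""
    doc_id: str
    embedding: list[float]
    metadata: dict = field(default_factory=dict)


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVectorStore:
    """Brute-force store: every query is compared against every record."""

    def __init__(self) -> None:
        self._records: list[VectorRecord] = []

    def add(self, record: VectorRecord) -> None:
        self._records.append(record)

    def search(self, query: list[float], top_k: int = 2) -> list[tuple[float, VectorRecord]]:
        """Return the top_k (score, record) pairs, most similar first."""
        scored = [(cosine_similarity(query, r.embedding), r) for r in self._records]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:top_k]
```

The metadata field is where a source path or page title would travel with the embedding, so the RAG Core can cite or display where a retrieved passage came from.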