Overview of the abstract components of the ChemicalX library and their relationships.
Components
Data Management & Preparation
This component covers the full data lifecycle. It loads raw chemical and biological datasets (drug features, context features, and interaction triples) from remote or local sources, then structures and transforms them into batches (`DrugPairBatch`) that models can consume efficiently during training and inference. It also manages the feature sets, is responsible for data integrity as data flows through these steps, and provides utility functions for data handling.
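The batching step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the library's implementation: `PairBatch`, `LabeledTriple`, and `make_batches` are hypothetical names, features are plain lists rather than tensors, and each triple is assumed to carry a numeric label.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Tuple

# Hypothetical stand-in: (drug_1, drug_2, context, label).
LabeledTriple = Tuple[str, str, str, float]


@dataclass
class PairBatch:
    """Illustrative container, similar in role to `DrugPairBatch` (not the real class)."""

    drug_features_left: List[List[float]]
    drug_features_right: List[List[float]]
    context_features: List[List[float]]
    labels: List[float]


def make_batches(
    triples: List[LabeledTriple],
    drug_features: Dict[str, List[float]],
    context_features: Dict[str, List[float]],
    batch_size: int,
) -> Iterator[PairBatch]:
    """Yield fixed-size batches by looking up the features of each triple."""
    for start in range(0, len(triples), batch_size):
        chunk = triples[start:start + batch_size]
        yield PairBatch(
            drug_features_left=[drug_features[d1] for d1, _, _, _ in chunk],
            drug_features_right=[drug_features[d2] for _, d2, _, _ in chunk],
            context_features=[context_features[c] for _, _, c, _ in chunk],
            labels=[label for _, _, _, label in chunk],
        )


# Toy data, invented for illustration only.
drug_features = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
context_features = {"ctx": [1.0]}
triples = [("A", "B", "ctx", 1.0), ("A", "C", "ctx", 0.0), ("B", "C", "ctx", 1.0)]

batches = list(make_batches(triples, drug_features, context_features, batch_size=2))
```

With three triples and a batch size of two, the generator yields one full batch and one partial batch; a real loader would typically also shuffle and convert the lists to tensors.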
Model Core & Architectures
This component defines the abstract interface (`chemicalx.models.base.Model`) shared by all deep learning models in the library. It also contains the concrete model architectures (e.g., `CASTER`, `DeepDDI`, `MHCADDI`, `GCNBMP`, `SSIDDI`), each of which encapsulates a complete neural network design for a specific task such as drug-drug interaction or synergy prediction.
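The split between an abstract interface and concrete architectures can be illustrated with Python's `abc` module. The sketch below is an assumption about the general pattern, not a copy of `chemicalx.models.base.Model`: the names `BaseModel` and `DotProductModel`, the method names `unpack` and `forward`, and the dictionary-based batch are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Sequence

Matrix = List[List[float]]


class BaseModel(ABC):
    """Hypothetical abstract interface that every concrete model must satisfy."""

    @abstractmethod
    def unpack(self, batch: Dict[str, Matrix]) -> Sequence[Matrix]:
        """Select the parts of a batch that this architecture needs."""

    @abstractmethod
    def forward(self, *inputs: Matrix) -> List[float]:
        """Compute one score per drug pair."""

    def __call__(self, batch: Dict[str, Matrix]) -> List[float]:
        # Shared entry point: unpack the batch, then run the architecture.
        return self.forward(*self.unpack(batch))


class DotProductModel(BaseModel):
    """Toy architecture: scores a pair by the dot product of its feature vectors."""

    def unpack(self, batch: Dict[str, Matrix]) -> Sequence[Matrix]:
        return batch["left"], batch["right"]

    def forward(self, left: Matrix, right: Matrix) -> List[float]:
        return [sum(a * b for a, b in zip(l, r)) for l, r in zip(left, right)]


batch = {"left": [[1.0, 2.0], [0.0, 1.0]], "right": [[3.0, 4.0], [5.0, 6.0]]}
scores = DotProductModel()(batch)
```

Because `BaseModel` declares abstract methods, instantiating it directly raises `TypeError`; only subclasses that implement both methods can be constructed, which is what lets a training loop treat all architectures uniformly.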
Training & Evaluation Pipeline
This component orchestrates the end-to-end workflow for training, validating, and evaluating deep learning models. It manages the training loop, handles device placement (CPU/GPU), and processes the results generated by the models, providing a structured and reproducible way to run experiments.
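As a rough sketch of what such a pipeline does, the example below splits a toy dataset, runs a training loop, and returns structured results. It is deliberately simplified and makes several assumptions: `run_pipeline` and `predict` are hypothetical names, the model is a pure-Python logistic regression rather than a deep network, and device placement is omitted because no tensor library is used.

```python
import math
import random
from typing import Dict, List, Tuple

Example = Tuple[List[float], float]  # (features, binary label)


def predict(weights: List[float], features: List[float]) -> float:
    """Logistic model: sigmoid of the weighted feature sum."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def run_pipeline(
    data: List[Example],
    epochs: int = 50,
    lr: float = 0.5,
    train_fraction: float = 0.75,
    seed: int = 0,
) -> Dict[str, object]:
    """Split the data, train with per-example gradient steps, then evaluate."""
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = data[:]
    rng.shuffle(shuffled)
    split = int(len(shuffled) * train_fraction)
    train, test = shuffled[:split], shuffled[split:]

    weights = [0.0] * len(data[0][0])
    losses: List[float] = []
    eps = 1e-12  # guards against log(0)
    for _ in range(epochs):
        epoch_loss = 0.0
        for features, label in train:
            p = predict(weights, features)
            epoch_loss -= label * math.log(p + eps) + (1 - label) * math.log(1 - p + eps)
            for i, x in enumerate(features):
                weights[i] -= lr * (p - label) * x  # gradient of the log loss
        losses.append(epoch_loss / len(train))

    correct = sum((predict(weights, f) >= 0.5) == (y == 1.0) for f, y in test)
    return {"losses": losses, "accuracy": correct / len(test), "weights": weights}


# Toy, linearly separable data: features are [bias, x]; label is 1.0 when x > 0.
data = [([1.0, x], 1.0 if x > 0 else 0.0) for x in (-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0)]
result = run_pipeline(data)
```

Returning a dictionary of losses, metrics, and weights mirrors the idea of processing model results into a structured record, so that separate runs can be compared.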
Utilities
This component provides general-purpose helper functions and mathematical operations used across the library. These include tensor manipulation functions (e.g., segment operations for sparse tensors) and system-level utilities such as device resolution.
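To make "segment operations" concrete, the sketch below shows a segment sum and a segment softmax over flat lists, where `segment_ids[i]` names the group that element `i` belongs to. The function names and list-based signatures are illustrative assumptions; the library's own versions operate on tensors and may differ in interface.

```python
import math
from typing import List


def segment_sum(values: List[float], segment_ids: List[int], num_segments: int) -> List[float]:
    """Sum the values that share a segment id."""
    totals = [0.0] * num_segments
    for value, segment in zip(values, segment_ids):
        totals[segment] += value
    return totals


def segment_softmax(logits: List[float], segment_ids: List[int], num_segments: int) -> List[float]:
    """Apply softmax independently within each segment."""
    # Subtract each segment's maximum for numerical stability.
    maxima = [-math.inf] * num_segments
    for value, segment in zip(logits, segment_ids):
        maxima[segment] = max(maxima[segment], value)
    exps = [math.exp(value - maxima[segment]) for value, segment in zip(logits, segment_ids)]
    denominators = segment_sum(exps, segment_ids, num_segments)
    return [e / denominators[segment] for e, segment in zip(exps, segment_ids)]


# Two segments: elements 0-1 belong to segment 0, elements 2-3 to segment 1.
sums = segment_sum([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1], num_segments=2)
probs = segment_softmax([0.0, 0.0, 1.0, 1.0], [0, 0, 1, 1], num_segments=2)
```

Operations of this kind are commonly used to pool per-atom values into per-molecule values when several molecular graphs are packed into one flat tensor.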