mloda
Declarative Data Access for AI Agents
Describe what you need - mloda delivers it.
mloda provides declarative data access for AI agents, ML pipelines, and data teams. Instead of writing complex data retrieval code, users describe WHAT they need - mloda resolves HOW to get it through its plugin system.
mloda's plugin system automatically selects the right plugins for each task, enabling efficient querying and processing of complex features. Learn more about the mloda API here. By defining feature dependencies, transformations, and metadata processes, mloda minimizes duplication and fosters reusability.
Plugins are small, template-like structures - easy to test, easy to debug, and AI-friendly. Let AI generate plugins for you, or share them across projects, teams, and organizations.
Key Benefits
AI & Agent Integration
- declarative API - agents describe WHAT, not HOW
- JSON-based feature requests for LLM tool functions
- built-in lineage for traceability
Data Processing
- automated feature engineering and dependency resolution
- data cleaning and synthetic data generation
Data Management
- rich metadata including data lineage and usage tracking
- clear role separation: providers, users, and stewards
Data Quality and Security
- data quality definitions
- unit- and integration tests
- secure queries
Scalability
- switch compute framework without changing feature logic
- same plugins work from notebook to production
Community Engagement by Design
- shareable plugin ecosystem
- fostering community
Core Components and Architecture
mloda addresses common challenges in data and feature engineering by leveraging two key components:
Plugins
-
Feature Groups: Define feature dependencies, such as creating a composite label based on features e.g. user activity, purchase history, and support interactions. Once defined, only the label needs to be requested, as dependencies are resolved automatically, simplifying processing. Learn more here.
-
Compute Frameworks: Defines the technology stack, like Spark or Pandas, along with support for different storage engines such as Parquet, Delta Lake, or PostgreSQL, to execute feature transformations and computations, ensuring efficient processing at scale. Learn more here.
-
Extenders: Automates metadata extraction processes, helping you enhance data governance, compliance, and traceability, such as analyzing how often features are used by models or analysts, or understanding where the data is coming from. Learn more here.
Core
- Core Engine: Handles dependencies between features and computations by coordinating linking, joining, filtering, and ordering operations to ensure optimized data processing. For example, in customer segmentation, the core engine would link and filter different data sources, such as demographics, purchasing history, and online behavior, to create relevant features.
Contributing to mloda
We welcome contributions from the community to help us improve and expand mloda. Whether you're interested in developing plugins or adding new features, your input is invaluable. Learn more here.
Frequently Asked Questions (FAQ)
If you have additional questions about mloda visit our FAQ section, raise an issue on our GitHub repository, or email us at info@mloda.ai.
License
This project is licensed under the Apache License, Version 2.0.