Feature Configuration from JSON

Overview

The load_features_from_config function loads feature configurations from JSON strings. It is the primary interface for AI agents and LLMs to request data from mloda: agents generate JSON, and mloda executes it.

Use cases:

  • LLM tool functions: LLMs generate JSON feature requests without writing Python code
  • Feature configurations stored externally (files, databases, APIs)
  • Dynamic feature definitions at runtime
  • Configuration-driven pipelines
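
Because the loader takes a JSON string, any code that can serialize a list can produce a configuration at runtime. The following is a minimal sketch using only the standard library. The helper name build_feature_config is hypothetical and not part of mloda; it only produces the string that would then be passed to load_features_from_config.

```python
import json


def build_feature_config(items):
    """Serialize feature requests (strings or dicts) to a JSON array string.

    Hypothetical helper, not part of mloda.
    """
    return json.dumps(items, indent=4)


# Mix of a plain feature name and a configured feature object.
config = build_feature_config([
    "simple_feature",
    {"name": "configured_feature", "options": {"param": "value"}},
])
print(config)
```

The same string could equally be read from a file, a database column, or an API response.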

Basic Usage

from mloda.user import load_features_from_config, mloda

config = '''
[
    "simple_feature",
    {"name": "configured_feature", "options": {"param": "value"}}
]
'''

features = load_features_from_config(config)
result = mloda.run_all(features, compute_frameworks=["PandasDataFrame"])

JSON Format

The configuration must be a JSON array. Each item can be:

1. Simple String

A plain feature name string:

["feature_name"]

2. Feature Object

An object with name and optional configuration:

[
    {
        "name": "feature_name",
        "options": {"key": "value"}
    }
]

3. Mixed Configuration

Combine strings and objects:

[
    "simple_feature",
    {"name": "configured_feature", "in_features": ["source_feature"]}
]
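
When the JSON comes from an LLM or another untrusted source, it can help to check its shape before handing it to the loader. The sketch below encodes only the rules stated above (top-level array, items that are strings or objects with a name). It is a standalone pre-check with a hypothetical name, not mloda's own validation, which may accept or reject more than this.

```python
import json


def check_config_shape(config_str):
    """Return a list of problems found in a feature config string."""
    try:
        data = json.loads(config_str)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    if not isinstance(data, list):
        return ["top-level value must be a JSON array"]

    problems = []
    for i, item in enumerate(data):
        if isinstance(item, str):
            continue  # simple string form
        if isinstance(item, dict):
            # feature object form: "name" is the only required field
            if not isinstance(item.get("name"), str):
                problems.append(f"item {i}: object needs a string 'name'")
            continue
        problems.append(
            f"item {i}: expected string or object, got {type(item).__name__}"
        )
    return problems


print(check_config_shape('["simple_feature", {"name": "configured_feature"}]'))
```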

FeatureConfig Fields

Field            Type     Required  Description
name             string   Yes       Feature name
options          object   No        Simple options dict (cannot be used with group_options/context_options)
in_features      array    No        Source feature names for chained features
group_options    object   No        Group parameters (affect Feature Group resolution)
context_options  object   No        Context parameters (metadata, doesn't affect resolution)
column_index     integer  No        Index for multi-output features (adds ~N suffix)
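
For code that builds these objects in Python before serializing them, the fields can be mirrored as a typed dictionary. This is an illustrative sketch: FeatureConfigEntry is a hypothetical name chosen here, not a class exported by mloda, and the value types inside the option dicts are assumed to be arbitrary JSON values.

```python
from typing import Any, Dict, List, TypedDict


class _RequiredFields(TypedDict):
    name: str  # the only required field


class FeatureConfigEntry(_RequiredFields, total=False):
    """Hypothetical mirror of the documented fields; all keys below are optional."""

    options: Dict[str, Any]
    in_features: List[str]
    group_options: Dict[str, Any]
    context_options: Dict[str, Any]
    column_index: int


# A type checker accepts this and would flag a missing "name".
entry: FeatureConfigEntry = {"name": "pca_result", "column_index": 0}
```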

Configuration Approaches

Simple Options

Use options for simple key-value configuration:

[
    {
        "name": "my_feature",
        "options": {
            "window_size": 7,
            "aggregation": "sum"
        }
    }
]

Modern Group/Context Options

Use group_options and context_options to separate group parameters from context parameters explicitly:

[
    {
        "name": "my_feature",
        "group_options": {
            "data_source": "production"
        },
        "context_options": {
            "aggregation_type": "sum"
        }
    }
]

Note: options and group_options/context_options are mutually exclusive.
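
The exclusivity rule can be checked on a parsed feature object before loading. A minimal sketch follows; the function name is hypothetical, and how mloda itself reports the conflict is not covered here.

```python
def has_conflicting_options(item):
    """True if a feature object mixes 'options' with group/context options."""
    if not isinstance(item, dict):
        return False  # plain string items carry no options at all
    uses_simple = "options" in item
    uses_split = "group_options" in item or "context_options" in item
    return uses_simple and uses_split


bad = {
    "name": "my_feature",
    "options": {"window_size": 7},
    "context_options": {"aggregation_type": "sum"},
}
print(has_conflicting_options(bad))
```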

Feature Chaining with in_features

Define dependent features using in_features:

[
    {
        "name": "aggregated_sales",
        "in_features": ["raw_sales"],
        "context_options": {
            "aggregation_type": "sum"
        }
    }
]

Multiple source features:

[
    {
        "name": "distance_feature",
        "in_features": ["point_a", "point_b"]
    }
]
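
To see which source features a configuration depends on, the in_features lists can be collected from the parsed JSON. This is an inspection sketch using only the standard library; referenced_sources is a hypothetical helper, not an mloda function.

```python
import json


def referenced_sources(config_str):
    """Return the distinct in_features names, in order of first appearance."""
    sources = []
    for item in json.loads(config_str):
        if isinstance(item, dict):
            for src in item.get("in_features", []):
                if src not in sources:
                    sources.append(src)
    return sources


config = '''
[
    {"name": "aggregated_sales", "in_features": ["raw_sales"]},
    {"name": "distance_feature", "in_features": ["point_a", "point_b"]}
]
'''
print(referenced_sources(config))
```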

Multi-Column Features

Access specific columns from multi-output features using column_index:

[
    {
        "name": "pca_result",
        "column_index": 0
    }
]

This produces a feature named pca_result~0.
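
The naming rule can be reproduced in a few lines, for example to predict which column name to look for in the result. The sketch only mirrors the ~N suffix described above; the helper name is hypothetical, and mloda's internal implementation is not shown in this document.

```python
def indexed_feature_name(name, column_index=None):
    """Append the ~N suffix when a column index is given."""
    if column_index is None:
        return name
    return f"{name}~{column_index}"


print(indexed_feature_name("pca_result", 0))
```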

Complete Example

from mloda.user import load_features_from_config, mloda

config = '''
[
    "customer_id",
    {
        "name": "sales_aggregated",
        "in_features": ["daily_sales"],
        "context_options": {
            "aggregation_type": "sum",
            "window_days": 7
        }
    },
    {
        "name": "encoded_category",
        "in_features": ["category"],
        "column_index": 0
    }
]
'''

features = load_features_from_config(config)

result = mloda.run_all(
    features,
    compute_frameworks=["PandasDataFrame"],
    api_data={"customer_data": {"customer_id": [1, 2, 3]}}
)