PROPERTY_MAPPING Configuration
Overview
PROPERTY_MAPPING defines parameter validation and classification for modern feature groups using the unified parser approach.
Basic Structure
from mloda.provider import DefaultOptionKeys
PROPERTY_MAPPING = {
"parameter_name": {
"value1": "Description of value1",
"value2": "Description of value2",
DefaultOptionKeys.context: True, # Parameter classification
DefaultOptionKeys.strict_validation: True, # Validation mode
},
DefaultOptionKeys.in_features: {
"explanation": "Source feature description",
DefaultOptionKeys.context: True,
DefaultOptionKeys.strict_validation: False, # Flexible validation
},
}
Parameter Classification
# Context parameter (doesn't affect Feature Group splitting)
"aggregation_type": {
"sum": "Sum aggregation",
DefaultOptionKeys.context: True,
}
# Group parameter (affects Feature Group splitting)
"data_source": {
"production": "Production data",
DefaultOptionKeys.group: True,
}
# Order-by parameter (defines sort order for sequential operations)
DefaultOptionKeys.order_by: {
"explanation": "Column(s) controlling row order for rank, offset, or frame_aggregate",
DefaultOptionKeys.context: True,
DefaultOptionKeys.strict_validation: False,
}
Validation Modes
Strict Validation (Default: False)
"algorithm_type": {
"kmeans": "K-means clustering",
"dbscan": "DBSCAN clustering",
DefaultOptionKeys.strict_validation: True, # Only listed values allowed
}
Custom Validation Functions
Use validation_function with strict_validation=True to validate individual
parsed elements. The parser unpacks lists and calls the function on each element:
"window_size": {
"explanation": "Size of time window",
DefaultOptionKeys.validation_function: lambda x: isinstance(x, int) and x > 0,
DefaultOptionKeys.strict_validation: True,
}
Type Validators
Use type_validator to validate the shape or composite type of the raw option
value before any list unpacking. Unlike validation_function, it does not
require strict_validation. The validator receives the value exactly as stored
in Options and must return a truthy value for the match to succeed:
def _is_list_of_strings(value):
return isinstance(value, list) and all(isinstance(item, str) for item in value)
"partition_by": {
"explanation": "List of columns to partition by",
DefaultOptionKeys.context: True,
DefaultOptionKeys.strict_validation: False,
DefaultOptionKeys.type_validator: _is_list_of_strings,
}
When both validation_function and type_validator are present on the same
entry, validation_function runs first (during property mapping validation on
each parsed element), then type_validator runs on the raw value. Validators
must be pure functions with no side effects.
Default Values
"method": {
"linear": "Linear interpolation",
"cubic": "Cubic interpolation",
DefaultOptionKeys.default: "linear", # Default if not specified
}
Usage in Feature Groups
class MyFeatureGroup(FeatureGroup):
PROPERTY_MAPPING = {
"operation_type": {
"sum": "Sum operation",
"avg": "Average operation",
DefaultOptionKeys.context: True,
DefaultOptionKeys.strict_validation: True,
},
DefaultOptionKeys.in_features: {
"explanation": "Source feature",
DefaultOptionKeys.context: True,
},
}
@classmethod
def match_feature_group_criteria(cls, feature_name, options, data_access_collection=None):
return FeatureChainParser.match_configuration_feature_chain_parser(
feature_name, options, property_mapping=cls.PROPERTY_MAPPING
)
Validation Examples
# Valid - "sum" is in mapping
Options(context={"operation_type": "sum"})
# Invalid with strict validation - "custom" not in mapping
Options(context={"operation_type": "custom"}) # Raises ValueError
# Valid with flexible validation - any value allowed
Options(context={"in_features": "any_feature_name"})
Conditional Requirements with required_when
Use DefaultOptionKeys.required_when to declare options that are only required under certain conditions. Attach a predicate callable to a PROPERTY_MAPPING entry. The predicate receives an effective Options object and returns True when the option is required.
This works for both configuration-based and string-based feature creation. For string-based features, the operation value parsed from the feature name is merged into the effective options before predicate evaluation, so predicates see values from both sources.
from mloda.core.abstract_plugins.components.options import Options
from mloda.provider import DefaultOptionKeys
_ORDER_DEPENDENT = {"first", "last"}
def _needs_order_by(options: Options) -> bool:
return options.get("aggregation_type") in _ORDER_DEPENDENT
PROPERTY_MAPPING = {
"aggregation_type": {
"sum": "Sum", "avg": "Average", "first": "First", "last": "Last",
DefaultOptionKeys.context: True,
DefaultOptionKeys.strict_validation: True,
},
"order_by": {
"explanation": "Column to order by within each partition",
DefaultOptionKeys.context: True,
DefaultOptionKeys.strict_validation: False,
DefaultOptionKeys.required_when: _needs_order_by,
},
}
When the predicate returns True and the option is absent, match_feature_group_criteria returns False. When the predicate returns False, the option is not required. Entries with required_when are treated as optional by the base parser.
Predicate Contract
The predicate must satisfy:
- Signature:
(Options) -> bool - Must be callable. Non-callable values are skipped with a warning log.
- Must not raise exceptions. Exceptions from predicates propagate uncaught.
- Must be a pure function (no side effects).
- Non-bool truthy return values are treated as
True.
Context Propagation
By default, context parameters are local: they do not flow through feature dependency chains. This is correct for feature-specific config like aggregation types.
For cross-cutting metadata (session IDs, environment flags) that should flow through chains, use propagate_context_keys on Options:
Options(
context={"session_id": "abc", "window_function": "sum"},
propagate_context_keys=frozenset({"session_id"}), # only session_id flows to dependents
)
Only the specified keys propagate. Everything else stays local. Group propagation is unchanged.