Feature Group Matching Criteria
Overview
The mloda framework uses a matching system to determine which feature group should handle a given feature. The modern approach supports both traditional string-based matching and configuration-based matching through the unified FeatureChainParser.
Matching Process
When a feature is requested, the system checks all available feature groups to find the one that should handle it. Each feature group answers through its match_feature_group_criteria method, which now typically delegates to the unified parser.
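As a rough illustration of this lookup, the sketch below loops over candidate classes and keeps those whose match_feature_group_criteria returns True. The classes BaseGroup, SumAggregation and RawSales and the helper find_matching_groups are hypothetical stand-ins written for this example, not part of mloda; the framework performs this resolution internally.

```python
from typing import Any, Dict, List, Optional, Type


class BaseGroup:
    """Hypothetical stand-in for a feature group base class."""

    @classmethod
    def match_feature_group_criteria(
        cls,
        feature_name: str,
        options: Dict[str, Any],
        data_access_collection: Optional[Any] = None,
    ) -> bool:
        return False


class SumAggregation(BaseGroup):
    @classmethod
    def match_feature_group_criteria(cls, feature_name, options, data_access_collection=None):
        # Naming convention assumed for this sketch only.
        return feature_name.startswith("sum_aggr__")


class RawSales(BaseGroup):
    @classmethod
    def match_feature_group_criteria(cls, feature_name, options, data_access_collection=None):
        return feature_name == "sales"


def find_matching_groups(
    feature_name: str,
    options: Dict[str, Any],
    groups: List[Type[BaseGroup]],
) -> List[Type[BaseGroup]]:
    """Ask every candidate group whether it handles the feature."""
    return [g for g in groups if g.match_feature_group_criteria(feature_name, options)]


candidates = [SumAggregation, RawSales]
matches = find_matching_groups("sum_aggr__sales", {}, candidates)
print([g.__name__ for g in matches])
```

An empty result means no group claims the feature, and more than one result means the criteria are ambiguous; both cases are worth checking when writing a new feature group.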
Modern Unified Matching
The recommended approach uses FeatureChainParser.match_configuration_feature_chain_parser, which provides:
1. Dual Approach Support
from mloda_plugins.feature_group.experimental.default_options_key import DefaultOptionKeys
from mloda_core.abstract_plugins.components.options import Options

@classmethod
def match_feature_group_criteria(cls, feature_name, options, data_access_collection=None):
    return FeatureChainParser.match_configuration_feature_chain_parser(
        feature_name,
        options,
        property_mapping=cls.PROPERTY_MAPPING,  # Configuration-based matching
        pattern=cls.PATTERN,  # String-based matching
        prefix_patterns=cls.PREFIX_PATTERN,
    )
2. PROPERTY_MAPPING Configuration
The PROPERTY_MAPPING defines how configuration-based features are validated:
PROPERTY_MAPPING = {
    "aggregation_type": {
        "sum": "Sum aggregation",
        "avg": "Average aggregation",
        "max": "Maximum aggregation",
        DefaultOptionKeys.mloda_context: True,
        DefaultOptionKeys.mloda_strict_validation: True,
    },
    DefaultOptionKeys.mloda_source_feature: {
        "explanation": "Source feature for aggregation",
        DefaultOptionKeys.mloda_context: True,
        DefaultOptionKeys.mloda_strict_validation: False,
    },
}
3. Validation Modes
Strict Validation
When mloda_strict_validation: True, parameter values must be in the mapping:
# This will match
options = Options(context={"aggregation_type": "sum"}) # "sum" is in mapping
# This will fail validation
options = Options(context={"aggregation_type": "custom"}) # "custom" not in mapping
Flexible Validation
When mloda_strict_validation: False (default), any value is accepted:
# Both will match
options = Options(context={"mloda_source_feature": "sales"}) # Any value OK
options = Options(context={"mloda_source_feature": "custom_feature"}) # Any value OK
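The two modes can be sketched without the framework. The function is_value_accepted below is a hypothetical helper, and the plain strings STRICT and CONTEXT are assumed stand-ins for the DefaultOptionKeys members; the real check lives inside FeatureChainParser and may differ in detail.

```python
from typing import Any, Dict

# Plain-string stand-ins for DefaultOptionKeys members (assumed for this sketch).
STRICT = "mloda_strict_validation"
CONTEXT = "mloda_context"
# Keys that describe the parameter rather than list an allowed value.
RESERVED = {STRICT, CONTEXT, "explanation"}

PROPERTY_MAPPING: Dict[str, Dict[str, Any]] = {
    "aggregation_type": {
        "sum": "Sum aggregation",
        "avg": "Average aggregation",
        "max": "Maximum aggregation",
        CONTEXT: True,
        STRICT: True,
    },
    "mloda_source_feature": {
        "explanation": "Source feature for aggregation",
        CONTEXT: True,
        STRICT: False,
    },
}


def is_value_accepted(key: str, value: Any) -> bool:
    """Strict parameters must use a listed value; flexible ones accept anything."""
    spec = PROPERTY_MAPPING[key]
    if not spec.get(STRICT, False):
        return True
    allowed = [k for k in spec if k not in RESERVED]
    return value in allowed


print(is_value_accepted("aggregation_type", "sum"))
print(is_value_accepted("aggregation_type", "custom"))
print(is_value_accepted("mloda_source_feature", "custom_feature"))
```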
Custom Validation Functions
For complex validation beyond simple value lists:
PROPERTY_MAPPING = {
    "window_size": {
        "explanation": "Size of the time window",
        DefaultOptionKeys.mloda_context: True,
        DefaultOptionKeys.mloda_strict_validation: True,
        DefaultOptionKeys.mloda_validation_function: lambda x: isinstance(x, int) and x > 0,
    },
}
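One detail worth knowing about the lambda above: in Python, bool is a subclass of int, so True would pass the check. A named validator makes such edge cases easier to handle and to test. The function validate_window_size below is a hypothetical example, not something mloda ships.

```python
from typing import Any


def validate_window_size(value: Any) -> bool:
    """Accept positive integers only; reject booleans, floats and strings."""
    if isinstance(value, bool):
        # bool is a subclass of int, so exclude it explicitly.
        return False
    return isinstance(value, int) and value > 0


for candidate in (7, 0, -3, True, "7", 7.0):
    print(repr(candidate), validate_window_size(candidate))
```

A function like this could be passed in place of the lambda, since both are callables that take one value and return a boolean.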
Legacy Default Matching Criteria
For feature groups not yet modernized, the default matching criteria still apply:
- Root Feature with Matching Input Data: The feature group is a root feature (has no dependencies) and its input data matches the feature.
- Class Name Match: The feature name exactly matches the feature group's class name.
    feature_name == FeatureGroup.get_class_name()
- Prefix Match: The feature name starts with the feature group's class name as a prefix.
    feature_name.startswith(FeatureGroup.prefix())  # Default prefix is "ClassName_"
- Explicitly Supported: The feature name is in the set of explicitly supported feature names.
    feature_name in FeatureGroup.feature_names_supported()
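The name-based criteria above can be combined as in the following sketch. LegacyGroup and ClickStream are hypothetical classes that only mirror the method names mentioned in the list; the root-feature check is left out because it depends on input data, and the sketch assumes that satisfying any single criterion is enough.

```python
from typing import Set


class LegacyGroup:
    """Hypothetical stand-in mirroring the default name-based criteria."""

    @classmethod
    def get_class_name(cls) -> str:
        return cls.__name__

    @classmethod
    def prefix(cls) -> str:
        # Default prefix is "ClassName_".
        return f"{cls.get_class_name()}_"

    @classmethod
    def feature_names_supported(cls) -> Set[str]:
        return set()

    @classmethod
    def matches_by_name(cls, feature_name: str) -> bool:
        return (
            feature_name == cls.get_class_name()
            or feature_name.startswith(cls.prefix())
            or feature_name in cls.feature_names_supported()
        )


class ClickStream(LegacyGroup):
    @classmethod
    def feature_names_supported(cls) -> Set[str]:
        return {"clicks_per_session"}


for name in ("ClickStream", "ClickStream_daily", "clicks_per_session", "orders"):
    print(name, ClickStream.matches_by_name(name))
```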
Matching Examples
Modern Feature Group (Aggregation)
# String-based matching
feature = Feature("sum_aggr__sales")  # Matches via pattern

# Configuration-based matching
feature = Feature(
    "placeholder",
    Options(context={
        "aggregation_type": "sum",
        "mloda_source_feature": "sales",
    }),
)  # Matches via PROPERTY_MAPPING validation
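To see why both forms describe the same feature, the sketch below extracts the parameters from the string form with a regular expression and compares them to the configuration form. AGGREGATION_PATTERN and parse_feature_name are hypothetical and written for this example; an actual feature group defines its own PATTERN, which may look different.

```python
import re
from typing import Dict, Optional

# Hypothetical pattern for this sketch; a real feature group defines its own PATTERN.
AGGREGATION_PATTERN = re.compile(r"^(sum|avg|max)_aggr__(.+)$")


def parse_feature_name(feature_name: str) -> Optional[Dict[str, str]]:
    """Return the parameters encoded in a feature name, or None if it does not match."""
    match = AGGREGATION_PATTERN.match(feature_name)
    if match is None:
        return None
    return {
        "aggregation_type": match.group(1),
        "mloda_source_feature": match.group(2),
    }


from_name = parse_feature_name("sum_aggr__sales")
from_config = {"aggregation_type": "sum", "mloda_source_feature": "sales"}
print(from_name == from_config)
```

A name such as "placeholder" carries no parameters and returns None here, which is why the configuration-based form has to supply them through Options.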
Parameter Classification Impact
Whether a parameter is classified as group or context determines whether two features resolve to the same feature group instance:
# These create different Feature Group instances (different group parameters)
feature1 = Feature("placeholder", Options(
    group={"data_source": "production"},
    context={"aggregation_type": "sum", "mloda_source_feature": "sales"},
))
feature2 = Feature("placeholder", Options(
    group={"data_source": "staging"},  # Different group parameter
    context={"aggregation_type": "sum", "mloda_source_feature": "sales"},
))

# These create the same Feature Group instance (same group, different context)
feature3 = Feature("placeholder", Options(
    group={"data_source": "production"},
    context={"aggregation_type": "sum", "mloda_source_feature": "sales"},
))
feature4 = Feature("placeholder", Options(
    group={"data_source": "production"},  # Same group parameter
    context={"aggregation_type": "avg", "mloda_source_feature": "revenue"},  # Different context
))
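One way to picture this behavior is a grouping key built from the group parameters alone. The helper instance_key below is a hypothetical illustration of that idea, assuming hashable parameter values; it does not reproduce how mloda compares Options internally.

```python
from typing import Any, Dict, Hashable, Optional


def instance_key(
    group: Optional[Dict[str, Any]] = None,
    context: Optional[Dict[str, Any]] = None,
) -> Hashable:
    """Build a key from group parameters only; context is deliberately ignored."""
    return frozenset((group or {}).items())


production_sum = instance_key(
    group={"data_source": "production"},
    context={"aggregation_type": "sum", "mloda_source_feature": "sales"},
)
staging_sum = instance_key(
    group={"data_source": "staging"},
    context={"aggregation_type": "sum", "mloda_source_feature": "sales"},
)
production_avg = instance_key(
    group={"data_source": "production"},
    context={"aggregation_type": "avg", "mloda_source_feature": "revenue"},
)

print(production_sum == staging_sum)
print(production_sum == production_avg)
```

Under this model, moving a parameter from context to group splits features into separate instances, so the classification deserves care during migration.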
Migration Path
When modernizing a feature group:
- Add PROPERTY_MAPPING with parameter definitions
- Update match_feature_group_criteria to use unified parser
- Classify parameters as group vs context appropriately
- Test that both string-based and configuration-based matching work correctly
- Update documentation and examples