Luminaire Streaming Anomaly Detection Models: Window Density Model
- class luminaire.model.window_density.WindowDensityHyperParams(freq=None, max_missing_train_prop=0.1, is_log_transformed=False, baseline_type='aggregated', detection_method=None, min_window_length=None, max_window_length=None, window_length=None, detrend_method='modeling')
Hyperparameter class for the Luminaire window density model.
- Parameters:
freq (str) – The frequency of the time series. Luminaire supports default configurations for ‘S’, ‘T’, ‘15T’, ‘H’, and ‘D’. Any other frequency should be specified as ‘custom’, with the configuration set manually.
max_missing_train_prop (float) – Maximum proportion of missing observations allowed in the training data.
is_log_transformed (bool) – A flag that specifies whether to take a log transform of the input data. If the data contain negative values, is_log_transformed is ignored even if it is set to True.
baseline_type (str) –
A string flag that specifies whether to use the previous sub-window from the training data as the scoring baseline or to aggregate the overall window into a baseline. Possible values:
“last_window”
“aggregated”
detection_method (str) –
A string that selects between two window testing methods. Possible values:
“kldiv” (KL divergence). Recommended for high-frequency time series such as ‘S’ and ‘T’.
“sign_test” (Wilcoxon signed-rank test). Recommended for low-frequency time series such as ‘H’ and ‘D’.
min_window_length (int) –
Minimum size of the scoring window / a stable training sub-window length.
Note
This is not the minimum size of the whole training window, which is the combination of stable sub-windows.
max_window_length (int) –
Maximum size of the scoring window / a stable training sub-window length.
Note
This is not the maximum size of the whole training window, which is the combination of stable sub-windows.
window_length (int) –
Size of the scoring window / a stable training sub-window length.
Note
This is not the size of the whole training window, which is the combination of stable sub-windows.
detrend_method (str) –
A string that selects between two stationarizing methods. Possible values:
“ma” (moving average based)
“diff” (differencing based)
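To make the distinction between a sub-window and the whole training window concrete, the following is a minimal sketch in plain Python. It is not Luminaire's implementation: the helper names `split_into_sub_windows` and `pick_baseline` are hypothetical, and the position-wise mean used for the “aggregated” baseline is only an assumption about what aggregation could look like.

```python
from statistics import fmean


def split_into_sub_windows(series, window_length):
    """Split a series into consecutive, non-overlapping sub-windows.

    Trailing observations that do not fill a whole sub-window are dropped.
    (Hypothetical helper, not part of Luminaire.)
    """
    if window_length <= 0:
        raise ValueError("window_length must be positive")
    n_full = len(series) // window_length
    return [series[i * window_length:(i + 1) * window_length]
            for i in range(n_full)]


def pick_baseline(sub_windows, baseline_type="aggregated"):
    """Choose a baseline window from the training sub-windows.

    (Hypothetical helper, not part of Luminaire.)
    """
    if not sub_windows:
        raise ValueError("at least one sub-window is required")
    if baseline_type == "last_window":
        # Use the most recent training sub-window as the baseline.
        return list(sub_windows[-1])
    if baseline_type == "aggregated":
        # Assumption for illustration: aggregate by averaging each position
        # across all sub-windows.
        return [fmean(values) for values in zip(*sub_windows)]
    raise ValueError(f"unknown baseline_type: {baseline_type!r}")


# The whole training window holds 10 observations; window_length is 3.
training = [10, 12, 11, 20, 22, 21, 30, 32, 31, 99]
sub_windows = split_into_sub_windows(training, window_length=3)
last = pick_baseline(sub_windows, "last_window")
aggregated = pick_baseline(sub_windows, "aggregated")
```

Here the training window yields three sub-windows of length 3, which is why window_length describes a sub-window and not the full training span.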
- class luminaire.model.window_density.WindowDensityModel(hyper_params: {'freq': None, 'max_missing_train_prop': 0.1, 'is_log_transformed': False, 'baseline_type': 'aggregated', 'detection_method': None, 'min_window_length': None, 'max_window_length': None, 'window_length': None, 'detrend_method': 'modeling'}, **kwargs)
This model detects anomalous windows using KL divergence (for high-frequency data) and the Wilcoxon signed-rank test (for low-frequency data). The default monitoring frequency is set to the pandas time frequency type ‘T’.
- Parameters:
hyper_params (dict) – Hyperparameters for the Luminaire window density model. See luminaire.model.window_density.WindowDensityHyperParams for detailed information.
- Returns:
Anomaly probability for the execution window and other related model outputs
- Return type:
list[dict]
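As a rough illustration of comparing a scoring window against a baseline window with KL divergence, the sketch below fits a normal distribution to each window and uses the closed-form KL divergence between two normals. This is a simplification chosen for the example and is not claimed to be how Luminaire estimates the divergence; the function name `gaussian_kl` is hypothetical.

```python
import math
from statistics import fmean, pvariance


def gaussian_kl(window, baseline):
    """KL(P || Q), where P and Q are normal fits to window and baseline.

    Closed form: 0.5 * (ln(var_q / var_p) + (var_p + (mu_p - mu_q)^2) / var_q - 1)
    (Hypothetical helper, not part of Luminaire.)
    """
    mu_p, mu_q = fmean(window), fmean(baseline)
    var_p, var_q = pvariance(window), pvariance(baseline)
    if var_p <= 0 or var_q <= 0:
        raise ValueError("both windows need non-zero variance")
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mu_p - mu_q) ** 2) / var_q
                  - 1.0)


baseline = [100, 102, 98, 101, 99]
shifted = [value + 50 for value in baseline]  # same spread, shifted level

kl_same = gaussian_kl(baseline, baseline)
kl_shifted = gaussian_kl(shifted, baseline)
print(kl_same, kl_shifted)
```

A window that matches the baseline gives a divergence of zero, while a level shift produces a large positive value. A detector built on this idea would still need a rule for deciding how large a divergence counts as anomalous.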
- score(data, **kwargs)
Scores the input series for anomalies.
- Parameters:
data (pandas.DataFrame) – Input time series to score
- Returns:
Output dictionary with scoring summary.
- Return type:
dict
>>> data
                        raw interpolated
index
2018-10-11 00:00:00  204800       204800
2018-10-11 01:00:00  222218       222218
2018-10-11 02:00:00  218903       218903
2018-10-11 03:00:00  190639       190639
2018-10-11 04:00:00  148214       148214
2018-10-11 05:00:00  106358       106358
2018-10-11 06:00:00   70081        70081
2018-10-11 07:00:00   47748        47748
2018-10-11 08:00:00   36837        36837
2018-10-11 09:00:00   33023        33023
2018-10-11 10:00:00   44432        44432
2018-10-11 11:00:00   72773        72773
2018-10-11 12:00:00  115180       115180
2018-10-11 13:00:00  157568       157568
2018-10-11 14:00:00  180174       180174
2018-10-11 15:00:00  190048       190048
2018-10-11 16:00:00  188391       188391
2018-10-11 17:00:00  189233       189233
2018-10-11 18:00:00  191703       191703
2018-10-11 19:00:00  189848       189848
2018-10-11 20:00:00  192685       192685
2018-10-11 21:00:00  196743       196743
2018-10-11 22:00:00  193016       193016
2018-10-11 23:00:00  196441       196441
>>> model
<luminaire.model.window_density.WindowDensityModel object at 0x7fcaab72fdd8>
>>> model.score(data)
{'Success': True, 'ConfLevel': 99.9, 'IsAnomaly': False, 'AnomalyProbability': 0.6963188902776808}
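The scoring summary can be turned into a short human-readable message. The helper below is a hypothetical sketch, not part of Luminaire: it relies only on the four keys shown in the example output above and assumes that ConfLevel is expressed as a percentage. It reports the library's own IsAnomaly flag and does not re-derive the decision.

```python
def describe_score(result):
    """Summarize a scoring dictionary shaped like the example output.

    (Hypothetical helper, not part of Luminaire.)
    """
    if not result.get("Success"):
        return "scoring failed"
    probability = result["AnomalyProbability"]
    # Assumption: ConfLevel is a percentage, so 99.9 corresponds to 0.999.
    threshold = result["ConfLevel"] / 100.0
    verdict = "anomalous" if result["IsAnomaly"] else "normal"
    return f"{verdict}: probability {probability:.3f} vs. threshold {threshold:.3f}"


# The dictionary is copied from the example output above.
example = {
    "Success": True,
    "ConfLevel": 99.9,
    "IsAnomaly": False,
    "AnomalyProbability": 0.6963188902776808,
}
message = describe_score(example)
print(message)
```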
- train(data, **kwargs)
Trains the model on the input time series.
- Parameters:
data (pandas.DataFrame) – Input time series.
- Returns:
A success flag, the training timestamp, and the trained model
- Return type:
tuple(bool, str, python model object)
>>> from luminaire.model.window_density import WindowDensityHyperParams, WindowDensityModel
>>> data
                        raw interpolated
index
2017-10-02 00:00:00  118870       118870
2017-10-02 01:00:00  121914       121914
2017-10-02 02:00:00  116097       116097
2017-10-02 03:00:00   94511        94511
2017-10-02 04:00:00   68330        68330
...                     ...          ...
2018-10-10 19:00:00  219908       219908
2018-10-10 20:00:00  219149       219149
2018-10-10 21:00:00  207232       207232
2018-10-10 22:00:00  198741       198741
2018-10-10 23:00:00  213751       213751
>>> hyper_params = WindowDensityHyperParams(freq='H').params
>>> wdm_obj = WindowDensityModel(hyper_params=hyper_params)
>>> success, training_end, model = wdm_obj.train(data)
>>> success, training_end, model
(True, "2018-10-10 23:00:00", <luminaire.model.window_density.WindowDensityModel object at 0x7fd7c5a34e80>)