[ML] Add support for per-partition categorization jobs #74592

Closed
wants to merge 36 commits into from
Changes from 16 commits
5cb13f5
[ML] Add option to add per-partition categorization for cat jobs
qn895 Aug 3, 2020
6a3e91f
[ML] Add per partition endpoints for categorizer_stats and stopped_pa…
qn895 Aug 6, 2020
283db47
[ML] Add callout on AE page if there's a stopped partition
qn895 Aug 6, 2020
9b513cf
[ML] Add schema to get stopped_partitions for multiple jobs
qn895 Aug 6, 2020
10746ca
[ML] Add support for per-partition advanced categorization wizard
qn895 Aug 6, 2020
ae61c2a
[ML] Remove unused getStoppedPartitionsSchema
qn895 Aug 6, 2020
ad453e3
[ML] Fix so if cat detector changes type it will accommodate partitio…
qn895 Aug 6, 2020
cbb857c
[ML] Update query size for stopped_partition to 0 since it's unused
qn895 Aug 6, 2020
5aaf226
[ML] Fix random anomalyExplorerLabel
qn895 Aug 6, 2020
32a5ded
Merge remote-tracking branch 'upstream/master' into categorization-pe…
qn895 Aug 10, 2020
1591562
[ML] Move logic to results service from anomaly_detectors
qn895 Aug 10, 2020
a35ec15
[ML] Fix untranslated strings
qn895 Aug 10, 2020
abe77e7
[ML] Refactor core logic to Results model
qn895 Aug 10, 2020
f0a3cf4
[ML] Update callAsInternalUser and text descriptions
qn895 Aug 10, 2020
5ddeaab
[ML] Update callAsInternalUser and text descriptions, refactor to sup…
qn895 Aug 10, 2020
98bb431
[ML] Update callAsInternalUser and text descriptions, refactor to sup…
qn895 Aug 10, 2020
9ca076b
[ML] Update namings and expect for partition field
qn895 Aug 11, 2020
ddb2fd8
[ML] Change job_id and partition_field to constants, refactor shape
qn895 Aug 11, 2020
bb698b6
[ML] Update tests to reflect API changes
qn895 Aug 11, 2020
1f3c5ec
Merge remote-tracking branch 'upstream/master' into categorization-pe…
qn895 Aug 11, 2020
e796419
[ML] init tabs
darnautov Aug 11, 2020
02e17b3
[ML] init inference API service in UI
darnautov Aug 12, 2020
6620328
[ML] server-side routes
darnautov Aug 12, 2020
646ef99
[ML] basic table
darnautov Aug 12, 2020
a94ecaf
[ML] Update naming and refactor context
qn895 Aug 12, 2020
621d3c5
[ML] Change i18n to FormattedMessage
qn895 Aug 12, 2020
0a813e1
[ML] Remove this._partitionField
qn895 Aug 12, 2020
3686b73
[ML] Fix type errors
qn895 Aug 12, 2020
c5a56ca
[ML] support deletion
darnautov Aug 13, 2020
3801a38
[ML] delete multiple models
darnautov Aug 13, 2020
dd2f28c
[ML] WIP expanded row
darnautov Aug 13, 2020
96e6919
[ML] fix types
darnautov Aug 13, 2020
976b46a
[ML] expanded row
darnautov Aug 13, 2020
043f4f2
[ML] fix types
darnautov Aug 13, 2020
b48cda6
[ML] Change advanced wizard to not have the dropdown per partition field
qn895 Aug 13, 2020
5298fb3
Merge remote-tracking branch 'upstream/master' into categorization-pe…
qn895 Aug 13, 2020
17 changes: 17 additions & 0 deletions x-pack/plugins/ml/common/types/anomalies.ts
@@ -57,3 +57,20 @@ export interface AnomaliesTableRecord {
}

export type PartitionFieldsType = typeof PARTITION_FIELDS[number];

export interface AnomalyCategorizerStatsDoc {
[key: string]: any;
job_id: string;
result_type: 'categorizer_stats';
partition_field_name?: string;
partition_field_value?: string;
categorized_doc_count: number;
total_category_count: number;
frequent_category_count: number;
rare_category_count: number;
dead_category_count: number;
failed_category_count: number;
categorization_status: 'ok' | 'warn';
log_time: number;
timestamp: number;
}
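
For illustration, a minimal sketch of how documents of this shape might be consumed. `groupWarnPartitionsByJob` is a hypothetical helper, not code from this PR, and the trimmed `CategorizerStatsDoc` interface only mirrors the fields the sketch needs. It assumes a partition counts as affected as soon as any of its stats documents reports `categorization_status: 'warn'`.

```typescript
// Trimmed local copy of the fields this sketch needs from AnomalyCategorizerStatsDoc.
interface CategorizerStatsDoc {
  job_id: string;
  result_type: 'categorizer_stats';
  partition_field_name?: string;
  partition_field_value?: string;
  categorization_status: 'ok' | 'warn';
}

// Hypothetical helper (assumption, not part of the PR): collect, per job,
// the distinct partition values that have at least one 'warn' stats document.
function groupWarnPartitionsByJob(docs: CategorizerStatsDoc[]): Record<string, string[]> {
  const result: Record<string, string[]> = {};
  for (const doc of docs) {
    // Skip healthy partitions and job-level docs that carry no partition value.
    if (doc.categorization_status !== 'warn' || doc.partition_field_value === undefined) {
      continue;
    }
    let partitions = result[doc.job_id];
    if (partitions === undefined) {
      partitions = [];
      result[doc.job_id] = partitions;
    }
    // Keep each partition value only once per job.
    if (partitions.indexOf(doc.partition_field_value) === -1) {
      partitions.push(doc.partition_field_value);
    }
  }
  return result;
}
```

The keys of the returned object are the job IDs that would be listed in a "stopped partitions" notice.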
@@ -93,6 +93,6 @@ export interface CustomRule {
}

export interface PerPartitionCategorization {
- enabled: boolean;
+ enabled?: boolean;
stop_on_warn?: boolean;
}
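
With `enabled` now optional, callers can no longer assume either flag is present. A minimal sketch of reading the config defensively, assuming both flags should default to `false` when absent; `withPerPartitionDefaults` is a hypothetical helper name, not something this PR adds.

```typescript
interface PerPartitionCategorization {
  enabled?: boolean;
  stop_on_warn?: boolean;
}

// Hypothetical helper (assumption): return a copy with both flags filled in,
// treating anything other than an explicit `true` as `false`.
function withPerPartitionDefaults(
  config: PerPartitionCategorization = {}
): Required<PerPartitionCategorization> {
  return {
    enabled: config.enabled === true,
    stop_on_warn: config.stop_on_warn === true,
  };
}
```

Returning a new object rather than mutating the job config keeps the defaults out of the JSON that is eventually sent to the API, which may or may not be what a given caller wants.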
18 changes: 17 additions & 1 deletion x-pack/plugins/ml/public/application/explorer/explorer.js
@@ -30,6 +30,7 @@ import {
EuiPanel,
EuiAccordion,
EuiBadge,
EuiCallOut,
} from '@elastic/eui';

import { AnnotationFlyout } from '../components/annotations/annotation_flyout';
@@ -204,7 +205,7 @@ export class Explorer extends React.Component {
updateLanguage = (language) => this.setState({ language });

render() {
- const { showCharts, severity } = this.props;
+ const { showCharts, severity, stoppedPartitions } = this.props;

const {
annotations,
@@ -297,6 +298,21 @@ export class Explorer extends React.Component {

<div className={mainColumnClasses}>
<EuiSpacer size="m" />

{stoppedPartitions && (
<EuiCallOut
size={'s'}
title={i18n.translate('xpack.ml.explorer.stoppedPartitionsExistCallout', {
Contributor

It's recommended to use `FormattedMessage` from '@kbn/i18n/react' inside components instead of `i18n.translate`.

Member Author

Changed all the `i18n.translate` calls to use `FormattedMessage` here: 3686b73

defaultMessage:
'There may be fewer results than there could have been because stop_on_warn is turned on. Both categorization and subsequent anomaly detection have stopped for some partitions in {jobsWithStoppedPartitions, plural, one {job} other {jobs}} [{stoppedPartitions}] where the categorization status has changed to warn.',
values: {
jobsWithStoppedPartitions: stoppedPartitions.length,
stoppedPartitions: stoppedPartitions.join(', '),
},
})}
/>
)}

<AnomalyTimeline
explorerState={this.props.explorerState}
setSelectedCells={this.props.setSelectedCells}
@@ -75,6 +75,11 @@ export class CategorizationJobCreator extends JobCreator {
private _createDetector(agg: Aggregation, field: Field) {
const dtr: Detector = createBasicDetector(agg, field);
dtr.by_field_name = mlCategory.id;

// API requires if per_partition_categorization is enabled, add partition field to the detector
if (this.perPartitionCategorization && this.categorizationPerPartitionField !== null) {
dtr.partition_field_name = this.categorizationPerPartitionField;
}
this._addDetector(dtr, agg, mlCategory);
}

@@ -57,6 +57,7 @@ export class JobCreator {
private _stopAllRefreshPolls: {
stop: boolean;
} = { stop: false };
private _partitionField: string | null;
Contributor

nit: use

Suggested change
- private _partitionField: string | null;
+ private _partitionField: string | null = null;

and there is no need to define it in the constructor.

Member Author

After discussing with James, we decided to remove this._partitionField altogether here: 0a813e1


protected _wizardInitialized$ = new BehaviorSubject<boolean>(false);
public wizardInitialized$ = this._wizardInitialized$.asObservable();
@@ -81,6 +82,7 @@ export class JobCreator {
}

this._datafeed_config.query = query;
this._partitionField = null;
Member

Keeping a separate copy of the partition field name will lead to it becoming out of sync with the real partition field name in the job config if the user edits the job's JSON.
I think we should be using the partition field from the detector that uses the mlcategory field.

This will make the get categorizationPerPartitionField() function trickier, as it'll have to find that field each time it is called.
Perhaps a workaround would be to store the index of the categorisation detector in this class for fast lookup. In the categorisation wizard that will always be 0.

I don't see how we can deal with multiple categorisation detectors with differing partition fields in the UI, even though it is possible to configure this in the JSON.

Member Author

As per our conversation, I have removed this._partitionField altogether. Storing the index might lead to the same issue, with the index getting out of sync with the detectors (e.g. after deleting one of the detectors), so I have changed the getter function to find the first detector where the keyword mlcategory exists. Since the partition_field_name property has to be the same value in every detector that uses the keyword mlcategory, I think this change is more suitable.
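
The lookup described in this reply could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the follow-up commit: the `Detector` shape is trimmed, `findCategorizationPartitionField` is a hypothetical name, and "uses mlcategory" is taken to mean `by_field_name === 'mlcategory'`.

```typescript
// Assumed keyword identifying a categorization detector.
const ML_CATEGORY_FIELD = 'mlcategory';

// Trimmed detector shape; only the fields this sketch reads.
interface Detector {
  function: string;
  by_field_name?: string;
  partition_field_name?: string;
}

// Hypothetical helper: derive the partition field from the detectors on each
// call instead of caching a separate copy that could drift out of sync.
function findCategorizationPartitionField(detectors: Detector[]): string | null {
  for (const detector of detectors) {
    if (detector.by_field_name === ML_CATEGORY_FIELD) {
      // The first mlcategory detector decides; a missing partition field maps to null.
      return detector.partition_field_name !== undefined ? detector.partition_field_name : null;
    }
  }
  return null;
}
```

Because the value is read from the detector list every time, edits made through the job's JSON are picked up without any extra bookkeeping, at the cost of a linear scan per call.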

}

public get type(): JOB_TYPE {
@@ -622,6 +624,56 @@ export class JobCreator {
return JSON.stringify(this._datafeed_config, null, 2);
}

private _initPerPartitionCategorization() {
if (this._job_config.analysis_config.per_partition_categorization === undefined) {
this._job_config.analysis_config.per_partition_categorization = {};
}
if (this._job_config.analysis_config.per_partition_categorization?.enabled === undefined) {
this._job_config.analysis_config.per_partition_categorization!.enabled = false;
}
if (this._job_config.analysis_config.per_partition_categorization?.stop_on_warn === undefined) {
this._job_config.analysis_config.per_partition_categorization!.stop_on_warn = false;
}
}

public get perPartitionCategorization() {
return this._job_config.analysis_config.per_partition_categorization?.enabled === true;
}

public set perPartitionCategorization(enabled: boolean) {
this._initPerPartitionCategorization();
this._job_config.analysis_config.per_partition_categorization!.enabled = enabled;
}

public get partitionStopOnWarn() {
Member

Should this name be perPartitionStopOnWarn?

Member Author

Updated here a94ecaf

return this._job_config.analysis_config.per_partition_categorization?.stop_on_warn === true;
}

public set partitionStopOnWarn(enabled: boolean) {
this._initPerPartitionCategorization();
this._job_config.analysis_config.per_partition_categorization!.stop_on_warn = enabled;
}

public get categorizationPerPartitionField() {
return this._partitionField;
}

public set categorizationPerPartitionField(fieldName: string | null) {
Member

This should only be changing detectors which are using mlcategory as the by_field_name.

if (fieldName === null) {
this._detectors.forEach((detector) => {
delete detector.partition_field_name;
});
this._partitionField = null;
} else {
if (this._partitionField !== fieldName) {
this._partitionField = fieldName;
this._detectors.forEach((detector) => {
detector.partition_field_name = fieldName;
});
}
}
}
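
One way the setter could be restricted as the review comment on it suggests, shown as a standalone function so it can be read in isolation. This is a sketch, not the change that was eventually merged; the `Detector` shape is trimmed and `setCategorizationPartitionField` is a hypothetical name.

```typescript
// Assumed keyword identifying a categorization detector.
const ML_CATEGORY_FIELD = 'mlcategory';

// Trimmed detector shape; only the fields this sketch touches.
interface Detector {
  function: string;
  by_field_name?: string;
  partition_field_name?: string;
}

// Hypothetical helper: set or clear the partition field, but only on
// detectors whose by_field_name is mlcategory. Other detectors are untouched.
function setCategorizationPartitionField(detectors: Detector[], fieldName: string | null): void {
  for (const detector of detectors) {
    if (detector.by_field_name !== ML_CATEGORY_FIELD) {
      continue;
    }
    if (fieldName === null) {
      delete detector.partition_field_name;
    } else {
      detector.partition_field_name = fieldName;
    }
  }
}
```

A non-categorization detector with its own `partition_field_name` keeps that value whether the categorization partition field is set or cleared.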

protected _overrideConfigs(job: Job, datafeed: Datafeed) {
this._job_config = job;
this._datafeed_config = datafeed;
@@ -4,13 +4,19 @@
* you may not use this file except in compliance with the Elastic License.
*/

- import React, { Fragment, FC } from 'react';
+ import React, { Fragment, FC, useContext } from 'react';
import { EuiFlexGroup, EuiFlexItem } from '@elastic/eui';

import { SummaryCountField } from '../summary_count_field';
import { CategorizationField } from '../categorization_field';
import { CategorizationPerPartitionField } from '../categorization_partition_field';
import { JobCreatorContext } from '../../../job_creator_context';
import { isAdvancedJobCreator } from '../../../../../common/job_creator';

export const ExtraSettings: FC = () => {
const { jobCreator } = useContext(JobCreatorContext);
const showCategorizationPerPartitionField =
isAdvancedJobCreator(jobCreator) && jobCreator.categorizationFieldName !== null;
return (
<Fragment>
<EuiFlexGroup gutterSize="xl">
@@ -21,6 +27,13 @@ export const ExtraSettings: FC = () => {
<SummaryCountField />
</EuiFlexItem>
</EuiFlexGroup>
{showCategorizationPerPartitionField && (
<EuiFlexGroup gutterSize="xl">
Contributor

Do you need a flex group just for one item?

Member Author

Removed here a94ecaf

<EuiFlexItem>
<CategorizationPerPartitionField />
</EuiFlexItem>
</EuiFlexGroup>
)}
</Fragment>
);
};
@@ -0,0 +1,97 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License;
* you may not use this file except in compliance with the Elastic License.
*/

import React, { FC, useContext, useEffect, useState } from 'react';
import { EuiFormRow } from '@elastic/eui';
import { i18n } from '@kbn/i18n';
import { JobCreatorContext } from '../../../job_creator_context';
import {
AdvancedJobCreator,
CategorizationJobCreator,
isCategorizationJobCreator,
} from '../../../../../common/job_creator';
import { newJobCapsService } from '../../../../../../../services/new_job_capabilities_service';
Contributor

Please avoid using static imports for services. Put them in a context instead.

Member

I agree this would be better in a context. However, newJobCapsService is used throughout the wizards, so I'd suggest this happens in a later refactoring PR.

Member Author

I'll save this issue for a follow-up PR, as James mentioned.

import { Description } from './description';
import { CategorizationPerPartitionSwitch } from './categorization_per_partition_switch';
import { CategorizationPerPartitionFieldSelect } from './categorization_per_partition_input';
import { CategorizationPerPartitionStopOnWarnSwitch } from './categorization_stop_on_warn_switch';

export const CategorizationPerPartitionField: FC = () => {
const { jobCreator: jc, jobCreatorUpdate, jobCreatorUpdated } = useContext(JobCreatorContext);
const jobCreator = jc as AdvancedJobCreator | CategorizationJobCreator;
const [enablePerPartitionCategorization, setEnablePerPartitionCategorization] = useState(false);
const [categorizationPartitionFieldName, setCategorizationPartitionFieldName] = useState<
string | null
>(jobCreator.categorizationPerPartitionField);

const { catFields } = newJobCapsService;

const filteredCategories = catFields.filter((c) => c.id !== jobCreator.categorizationFieldName);

useEffect(() => {
jobCreator.categorizationPerPartitionField = categorizationPartitionFieldName;
jobCreatorUpdate();
}, [categorizationPartitionFieldName]);

useEffect(() => {
// set the first item in category as partition field by default
// because API requires partition_field to be defined in each detector with mlcategory
// if per-partition categorization is enabled
if (
jobCreator.perPartitionCategorization &&
jobCreator.categorizationPerPartitionField === null
) {
jobCreator.categorizationPerPartitionField = filteredCategories[0].id;
Member

Just to be safe, it would be worth adding a check for filteredCategories.length before accessing it.

Member Author

Updated in a94ecaf
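
The guard requested here might look roughly like this when pulled out into a pure function. It is only a sketch: `defaultPartitionFieldId` is a hypothetical helper, the `Field` shape is trimmed to its `id`, and the actual change in the referenced commit may differ.

```typescript
// Trimmed field shape; only the id is needed for this sketch.
interface Field {
  id: string;
}

// Hypothetical helper: pick the first field that is not the categorization
// field itself, or null when no candidate is left.
function defaultPartitionFieldId(
  fields: Field[],
  categorizationFieldName: string | null
): string | null {
  const candidates = fields.filter((f) => f.id !== categorizationFieldName);
  // Check the length before indexing so an empty list cannot throw.
  return candidates.length > 0 ? candidates[0].id : null;
}
```

Returning `null` leaves the partition field unset, so the caller has to decide how to handle per-partition categorization being enabled with no field available.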

}
setCategorizationPartitionFieldName(jobCreator.categorizationPerPartitionField);
setEnablePerPartitionCategorization(jobCreator.perPartitionCategorization);
}, [jobCreatorUpdated]);

const isCategorizationJob = isCategorizationJobCreator(jobCreator);
return (
<Description isOptional={isCategorizationJob === false}>
<EuiFormRow
label={i18n.translate(
'xpack.ml.newJob.wizard.extraStep.categorizationJob.perPartitionCategorizationLabel',
{
defaultMessage: 'Enable per-partition categorization',
}
)}
>
<CategorizationPerPartitionSwitch />
</EuiFormRow>

{enablePerPartitionCategorization && (
<>
<EuiFormRow
label={i18n.translate(
'xpack.ml.newJob.wizard.extraStep.categorizationJob.stopOnWarnLabel',
{
defaultMessage: 'Stop on warn',
}
)}
>
<CategorizationPerPartitionStopOnWarnSwitch />
</EuiFormRow>
<EuiFormRow
label={i18n.translate(
'xpack.ml.newJob.wizard.extraStep.categorizationJob.partitionFieldLabel',
{
defaultMessage: 'Partition field',
}
)}
>
<CategorizationPerPartitionFieldSelect
fields={filteredCategories}
changeHandler={setCategorizationPartitionFieldName}
selectedField={categorizationPartitionFieldName || ''}
/>
</EuiFormRow>
</>
)}
</Description>
);
};
@@ -0,0 +1,54 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License;
* you may not use this file except in compliance with the Elastic License.
*/

import React, { FC, useContext } from 'react';
import { EuiComboBox, EuiComboBoxOptionOption } from '@elastic/eui';

import { JobCreatorContext } from '../../../job_creator_context';
import { Field } from '../../../../../../../../../common/types/fields';
import { createFieldOptions } from '../../../../../common/job_creator/util/general';

interface Props {
fields: Field[];
changeHandler(i: string | null): void;
selectedField: string | null;
}

export const CategorizationPerPartitionFieldSelect: FC<Props> = ({
fields,
changeHandler,
selectedField,
}) => {
const { jobCreator } = useContext(JobCreatorContext);
const options: EuiComboBoxOptionOption[] = [
...createFieldOptions(fields, jobCreator.additionalFields),
];

const selection: EuiComboBoxOptionOption[] = [];
if (selectedField !== null) {
selection.push({ label: selectedField });
}

function onChange(selectedOptions: EuiComboBoxOptionOption[]) {
const option = selectedOptions[0];
if (typeof option !== 'undefined') {
changeHandler(option.label);
} else {
changeHandler(null);
}
}
Contributor

All of this gets invoked on each render. Please use hooks accordingly (useMemo and useCallback).

Member Author

Updated here a94ecaf


return (
<EuiComboBox
singleSelection={{ asPlainText: true }}
options={options}
selectedOptions={selection}
onChange={onChange}
isClearable={true}
data-test-subj="mlCategorizationPerPartitionFieldNameSelect"
/>
);
};
@@ -0,0 +1,45 @@
/*
* Copyright Elasticsearch B.V. and/or licensed to Elasticsearch B.V. under one
* or more contributor license agreements. Licensed under the Elastic License;
* you may not use this file except in compliance with the Elastic License.
*/

import React, { FC, useContext, useEffect, useState } from 'react';
import { i18n } from '@kbn/i18n';
import { EuiSwitch } from '@elastic/eui';
import { JobCreatorContext } from '../../../job_creator_context';
import { AdvancedJobCreator, CategorizationJobCreator } from '../../../../../common/job_creator';

export const CategorizationPerPartitionSwitch: FC = () => {
const { jobCreator: jc, jobCreatorUpdate } = useContext(JobCreatorContext);
const jobCreator = jc as AdvancedJobCreator | CategorizationJobCreator;
const [enablePerPartitionCategorization, setEnablePerPartitionCategorization] = useState(
jobCreator.perPartitionCategorization
);

const toggleEnablePerPartitionCategorization = () =>
setEnablePerPartitionCategorization(!enablePerPartitionCategorization);

useEffect(() => {
// also turn off stop on warn if per_partition_categorization is turned off
if (enablePerPartitionCategorization === false) {
jobCreator.partitionStopOnWarn = false;
}

jobCreator.perPartitionCategorization = enablePerPartitionCategorization;
jobCreatorUpdate();
}, [enablePerPartitionCategorization]);

return (
<EuiSwitch
name="switch"
Contributor

"switch" is probably not the best name for this control; consider something more specific, e.g. categorizationPerPartitionSwitch 🙂

Member Author

Updated here a94ecaf

disabled={false}
checked={enablePerPartitionCategorization}
onChange={toggleEnablePerPartitionCategorization}
data-test-subj="mlJobWizardSwitchCategorizationPerPartitionField"
label={i18n.translate('xpack.ml.newJob.wizard.perPartitionCategorizationSwitchLabel', {
defaultMessage: 'Enable per-partition categorization',
})}
/>
);
};