[ML] adds start and end params to _preview and excludes cold/frozen tiers from unbounded previews (elastic#86989)

In larger clusters with complicated datafeed requirements, being able to preview only a specific window of time is important. Previously, datafeed previews always started at 0 (that is, from the beginning of the data). This causes issues when the index pattern includes indices on slower hardware, even though the "start" time used when the datafeed is actually started points at more recent data (and thus at indices on faster hardware).

Additionally, when _preview is unbounded (as before), it now attempts to preview only indices that are NOT in the frozen or cold tiers. This is done through a query against the _tier field, so it only affects newer indices that actually have that field set.
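As a rough sketch of that mechanism (an assumption about the shape of the query, not the exact code in this commit), the exclusion can be pictured as wrapping the datafeed's own query in a bool query that filters out the cold and frozen tiers:

import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;

final class TierFilteringSketch {
    // Sketch: keep the datafeed's own query, but filter out documents that live
    // on cold or frozen tier indices. Indices that never set the _tier field do
    // not match the must_not clause, so only newer indices are affected.
    static QueryBuilder excludeColdAndFrozenTiers(QueryBuilder datafeedQuery) {
        BoolQueryBuilder wrapped = QueryBuilders.boolQuery()
            .filter(datafeedQuery)
            .mustNot(QueryBuilders.termsQuery("_tier", "data_cold", "data_frozen"));
        return wrapped;
    }
}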
benwtrent authored May 20, 2022
1 parent c154d26 commit 115f19f
Showing 15 changed files with 291 additions and 34 deletions.
6 changes: 6 additions & 0 deletions docs/changelog/86989.yaml
@@ -0,0 +1,6 @@
pr: 86989
summary: Adds start and end params to `_preview` and excludes cold/frozen tiers from
unbounded previews
area: Machine Learning
type: enhancement
issues: []
37 changes: 33 additions & 4 deletions docs/reference/ml/anomaly-detection/apis/preview-datafeed.asciidoc
@@ -25,16 +25,16 @@ Previews a {dfeed}.

Requires the following privileges:

* cluster: `manage_ml` (the `machine_learning_admin` built-in role grants this
privilege)
* source index configured in the {dfeed}: `read`.

[[ml-preview-datafeed-desc]]
== {api-description-title}

The preview {dfeeds} API returns the first "page" of search results from a
{dfeed}. You can preview an existing {dfeed} or provide configuration details
for the {dfeed} and {anomaly-job} in the API. The preview shows the structure of
the data that will be passed to the anomaly detection engine.

IMPORTANT: When {es} {security-features} are enabled, the {dfeed} query is
@@ -57,6 +57,35 @@ include::{es-repo-dir}/ml/ml-shared.asciidoc[tag=datafeed-id]
NOTE: If you provide the `<datafeed_id>` as a path parameter, you cannot
provide {dfeed} or {anomaly-job} configuration details in the request body.

[[ml-preview-datafeed-query-parms]]
== {api-query-parms-title}

`end`::
(Optional, string) The time that the {dfeed} preview should end. The preview may not reach the provided end time
because only the first page of results is returned. The time can be specified by using one of the following formats:
+
--
* ISO 8601 format with milliseconds, for example `2017-01-22T06:00:00.000Z`
* ISO 8601 format without milliseconds, for example `2017-01-22T06:00:00+00:00`
* Milliseconds since the epoch, for example `1485061200000`

Date-time arguments using either of the ISO 8601 formats must have a time zone
designator, where `Z` is accepted as an abbreviation for UTC time.

NOTE: When a URL is expected (for example, in browsers), the `+` used in time
zone designators must be encoded as `%2B`.

This value is exclusive.
--

`start`::
(Optional, string) The time that the {dfeed} preview should begin, which can be
specified by using the same formats as the `end` parameter. This value is
inclusive.

NOTE: If you don't provide the `start` or `end` parameters, the {dfeed} preview searches over the entire
time range of the data but excludes data in the `cold` or `frozen` <<data-tiers, data tiers>>.
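
For illustration, a bounded preview request looks like the following; the datafeed ID and timestamps here are hypothetical, not part of this change:

[source,console]
----
GET _ml/datafeeds/datafeed-example/_preview?start=2017-01-22T06:00:00Z&end=2017-01-29T06:00:00Z
----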

[[ml-preview-datafeed-request-body]]
== {api-request-body-title}

@@ -115,7 +144,7 @@ The data that is returned for this example is as follows:
]
----

The following example provides {dfeed} and {anomaly-job} configuration
details in the API:

[source,console]
@@ -34,6 +34,18 @@
}
]
},
"params":{
"start":{
"type":"string",
"required":false,
"description":"The start time from where the datafeed preview should begin"
},
"end":{
"type":"string",
"required":false,
"description":"The end time when the datafeed preview should stop"
}
},
"body":{
"description":"The datafeed config and job config with which to execute the preview",
"required":false
@@ -6,10 +6,12 @@
*/
package org.elasticsearch.xpack.core.ml.action;

import org.elasticsearch.Version;
import org.elasticsearch.action.ActionRequest;
import org.elasticsearch.action.ActionRequestValidationException;
import org.elasticsearch.action.ActionResponse;
import org.elasticsearch.action.ActionType;
import org.elasticsearch.action.ValidateActions;
import org.elasticsearch.common.Strings;
import org.elasticsearch.common.bytes.BytesReference;
import org.elasticsearch.common.io.stream.StreamInput;
@@ -28,6 +30,11 @@
import java.io.IOException;
import java.io.InputStream;
import java.util.Objects;
import java.util.OptionalLong;

import static org.elasticsearch.xpack.core.ml.action.StartDatafeedAction.DatafeedParams.parseDateOrThrow;
import static org.elasticsearch.xpack.core.ml.action.StartDatafeedAction.END_TIME;
import static org.elasticsearch.xpack.core.ml.action.StartDatafeedAction.START_TIME;

public class PreviewDatafeedAction extends ActionType<PreviewDatafeedAction.Response> {

@@ -49,38 +56,61 @@ public static class Request extends ActionRequest implements ToXContentObject {
static {
PARSER.declareObject(Builder::setDatafeedBuilder, DatafeedConfig.STRICT_PARSER, DATAFEED_CONFIG);
PARSER.declareObject(Builder::setJobBuilder, Job.STRICT_PARSER, JOB_CONFIG);
PARSER.declareString(Builder::setStart, START_TIME);
PARSER.declareString(Builder::setEnd, END_TIME);
}

public static Request fromXContent(XContentParser parser, @Nullable String datafeedId) {
public static Request.Builder fromXContent(XContentParser parser, @Nullable String datafeedId) {
Builder builder = PARSER.apply(parser, null);
// We don't need to check for "inconsistent ids" as we don't parse an ID from the body
if (datafeedId != null) {
builder.setDatafeedId(datafeedId);
}
return builder.build();
return builder;
}

private final String datafeedId;
private final DatafeedConfig datafeedConfig;
private final Job.Builder jobConfig;
private final Long startTime;
private final Long endTime;

public Request(StreamInput in) throws IOException {
super(in);
datafeedId = in.readString();
datafeedConfig = in.readOptionalWriteable(DatafeedConfig::new);
jobConfig = in.readOptionalWriteable(Job.Builder::new);
if (in.getVersion().onOrAfter(Version.V_8_3_0)) {
this.startTime = in.readOptionalLong();
this.endTime = in.readOptionalLong();
} else {
this.startTime = null;
this.endTime = null;
}
}

public Request(String datafeedId) {
public Request(String datafeedId, String start, String end) {
this.datafeedId = ExceptionsHelper.requireNonNull(datafeedId, DatafeedConfig.ID);
this.datafeedConfig = null;
this.jobConfig = null;
this.startTime = start == null ? null : parseDateOrThrow(start, START_TIME, System::currentTimeMillis);
this.endTime = end == null ? null : parseDateOrThrow(end, END_TIME, System::currentTimeMillis);
}

public Request(DatafeedConfig datafeedConfig, Job.Builder jobConfig) {
Request(String datafeedId, Long start, Long end) {
this.datafeedId = ExceptionsHelper.requireNonNull(datafeedId, DatafeedConfig.ID);
this.datafeedConfig = null;
this.jobConfig = null;
this.startTime = start;
this.endTime = end;
}

public Request(DatafeedConfig datafeedConfig, Job.Builder jobConfig, Long start, Long end) {
this.datafeedId = BLANK_ID;
this.datafeedConfig = ExceptionsHelper.requireNonNull(datafeedConfig, DATAFEED_CONFIG.getPreferredName());
this.jobConfig = jobConfig;
this.startTime = start;
this.endTime = end;
}

public String getDatafeedId() {
@@ -95,9 +125,31 @@ public Job.Builder getJobConfig() {
return jobConfig;
}

public OptionalLong getStartTime() {
return startTime == null ? OptionalLong.empty() : OptionalLong.of(startTime);
}

public OptionalLong getEndTime() {
return endTime == null ? OptionalLong.empty() : OptionalLong.of(endTime);
}

@Override
public ActionRequestValidationException validate() {
return null;
ActionRequestValidationException e = null;
if (endTime != null && startTime != null && endTime <= startTime) {
e = ValidateActions.addValidationError(
START_TIME.getPreferredName()
+ " ["
+ startTime
+ "] must be earlier than "
+ END_TIME.getPreferredName()
+ " ["
+ endTime
+ "]",
e
);
}
return e;
}

@Override
@@ -106,6 +158,10 @@ public void writeTo(StreamOutput out) throws IOException {
out.writeString(datafeedId);
out.writeOptionalWriteable(datafeedConfig);
out.writeOptionalWriteable(jobConfig);
if (out.getVersion().onOrAfter(Version.V_8_3_0)) {
out.writeOptionalLong(startTime);
out.writeOptionalLong(endTime);
}
}

@Override
@@ -147,6 +203,8 @@ public static class Builder {
private String datafeedId;
private DatafeedConfig.Builder datafeedBuilder;
private Job.Builder jobBuilder;
private Long startTime;
private Long endTime;

public Builder setDatafeedId(String datafeedId) {
this.datafeedId = datafeedId;
@@ -163,6 +221,30 @@ public Builder setJobBuilder(Job.Builder jobBuilder) {
return this;
}

public Builder setStart(String startTime) {
if (startTime == null) {
return this;
}
return setStart(parseDateOrThrow(startTime, START_TIME, System::currentTimeMillis));
}

public Builder setStart(long start) {
this.startTime = start;
return this;
}

public Builder setEnd(String endTime) {
if (endTime == null) {
return this;
}
return setEnd(parseDateOrThrow(endTime, END_TIME, System::currentTimeMillis));
}

public Builder setEnd(long end) {
this.endTime = end;
return this;
}

public Request build() {
if (datafeedBuilder != null) {
datafeedBuilder.setId("preview_id");
@@ -196,8 +278,8 @@ public Request build() {
);
}
return datafeedId != null
? new Request(datafeedId)
: new Request(datafeedBuilder == null ? null : datafeedBuilder.build(), jobBuilder);
? new Request(datafeedId, startTime, endTime)
: new Request(datafeedBuilder == null ? null : datafeedBuilder.build(), jobBuilder, startTime, endTime);
}
}
}
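For orientation, a minimal usage sketch of the builder above; the datafeed ID and timestamps are illustrative assumptions, not values from this commit or its tests:

// Build a preview request for an existing datafeed, bounded to a one-day window.
PreviewDatafeedAction.Request request = new PreviewDatafeedAction.Request.Builder()
    .setDatafeedId("datafeed-example")   // hypothetical datafeed ID
    .setStart("2017-01-22T06:00:00Z")    // parsed by parseDateOrThrow into epoch millis
    .setEnd("2017-01-23T06:00:00Z")
    .build();

assert request.getStartTime().isPresent() && request.getEndTime().isPresent();
assert request.validate() == null;       // start is earlier than end, so no validation error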
@@ -164,7 +164,7 @@ public static class DatafeedParams implements PersistentTaskParams, MlTaskParams
);
}

static long parseDateOrThrow(String date, ParseField paramName, LongSupplier now) {
public static long parseDateOrThrow(String date, ParseField paramName, LongSupplier now) {
DateMathParser dateMathParser = DateFieldMapper.DEFAULT_DATE_TIME_FORMATTER.toDateMathParser();

try {
@@ -28,11 +28,18 @@ public class PreviewDatafeedActionRequestTests extends AbstractWireSerializingTe
@Override
protected Request createTestInstance() {
String jobId = randomAlphaOfLength(10);
long start = randomLongBetween(0, Long.MAX_VALUE / 4);
return switch (randomInt(2)) {
case 0 -> new Request(randomAlphaOfLength(10));
case 0 -> new Request(
randomAlphaOfLength(10),
randomBoolean() ? null : start,
randomBoolean() ? null : randomLongBetween(start + 1, Long.MAX_VALUE)
);
case 1 -> new Request(
DatafeedConfigTests.createRandomizedDatafeedConfig(jobId),
randomBoolean() ? JobTests.buildJobBuilder(jobId) : null
randomBoolean() ? JobTests.buildJobBuilder(jobId) : null,
randomBoolean() ? null : start,
randomBoolean() ? null : randomLongBetween(start + 1, Long.MAX_VALUE)
);
case 2 -> new Request.Builder().setJobBuilder(
JobTests.buildJobBuilder(jobId)
@@ -48,7 +55,7 @@ protected Writeable.Reader<Request> instanceReader() {
}

public void testCtor() {
IllegalArgumentException ex = expectThrows(IllegalArgumentException.class, () -> new Request((String) null));
IllegalArgumentException ex = expectThrows(IllegalArgumentException.class, () -> new Request(null, randomLong(), null));
assertThat(ex.getMessage(), equalTo("[datafeed_id] must not be null."));
}
