[ML] Enable built-in Inference Endpoints and default for Semantic Text #116931

davidkyle · 2024-11-18T12:20:23Z

Adds built-in inference endpoints for the ELSER (.elser-2-elasticsearch) and multilingual-e5-small (.multilingual-e5-small-elasticsearch) models. These endpoints always appear in the GET _inference API response, and the models are automatically downloaded and deployed on a call to POST _inference. The endpoints use adaptive allocations with scale to 0 enabled (min_number_of_allocations: 0). After 15 minutes of inactivity a model deployment scales down to 0 allocations, at which point it uses no resources and the node it was running on may scale down. Model deployments scale up again when the models are used. Built-in endpoint IDs start with a . prefix and are suffixed with the name of the service that hosts them, in this case elasticsearch.
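To make the naming convention concrete, here is a minimal Python sketch. The helper names `built_in_endpoint_id` and `is_built_in` are hypothetical and not part of Elasticsearch; they only mirror the pattern described above (a `.` prefix plus a service-name suffix).

```python
def built_in_endpoint_id(model_name: str, service: str = "elasticsearch") -> str:
    """Compose an ID in the '.<model>-<service>' pattern described in this PR.

    Hypothetical helper for illustration; Elasticsearch defines these IDs itself.
    """
    return f".{model_name}-{service}"


def is_built_in(inference_id: str, service: str = "elasticsearch") -> bool:
    """Heuristic check: built-in IDs start with '.' and end with '-<service>'."""
    return inference_id.startswith(".") and inference_id.endswith(f"-{service}")


elser_id = built_in_endpoint_id("elser-2")
e5_id = built_in_endpoint_id("multilingual-e5-small")
print(elser_id, e5_id)
```

The check is only a naming heuristic; the GET _inference response remains the source of truth for which endpoints exist.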

The Semantic Text field mapping defaults the inference_id option to the built-in ELSER inference endpoint. Indexing a document with a semantic text field mapping will trigger the download and deployment of the model.

GET _inference/_all
{
  "endpoints": [
    {
      "inference_id": ".elser-2-elasticsearch",
      "task_type": "sparse_embedding",
      "service": "elasticsearch",
      "service_settings": {
        "num_threads": 1,
        "model_id": ".elser_model_2",
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 0,
          "max_number_of_allocations": 32
        }
      },
      "chunking_settings": {
        "strategy": "word",
        "max_chunk_size": 250,
        "overlap": 100
      }
    },
    {
      "inference_id": ".multilingual-e5-small-elasticsearch",
      "task_type": "text_embedding",
      "service": "elasticsearch",
      "service_settings": {
        "num_threads": 1,
        "model_id": ".multilingual-e5-small",
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 0,
          "max_number_of_allocations": 32
        }
      },
      "chunking_settings": {
        "strategy": "word",
        "max_chunk_size": 250,
        "overlap": 100
      }
    }
  ]
}
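A response like the one above can be inspected programmatically. The following Python sketch parses a trimmed copy of that response and lists the endpoints configured to scale to zero. The function names `scales_to_zero` and `summarize` are hypothetical, and the sample omits fields that the check does not need.

```python
import json

# Trimmed copy of the GET _inference/_all response shown above.
SAMPLE_RESPONSE = """
{
  "endpoints": [
    {
      "inference_id": ".elser-2-elasticsearch",
      "task_type": "sparse_embedding",
      "service_settings": {
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 0,
          "max_number_of_allocations": 32
        }
      }
    },
    {
      "inference_id": ".multilingual-e5-small-elasticsearch",
      "task_type": "text_embedding",
      "service_settings": {
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 0,
          "max_number_of_allocations": 32
        }
      }
    }
  ]
}
"""


def scales_to_zero(endpoint: dict) -> bool:
    """True when adaptive allocations are enabled with a minimum of 0."""
    adaptive = endpoint.get("service_settings", {}).get("adaptive_allocations", {})
    return bool(adaptive.get("enabled")) and adaptive.get("min_number_of_allocations") == 0


def summarize(response: dict) -> dict:
    """Map inference_id to task_type for every scale-to-zero endpoint."""
    return {
        e["inference_id"]: e["task_type"]
        for e in response["endpoints"]
        if scales_to_zero(e)
    }


summary = summarize(json.loads(SAMPLE_RESPONSE))
print(summary)
```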

elasticsearchmachine · 2024-11-18T12:20:47Z

Pinging @elastic/ml-core (Team:ML)

elasticsearchmachine · 2024-11-18T12:20:47Z

Pinging @elastic/es-search-relevance (Team:Search Relevance)

elasticsearchmachine · 2024-11-18T12:20:47Z

Hi @davidkyle, I've created a changelog YAML for you.

docs/changelog/116931.yaml

elasticsearchmachine · 2024-11-18T13:41:56Z

💔 Backport failed

The backport operation could not be completed due to the following error:

An unexpected error occurred when attempting to backport this PR.

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 116931

elastic#116931) Adds built-in inference endpoints for the ELSER (.elser-2-elasticsearch) and multilingual-e5-small models (.multilingual-e5-small-elasticsearch). The semantic text inference Id field now defaults to elser-2-elasticsearch # Conflicts: # test/test-clusters/src/main/java/org/elasticsearch/test/cluster/FeatureFlag.java # x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java

elastic#116931) Adds built-in inference endpoints for the ELSER (.elser-2-elasticsearch) and multilingual-e5-small models (.multilingual-e5-small-elasticsearch). The semantic text inference Id field now defaults to elser-2-elasticsearch

davidkyle and others added 3 commits November 18, 2024 11:51

Remove default endpoint FF

9115d13

Remove feature flag dependency

56fa6a6

Change reason

4ca74fc

davidkyle added >enhancement :ml Machine learning :Search Relevance/Vectors Vector search v9.0.0 v8.17.0 labels Nov 18, 2024

elasticsearchmachine added Team:ML Meta label for the ML team Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch labels Nov 18, 2024

Update docs/changelog/116931.yaml

f85765b

davidkyle mentioned this pull request Nov 18, 2024

Always Enable Default ELSER Endpoint For Semantic Text #116841

Closed

davidkyle commented Nov 18, 2024

docs/changelog/116931.yaml Outdated

Update docs/changelog/116931.yaml

c343b40

Mikep86 approved these changes Nov 18, 2024

davidkyle added the auto-backport Automatically create backport pull requests when merged label Nov 18, 2024

davidkyle merged commit 9790cc4 into elastic:main Nov 18, 2024
16 checks passed

elasticsearchmachine added the backport pending label Nov 18, 2024

davidkyle mentioned this pull request Nov 18, 2024

[8.17][ML] Enable built-in Inference Endpoints and default for Semantic Text #116952

Merged

pgayvallet mentioned this pull request Nov 20, 2024

ELSER / interface endpoint installation and management elastic/kibana#192461

Closed

This was referenced Nov 20, 2024

[Obs AI Assistant] Use preconfigured elser inference endpoint elastic/kibana#200908

Open

[Obs AI Assistant] Use preconfigured Elser inference endpoint elastic/kibana#201108

Closed

pgayvallet mentioned this pull request Dec 17, 2024

[Product documentation] Use default ELSER deployment elastic/kibana#204559

Closed
