[ML] Enable built-in Inference Endpoints and default for Semantic Text #116931

Merged: 5 commits, Nov 18, 2024

davidkyle
Member

Adds built-in inference endpoints for the ELSER (.elser-2-elasticsearch) and multilingual-e5-small (.multilingual-e5-small-elasticsearch) models. These endpoints always appear in the GET _inference API, and the models are automatically downloaded and deployed on a call to POST _inference. The endpoints use adaptive allocations with scale to 0 enabled (min_number_of_allocations: 0). After 15 minutes of inactivity, a model deployment scales down to 0 allocations, at which point it uses no resources and the node it is running on may scale down. Model deployments scale up again when the models are used. Built-in endpoint IDs start with a . prefix and are suffixed with the name of the service that hosts them, in this case elasticsearch.
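To illustrate the naming convention described above, here is a minimal Python sketch that splits an endpoint ID into its parts. The helper name `describe_endpoint_id` is hypothetical, and the rule "the service is whatever follows the last hyphen" is an assumption inferred from the two IDs in this PR, not something Elasticsearch exposes.

```python
BUILT_IN_PREFIX = "."


def describe_endpoint_id(inference_id: str) -> dict:
    """Split an inference endpoint ID according to the built-in naming convention.

    Hypothetical helper: assumes the hosting service name is the text after
    the last hyphen and that service names contain no hyphens themselves.
    """
    if not inference_id.startswith(BUILT_IN_PREFIX):
        # User-created endpoints carry no prefix, so nothing can be inferred.
        return {"built_in": False, "name": inference_id, "service": None}

    body = inference_id[len(BUILT_IN_PREFIX):]
    name, sep, service = body.rpartition("-")
    if not sep:
        # No hyphen at all: treat the whole body as the name.
        return {"built_in": True, "name": body, "service": None}
    return {"built_in": True, "name": name, "service": service}


if __name__ == "__main__":
    for endpoint_id in (".elser-2-elasticsearch", "my-custom-endpoint"):
        print(endpoint_id, "->", describe_endpoint_id(endpoint_id))
```

A check like this could be useful in client code that wants to hide or protect built-in endpoints, but the authoritative list is always the GET _inference response.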

The Semantic Text field mapping now defaults the inference_id option to the built-in ELSER inference endpoint. Indexing a document with a semantic text field mapping will trigger the download and deployment of the model.
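As a rough sketch of what the new default means for index mappings, the Python below builds a semantic_text mapping body with and without an explicit inference_id. The field name `content` and both helper functions are hypothetical; only the `semantic_text` type, the `inference_id` option, and the default ID come from this PR. The lookup in `effective_inference_id` merely mimics the default on the client side, since the actual resolution happens inside Elasticsearch.

```python
import json

# Default taken from the PR description above.
DEFAULT_SEMANTIC_TEXT_INFERENCE_ID = ".elser-2-elasticsearch"


def semantic_text_mapping(field_name, inference_id=None):
    """Build a create-index body with one semantic_text field (illustrative)."""
    field = {"type": "semantic_text"}
    if inference_id is not None:
        # Only set the option when the caller wants a non-default endpoint.
        field["inference_id"] = inference_id
    return {"mappings": {"properties": {field_name: field}}}


def effective_inference_id(mapping, field_name):
    """Return the endpoint a field would use, assuming the default in this PR."""
    field = mapping["mappings"]["properties"][field_name]
    return field.get("inference_id", DEFAULT_SEMANTIC_TEXT_INFERENCE_ID)


if __name__ == "__main__":
    # With this change, inference_id can be omitted entirely.
    print(json.dumps(semantic_text_mapping("content"), indent=2))
```

Passing `inference_id=".multilingual-e5-small-elasticsearch"` would select the other built-in endpoint instead of the ELSER default.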

GET _inference/_all
{
  "endpoints": [
    {
      "inference_id": ".elser-2-elasticsearch",
      "task_type": "sparse_embedding",
      "service": "elasticsearch",
      "service_settings": {
        "num_threads": 1,
        "model_id": ".elser_model_2",
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 0,
          "max_number_of_allocations": 32
        }
      },
      "chunking_settings": {
        "strategy": "word",
        "max_chunk_size": 250,
        "overlap": 100
      }
    },
    {
      "inference_id": ".multilingual-e5-small-elasticsearch",
      "task_type": "text_embedding",
      "service": "elasticsearch",
      "service_settings": {
        "num_threads": 1,
        "model_id": ".multilingual-e5-small",
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 0,
          "max_number_of_allocations": 32
        }
      },
      "chunking_settings": {
        "strategy": "word",
        "max_chunk_size": 250,
        "overlap": 100
      }
    }
  ]
}

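The response above can be inspected programmatically. The following Python sketch, which uses a trimmed copy of that response as sample data, lists the endpoints that have scale to 0 enabled. The function name `scale_to_zero_endpoints` is hypothetical, and the code only reads fields that appear in the example response.

```python
import json

# Trimmed copy of the GET _inference/_all response shown above.
SAMPLE_RESPONSE = json.loads("""
{
  "endpoints": [
    {
      "inference_id": ".elser-2-elasticsearch",
      "task_type": "sparse_embedding",
      "service": "elasticsearch",
      "service_settings": {
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 0,
          "max_number_of_allocations": 32
        }
      }
    },
    {
      "inference_id": ".multilingual-e5-small-elasticsearch",
      "task_type": "text_embedding",
      "service": "elasticsearch",
      "service_settings": {
        "adaptive_allocations": {
          "enabled": true,
          "min_number_of_allocations": 0,
          "max_number_of_allocations": 32
        }
      }
    }
  ]
}
""")


def scale_to_zero_endpoints(response):
    """Return IDs of endpoints whose adaptive allocations can drop to 0."""
    ids = []
    for endpoint in response.get("endpoints", []):
        settings = endpoint.get("service_settings", {})
        adaptive = settings.get("adaptive_allocations", {})
        if adaptive.get("enabled") and adaptive.get("min_number_of_allocations") == 0:
            ids.append(endpoint["inference_id"])
    return ids


if __name__ == "__main__":
    print(scale_to_zero_endpoints(SAMPLE_RESPONSE))
```

Endpoints without an `adaptive_allocations` block are simply skipped, so the same function should also tolerate user-created endpoints in the list.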

@elasticsearchmachine elasticsearchmachine added Team:ML Meta label for the ML team Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch labels Nov 18, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine
Collaborator

Hi @davidkyle, I've created a changelog YAML for you.

@davidkyle davidkyle added the auto-backport Automatically create backport pull requests when merged label Nov 18, 2024
@davidkyle davidkyle merged commit 9790cc4 into elastic:main Nov 18, 2024
16 checks passed
@elasticsearchmachine
Collaborator

💔 Backport failed

The backport operation could not be completed due to the following error:

An unexpected error occurred when attempting to backport this PR.

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 116931

davidkyle added a commit to davidkyle/elasticsearch that referenced this pull request Nov 18, 2024
elastic#116931)

Adds built-in inference endpoints for the ELSER (.elser-2-elasticsearch)
and multilingual-e5-small models (.multilingual-e5-small-elasticsearch).
The semantic text inference Id field now defaults to elser-2-elasticsearch
# Conflicts:
#	test/test-clusters/src/main/java/org/elasticsearch/test/cluster/FeatureFlag.java
#	x-pack/plugin/inference/src/main/java/org/elasticsearch/xpack/inference/InferenceFeatures.java
alexey-ivanov-es pushed a commit to alexey-ivanov-es/elasticsearch that referenced this pull request Nov 28, 2024
elastic#116931)

Adds built-in inference endpoints for the ELSER (.elser-2-elasticsearch)
and multilingual-e5-small models (.multilingual-e5-small-elasticsearch).
The semantic text inference Id field now defaults to elser-2-elasticsearch
Labels
auto-backport Automatically create backport pull requests when merged backport pending >enhancement :ml Machine learning :Search Relevance/Vectors Vector search Team:ML Meta label for the ML team Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch v8.17.0 v9.0.0

3 participants