04 Jan 11:51

curquiza

6203f4a

v1.6.0-rc.4 🦊 Pre-release

Fixes

Fix issue with hybrid/vector search (#4296) @dureuill
Fix CI issue (#4294) @irevoire

Contributors

irevoire and dureuill

26 Dec 10:04

curquiza

v1.6.0-rc.3

658ec6e

v1.6.0-rc.3 🦊 Pre-release

Improvements

Add task queue webhook (#4238) @irevoire
Use the environment variables MEILI_TASK_WEBHOOK_URL and MEILI_TASK_WEBHOOK_AUTHORIZATION_HEADER to define your webhook URL and your authorization header.
More information about the usage here.
Simplify the settings format of new vector search settings (compared to previous RCs) (#4275) @dureuill

Before:

{
  "embedders": {
    "default": {
      "source": {
        "openAi": {
          "apiKey": "<your-OpenAI-API-key>",
          "model": "text-embedding-ada-002"
        }
      },
      "documentTemplate": {
        "template": "A movie titled '{{doc.title}}' whose description starts with {{doc.overview|truncatewords: 20}}"
      }
    },
    "image": {
      "source": { "userProvided": { "dimensions": 512 } }
    },
    "translation": {
      "source": {
        "huggingFace": { "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2" }
      },
      "documentTemplate": {
        "template": "A movie titled '{{doc.title}}' whose description starts with {{doc.overview|truncatewords: 20}}"
      }
    }
  }
}

After:

{
  "embedders": {
    "default": {
      "source":  "openAi",
      "apiKey": "<your-OpenAI-API-key>",
      "model": "text-embedding-ada-002",
      "documentTemplate": "A movie titled '{{doc.title}}' whose description starts with {{doc.overview|truncatewords: 20}}"
    },
    "image": {
      "source": "userProvided",
      "dimensions": 512
    },
    "translation": {
      "source": "huggingFace",
      "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
      "documentTemplate": "A movie titled '{{doc.title}}' whose description starts with {{doc.overview|truncatewords: 20}}"
    }
  }
}

Update mini-dashboard to v0.2.12 (#4277) @mdubus
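
For settings written against the earlier RCs, the format change above amounts to flattening each embedder. Here is a minimal sketch of that conversion, assuming the nested shape shown under "Before". The function names `flatten_embedder` and `flatten_settings` are hypothetical helpers and not part of Meilisearch:

```python
import json


def flatten_embedder(old: dict) -> dict:
    """Convert one embedder from the nested shape to the flat shape."""
    # The old "source" object is assumed to hold exactly one key: the source name.
    [(source_name, source_options)] = old["source"].items()
    new = {"source": source_name}
    # Options such as apiKey, model, or dimensions move up one level.
    new.update(source_options)
    # The old {"template": "..."} wrapper becomes a plain string.
    if "documentTemplate" in old:
        new["documentTemplate"] = old["documentTemplate"]["template"]
    return new


def flatten_settings(old_settings: dict) -> dict:
    """Apply flatten_embedder to every embedder of a settings payload."""
    return {
        "embedders": {
            name: flatten_embedder(embedder)
            for name, embedder in old_settings["embedders"].items()
        }
    }


old_settings = {
    "embedders": {
        "image": {"source": {"userProvided": {"dimensions": 512}}},
        "default": {
            "source": {
                "openAi": {
                    "apiKey": "<your-OpenAI-API-key>",
                    "model": "text-embedding-ada-002",
                }
            },
            "documentTemplate": {"template": "A movie titled '{{doc.title}}'"},
        },
    }
}

new_settings = flatten_settings(old_settings)
print(json.dumps(new_settings, indent=2))
```

The sketch only reshapes the payload; it does not check that the options are valid for the chosen source.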

Contributors

irevoire, mdubus, and dureuill

20 Dec 15:27

curquiza

v1.6.0-rc.2

de2ca70

v1.6.0-rc.2 Pre-release

Fixes

Fix failure when setting a model in the embedders index settings (#4272) @dureuill

Thanks to @scottaglia for the report ❤️

Contributors

scottaglia and dureuill

19 Dec 15:06

curquiza

v1.6.0-rc.1

fb9db1e

🦊 v1.6.0-rc.1 Pre-release

⚠️ Use this RC instead of rc0 if you use Docker

Fixes

Docker image now works again (#4268 & #4269) @sanders41 and @dureuill

Thanks again to our dear contributor @sanders41 for the report and the fix! ❤️

Contributors

sanders41 and dureuill

18 Dec 14:43

curquiza

v1.6.0-rc.0

248aaa6

🦊 v1.6.0-rc.0 Pre-release

⚠️ If you use the Meilisearch Docker image, please use rc1 or later instead.

⚠️ Since this is a release candidate (RC), we do NOT recommend using it in a production environment. Is something not working as expected? We welcome bug reports and feedback about new features.

Since we know the indexing time of Meilisearch is a real pain point for some of our users, Meilisearch v1.6 focuses mainly on indexing performance. But this new version is not only about optimization: Meilisearch now includes embedders for vector search. You can benefit from the power of Meilisearch with semantic and hybrid searches!

New features and improvements 🔥

Experimental: improve vector search

Meilisearch introduces a hybrid search mechanism that allows users to mix full-text and semantic search at search time to provide more accurate and comprehensive results.

Plus, you can directly define the embedders you want to use, so you don't need to interact with a third party on your side to generate embeddings: Meilisearch will interact with it for you.

Settings

Before using hybrid search, you need to define an embedder in your settings. You can even define multiple embedders in the index settings.

You must set them via the PATCH /indexes/:index_uid/settings route. Here is an example of a payload defining 3 embedders named default, image and translation:

{
  "embedders": {
    "default": {
      "source": {
        "openAi": {
          "apiKey": "<your-OpenAI-API-key>",
          "model": "text-embedding-ada-002"
        }
      },
      "documentTemplate": {
        "template": "A movie titled '{{doc.title}}' whose description starts with {{doc.overview|truncatewords: 20}}"
      }
    },
    "image": {
      "source": { "userProvided": { "dimensions": 512 } }
    },
    "translation": {
      "source": {
        "huggingFace": { "model": "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2" }
      },
      "documentTemplate": {
        "template": "A movie titled '{{doc.title}}' whose description starts with {{doc.overview|truncatewords: 20}}"
      }
    }
  }
}

documentTemplate is a view of your document that will serve as the base for computing the embedding. This field is a JSON string expecting Liquid format.

3 kinds of embedders are available for source:

openAi: uses the OpenAI API to auto-embed your documents, so your OpenAI API key must be specified.
huggingFace: downloads the provided model from the HuggingFace Hub and runs it locally to auto-embed the documents and the query.
userProvided: you send document vectors and query vectors, meaning you have computed the embeddings on your side. It works like vector search in Meilisearch v1.3, except that an embedder must now be defined explicitly, which wasn't required before (see the section about breaking changes below).

⚠️ If using the HuggingFace model, the computation will be done on your machine and will use your CPU (not your GPU), which can lead to bad indexing performance.
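
As a sketch of how such a payload could be sent from Python's standard library: the host `http://localhost:7700`, the index name `movies`, and the helper name `build_settings_request` are assumptions made for illustration, and the payload uses the nested format shown above for this RC. The request is built but not sent.

```python
import json
import urllib.request


def build_settings_request(base_url: str, index_uid: str, embedders: dict) -> urllib.request.Request:
    """Build (without sending) a PATCH request that sets the embedders of an index."""
    body = json.dumps({"embedders": embedders}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/indexes/{index_uid}/settings",
        data=body,
        method="PATCH",
        headers={"Content-Type": "application/json"},
    )


request = build_settings_request(
    "http://localhost:7700",  # assumed local instance
    "movies",                 # hypothetical index name
    {"image": {"source": {"userProvided": {"dimensions": 512}}}},
)
print(request.get_method(), request.full_url)
# Sending it would be: urllib.request.urlopen(request)
```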

Hybrid & semantic search

You can perform a hybrid search by using the hybrid field when calling the POST /indexes/:index_uid/search route.

Here is an example of a hybrid search payload:

{
    "q": "Plumbers and dinosaurs",
    "hybrid": {
        "semanticRatio": 0.9,
        "embedder": "default"
    }
}

embedder: the embedder used to perform the search, chosen among the ones you defined in your settings.
semanticRatio: a value between 0 and 1. The default value is 0.5. 1 corresponds to a fully semantic search, whereas 0 corresponds to a full-text search.
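
The two fields can be assembled and range-checked before the search request is made. A minimal sketch, where the helper name `hybrid_search_payload` is hypothetical and the client-side check simply mirrors the 0 to 1 range described above:

```python
import json


def hybrid_search_payload(query: str, embedder: str, semantic_ratio: float = 0.5) -> dict:
    """Build the JSON body of a hybrid search."""
    # 0 means full-text only, 1 means semantic only; anything else is a mix.
    if not 0.0 <= semantic_ratio <= 1.0:
        raise ValueError("semanticRatio must be between 0 and 1")
    return {
        "q": query,
        "hybrid": {"semanticRatio": semantic_ratio, "embedder": embedder},
    }


payload = hybrid_search_payload("Plumbers and dinosaurs", "default", 0.9)
print(json.dumps(payload, indent=4))
```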

⚠️ Breaking changes for beta users of the previous version of vector search

For people who used Meilisearch with the experimental vector search feature (between v1.3.0 and v1.5.0), some changes happened in the API usage:

Before, it was possible to send your vectors and use vector search without defining any model. Now, you must define an embedder in the settings.

"embedders": {
    "default": {
      "source": { "userProvided": { "dimensions": 512 } }
    }
}

Meilisearch now supports multiple embedders, so the format for sending documents and vectors changed. Vectors are not arrays anymore but JSON objects.

Before, in your document you provided:

"_vectors": [
  [0.0, 0.1]
]

Now the format is:

"_vectors": {
  "image2text": [0.0, 0.1, ...]
}

To learn more about the new usage, refer to the sections above about settings or to the documentation.
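
Existing documents can be adapted with a small conversion step. A sketch under stated assumptions: the old field is the array-of-arrays shape shown above with a single vector, and the embedder name (`image2text` here) is whichever name you defined in your settings. The helper `migrate_vectors` is hypothetical.

```python
def migrate_vectors(document: dict, embedder_name: str) -> dict:
    """Return a copy of `document` whose `_vectors` field uses the object format."""
    vectors = document.get("_vectors")
    if not isinstance(vectors, list):
        # No vectors, or already in the object format: nothing to convert.
        return dict(document)
    if len(vectors) != 1:
        # Only the single-vector case is shown in the notes, so the sketch stops here.
        raise ValueError("expected exactly one vector in the old array format")
    migrated = dict(document)
    migrated["_vectors"] = {embedder_name: vectors[0]}
    return migrated


old_doc = {"id": 1, "_vectors": [[0.0, 0.1]]}
new_doc = migrate_vectors(old_doc, "image2text")
print(new_doc)
```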

More technical information

You can check out:

this article
Arroy, the open-source repository based on Annoy and written in Rust. It is created and maintained by the Meilisearch engine team. Arroy is a Rust library for finding vectors that are close to a given query vector.

Done in #4226 by @dureuill, @irevoire, @Kerollmops and @ManyTheFish.

Improve indexing speed

This version introduces huge indexing performance improvements. Meilisearch has been optimized to

store and pre-compute less data than in the previous versions
re-index and delete only the necessary data when updating a document. For instance, if updating only one field in the document, Meilisearch will only recompute this field and will no longer re-index the complete document.
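
The second point can be pictured as a field-level diff between the stored document and its update. The sketch below is only a toy illustration of that idea, not Meilisearch's actual implementation; `changed_fields` is a hypothetical helper.

```python
def changed_fields(old_doc: dict, new_doc: dict) -> set:
    """Return the names of the fields that were added, removed, or modified."""
    missing = object()  # sentinel, so a field explicitly set to None still compares correctly
    all_fields = old_doc.keys() | new_doc.keys()
    return {
        field
        for field in all_fields
        if old_doc.get(field, missing) != new_doc.get(field, missing)
    }


stored = {"id": 1, "title": "Alien", "overview": "A crew answers a distress call."}
update = {"id": 1, "title": "Aliens", "overview": "A crew answers a distress call."}

# Only the fields in this set would need to be re-indexed in this toy model.
print(changed_fields(stored, update))
```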

Some metrics: on an e-commerce dataset of 2.5Gb of documents, we measured a time reduction of more than 50% when adding documents for the first time. In a scenario where documents are updated frequently and partially, the reduction ranges from about 50% to 75%. Most of all, the indexing time no longer increases exponentially.

⚠️ Performance improvements can highly depend on your dataset, the size of your machine, and the way you index documents.

Done in #4090 by @ManyTheFish, @dureuill and @Kerollmops.

Disk space usage reduction

We made improvements regarding disk space usage. Meilisearch now stores less internal data, and therefore requires a smaller database on your disk.

With a ~15Mb dataset, the created database is between 40% and 50% smaller. Additionally, after several updates, the database size becomes more stable, which was not the case before. So, the more documents you add, the more visible this improvement will be.

Customize proximity precision to gain indexing performance

Still with the goal of reducing indexing time, you can now customize the accuracy of the proximity ranking rule based on your needs.

The computation needed for the proximity ranking rule is heavy and can lead to long indexing times. Since the precision this rule brings to search relevancy is not always necessary for your use case, you can now make it less precise to reduce indexing time. Depending on your use case, the impact on relevancy can even be invisible.

Use the proximityPrecision setting:

curl \
  -X PATCH 'http://localhost:7700/indexes/books/settings' \
  -H 'Content-Type: application/json' \
  --data-binary '{
    "proximityPrecision": "byAttribute"
  }'

The default value of proximityPrecision is byWord. byAttribute will improve your indexing performance but can impact the relevancy.

Technical explanation: byWord computes the proximity as the exact distance between words, whereas byAttribute only considers whether the words appear in the same attribute, which makes it less accurate.
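
To make the difference concrete, here is a toy illustration of the two notions of proximity. It is a sketch for intuition only and does not reflect how Meilisearch computes or stores proximity internally; all function names are hypothetical, and tokenization is a plain whitespace split.

```python
from itertools import product


def word_positions(document: dict, word: str) -> list:
    """Return (attribute, position) pairs for every occurrence of `word`."""
    return [
        (attribute, position)
        for attribute, text in document.items()
        for position, token in enumerate(text.lower().split())
        if token == word
    ]


def proximity_by_word(document: dict, first: str, second: str):
    """Smallest word distance between the two words inside one attribute, or None."""
    distances = [
        abs(pos_a - pos_b)
        for (attr_a, pos_a), (attr_b, pos_b) in product(
            word_positions(document, first), word_positions(document, second)
        )
        if attr_a == attr_b
    ]
    return min(distances) if distances else None


def proximity_by_attribute(document: dict, first: str, second: str) -> bool:
    """True when both words occur in at least one common attribute."""
    attrs_first = {attr for attr, _ in word_positions(document, first)}
    attrs_second = {attr for attr, _ in word_positions(document, second)}
    return bool(attrs_first & attrs_second)


doc = {
    "title": "the dark knight",
    "overview": "a knight rises when the city turns dark",
}
print(proximity_by_word(doc, "dark", "knight"))
print(proximity_by_attribute(doc, "dark", "knight"))
```

In this toy model, the by-word variant needs every word position, while the by-attribute variant only needs to know which attributes contain each word. That difference in bookkeeping is the intuition behind the indexing gain.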

Done in #4225 by @ManyTheFish.

Experimental: limit the number of batched tasks

To speed up indexing, Meilisearch groups similar tasks and processes them as one big batch. However, a very large number of enqueued tasks can sometimes cause Meilisearch to crash or get stuck.

To limit the number of batched tasks, configure it at launch: use the environment variable MEILI_EXPERIMENTAL_MAX_NUMBER_OF_BATCHED_TASKS, the CLI argument --experimental-max-number-of-batched-tasks, or set it directly in the config file.
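
As a sketch of the environment-variable route when launching Meilisearch from a script: the limit of 100 is an arbitrary example value, the helper name is hypothetical, and the launch line assumes a `meilisearch` binary on your PATH.

```python
import os


def meilisearch_launch_env(max_batched_tasks: int, base_env=None) -> dict:
    """Return a copy of the environment with the batched-task limit set."""
    # Copy so that the caller's environment is left untouched.
    env = dict(os.environ if base_env is None else base_env)
    env["MEILI_EXPERIMENTAL_MAX_NUMBER_OF_BATCHED_TASKS"] = str(max_batched_tasks)
    return env


env = meilisearch_launch_env(100, base_env={})
print(env)
# To launch with it, omit base_env so the current environment is inherited:
#   subprocess.run(["meilisearch"], env=meilisearch_launch_env(100))
```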

Done in #4249 by @Kerollmops.

Fixes 🐞

The dump tasks are now cancellable (#4208) @irevoire
Fix: the payload size limit is now also applied on /documents/delete-batch (#4231) @Karribalu
Fix typo tolerance being ineffective for attributes with similar content (related issue: #4256)
Fix: the geosort is no longer ignored after the first bucket of a preceding sort ranking rule (#4226)

Misc

Dependencies upgrade
- Updating CI dependencies
- Update to heed 0.20 (#4223) @Kerollmops
- Set rust toolchain to 1.71.1 in dockerfile (#4261) @dureuill
Documentation
- Remove banner (#4191) @curquiza
Misc
- Extract the creation and last updated timestamp from v2 dumps (#4132) @vivek-26
- Fix puffin in the index scheduler (#4234) @irevoire
- Remove the actix-web dependency from (#4239) @Kerollmops

❤️ Thanks again to our external contributors:

Meilisearch: @Karribalu, and @vivek-26
Charabia: TBD

Contributors

Kerollmops, ManyTheFish, and 5 other contributors

13 Dec 09:43

Kerollmops

v1.5.1

afa8f27

v1.5.1 🦙

Bug fix

Fix search on exact attributes using attributesToSearchOn (#4233) @ManyTheFish

Thanks @tobz-nz, for the bug report, and @ManyTheFish, for the implementation ❤️

Contributors

tobz-nz and ManyTheFish

20 Nov 09:16

Kerollmops

v1.5.0

b11f85a

v1.5.0 🦙

Meilisearch v1.5 introduces improvements in indexing speed and the possibility of triggering snapshots on demand.

🧰 All official Meilisearch integrations (including SDKs, clients, and other tools) are compatible with this Meilisearch release. Integration deployment happens between 4 and 48 hours after a new version becomes available.

Some SDKs might not include all new features—consult the project repository for detailed information. Is a feature you need missing from your chosen SDK? Create an issue letting us know you need it, or, for open-source karma points, open a PR implementing it (we'll love you for that ❤️).

New features and improvements 🔥

Indexing speed improvements

v1.5 improves indexing speed for text-heavy datasets. Datasets with fields containing more than 100 words should see a reduction of 5% to 20% in indexing times, with gains proportional to the number of words in a document.

This might have a minor impact on search result relevancy for queries containing 4 words or more. Contact us on our GitHub Discussions page if this is significantly affecting your application.

Indexing speed improvements might not be visible in datasets with fewer than 20 words per field, regardless of how many fields each document contains.

Done by @ManyTheFish in #4131

Snapshots on-demand

This release introduces a new /snapshots API route for creating snapshots on demand:

curl -X POST 'http://localhost:7700/snapshots'

This route returns a summarized task object.

By default, Meilisearch creates snapshots inside the /snapshots directory. You can customize this directory with the --snapshot-dir configuration option.

Done by @irevoire in #4051.

Experimental feature: Export Puffin reports

This experimental feature allows Meilisearch to automatically export .puffin reports. .puffin files provide information on Meilisearch's internal processes and may be useful when diagnosing performance issues.

Use the /experimental-features endpoint to activate this feature:

curl \
  -X PATCH 'http://localhost:7700/experimental-features/' \
  -H 'Content-Type: application/json'  \
  --data-binary '{
    "exportPuffinReports": true
  }'

📣 Consult the GitHub discussion for more information.

Done by @Kerollmops in #4073.

Other improvements

The experimental /metrics route can now be activated via HTTP. Done by @braddotcoffee with the review of @vivek-26 in #4126. ⚠️ Avoid using the CLI flag and the API at the same time when managing experimental features.
Add Khmer language support (#4169 and meilisearch/charabia#203) @xshadowlegendx and @ManyTheFish
Integrate the meilitool command line interface into the meilisearch Docker image (#4167) @Kerollmops
This tool provides commands to enforce the cancellation of tasks and the creation of dumps for stuck Meilisearch instances.
In the running Meilisearch container, just do meilitool --help to get the usage.

Fixes 🐞

Throw an error when the vector in a search query does not match the size of the already indexed vectors (#4204) @dureuill
Prevent the search on the processing index from hanging (#4205) @dureuill

Misc

Update dependencies
- Bump webpki to 0.22.2 (#4101)
- Bump rustls-webpki from 0.100.1 to 0.100.2 (#4009)
CIs and tests
- Add CI to trigger benchmarks in PR (#4102) @Kerollmops
- Improve test-suite.yml to prevent CI from failing when disabling tokenization (#4005) @harshau007
- Add more integrations to SDK CI (#4044) @curquiza
- Dependency issue is now created every 6 months (#4065) @curquiza
- Rename benchmark CI file so it is easier to find in the manifest list (#4125) @curquiza
- Update CI dependencies
- Fix warning in CI (#4174) @irevoire
Misc
- Enable analytics in debug builds (#4074) @irevoire
- Rewrite segment_analytics module with destructuring syntax (#4056) @vivek-26

❤️ Thanks again to our external contributors:

Meilisearch: @braddotcoffee, @harshau007, and @vivek-26.
Charabia: @choznerol and @xshadowlegendx.

Contributors

Kerollmops, ManyTheFish, and 8 other contributors

14 Nov 08:56

Kerollmops

v1.5.0-rc.3

b11f85a

v1.5.0-rc.3 🦙 Pre-release

Fixes

Throw error when the vector search query does not match the size of the already indexed vectors by @dureuill in #4204
Prevent the search on the processing index from hanging by @dureuill in #4205

Contributors

dureuill

31 Oct 16:45

curquiza

v1.5.0-rc.2

54f0ee1

v1.5.0-rc.2 🦙 Pre-release

Enhancement

Integrate the meilitool command line interface in the meilisearch Docker image (#4167) @Kerollmops
In the running Meilisearch container, just do meilitool --help to get the usage.

Contributors

Kerollmops

30 Oct 12:23

curquiza

v1.5.0-rc.1

2614e7d

v1.5.0-rc.1 🦙 Pre-release

Enhancement

Add Khmer language support by bumping Charabia version to v0.8.5 (#4169) @ManyTheFish and @xshadowlegendx

Fixes

Update the version to the right release (v1.5.0 instead of v1.4.0) (#4154)
Fix warning in CI (#4174) @irevoire

Thanks @choznerol and @xshadowlegendx for the contributions on Charabia ❤️

Contributors

ManyTheFish, irevoire, and 2 other contributors

Releases: meilisearch/meilisearch
