cuckooFilter threshold param
iverase committed Jun 25, 2021
1 parent 7ab9b97 commit 1168913
Showing 13 changed files with 567 additions and 31 deletions.
@@ -227,6 +227,15 @@ CuckooFilters are described in more detail in the paper:
https://www.cs.cmu.edu/~dga/papers/cuckoo-conext2014.pdf[Fan, Bin, et al. "Cuckoo filter: Practically better than bloom."]
Proceedings of the 10th ACM International Conference on Emerging Networking Experiments and Technologies. ACM, 2014.


==== Threshold
The `threshold` parameter defines how many unique values are collected exactly, using the naive approach, so that
doc counts are accurate. Above this value, cuckoo filters are used to collect the values, and doc counts may
become approximate.

The value of this parameter must be a positive integer with a maximum supported value of `500000`.
The default value is `10000`.
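
The constraints above can be sketched as a small client-side check. This is only an illustration, not part of this commit: the helper names `validate_threshold` and `rare_terms_body`, the aggregation name `rare_values`, and the field name `genre` are hypothetical, and the placement of `threshold` inside the `rare_terms` body is assumed from the parameter description above.

```python
from typing import Any, Dict

DEFAULT_THRESHOLD = 10000  # default value stated in the text above
MAX_THRESHOLD = 500000     # maximum supported value stated in the text above


def validate_threshold(threshold: int = DEFAULT_THRESHOLD) -> int:
    """Return the threshold if it is a positive integer no larger than the maximum."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(threshold, bool) or not isinstance(threshold, int):
        raise ValueError("threshold must be an integer")
    if threshold < 1 or threshold > MAX_THRESHOLD:
        raise ValueError(f"threshold must be between 1 and {MAX_THRESHOLD}")
    return threshold


def rare_terms_body(field: str,
                    threshold: int = DEFAULT_THRESHOLD,
                    agg_name: str = "rare_values") -> Dict[str, Any]:
    """Build a request body for a rare_terms aggregation (hypothetical helper).

    Assumes `threshold` sits next to `field` inside the aggregation definition.
    """
    return {
        "aggs": {
            agg_name: {
                "rare_terms": {
                    "field": field,
                    "threshold": validate_threshold(threshold),
                }
            }
        }
    }
```

For example, `rare_terms_body("genre", threshold=20000)` would request exact collection for the first 20000 unique values, while a value such as `0` or `500001` is rejected before any request is sent.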

==== Precision

Although the internal CuckooFilter is approximate in nature, the false-negative rate can be controlled with a
@@ -352,4 +361,4 @@ that require `depth_first`. In particular, scoring sub-aggregations that are ins
in `depth_first` mode. This will throw an exception since RareTerms is unable to process `depth_first`.

As a concrete example, if a `rare_terms` aggregation is the child of a `nested` aggregation, and one of the child aggregations of `rare_terms`
needs document scores (like a `top_hits` aggregation), this will throw an exception.