Commit

fix rst
skadilover committed Sep 20, 2024
1 parent 9360427 commit b963239
Showing 1 changed file with 15 additions and 5 deletions.
20 changes: 15 additions & 5 deletions velox/docs/functions/presto/regexp.rst
@@ -32,11 +32,21 @@ limited to 20 different expressions per instance and thread of execution.
SELECT like('abc', '%b%'); -- true
SELECT like('a_c', '%#_%', '#'); -- true

String sequence search: There are some patterns that are equivalent to simple
string searches. For such constant patterns without custom-escape, velox uses
substring searches instead of regex searches, for example:
like("hello velox", "%hello%velox%")
is equivalent to searching for the strings "hello", "velox" in sequence.
Not all patterns require compilation of regular expressions.

| patterns | description | fast search method |
| :------------ |:----------------------------:| -------------------:|
| kExactlyN | Pattern containing wildcard character '_' only, such as _, __, ____. | only check string length equal to N |
| kAtLeastN | Pattern containing wildcard characters ('_' or '%') only with at least one '%', such as ___%, _%__. | only check string length >=N |
| kFixed | Pattern with no wildcard characters, such as 'presto', 'foo' | only check if input string equals pattern |
| kRelaxedFixed | Pattern with single wildcard chars(_) & normal chars, such as '_pr_es_to_'. | only check if input string equals sub-patterns, ignore wildcard chars |
| kPrefix | Fixed pattern followed by one or more '%', such as 'hello%', 'foo%%%%'. | only check if the prefix of input string matches pattern |
| kRelaxedPrefix | kRelaxedFixed pattern followed by one or more '%', such as '_pr_es_to_%', '_pr_es_to_%%%%'. | only check if the prefix of input string matches pattern, ignore wildcard chars |
| kSuffix | Fixed pattern preceded by one or more '%', such as '%foo', '%%%hello'. | only check if the suffix of input string matches pattern |
| kRelaxedSuffix | kRelaxedFixed preceded by one or more '%', such as '%_pr_es_to_', '%%%_pr_es_to_'. | only check if the suffix of input string matches pattern, ignore wildcard chars |
| kSubstring | Patterns matching '%{c0}%', such as '%foo%%', '%%%hello%'. | only check if the input string contains the pattern |
| kSubstrings | Patterns matching '%{c0}%{c1}%', such as '%%foo%%bar%%', '%foo%bar%'. Note: unlike kSubstring, kSubstrings applies only to constant patterns, because pattern parsing is expensive, and it does not support wildcard chars (_) or escape chars (#), such as '%%foo_bar%pre_sto%', '%%foo#bar#pre#sto%'. | search for the sub-strings in sequence; for example, "%pr%es%to%" is equivalent to searching for the strings "pr", "es", "to" in sequence |
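
The fast paths in the table can be illustrated with a short Python sketch. This is not Velox's implementation (which is C++); the names ``find_in_sequence`` and ``like_fast_path`` are hypothetical, and the sketch assumes a pattern with no custom escape character. It covers only the kinds that need no per-character wildcard handling and returns ``None`` for everything else, where a real engine would use another strategy such as a compiled regular expression.

```python
from typing import Optional, Sequence


def find_in_sequence(text: str, parts: Sequence[str]) -> bool:
    """Return True if every part occurs in text, in order, without overlap."""
    pos = 0
    for part in parts:
        idx = text.find(part, pos)
        if idx < 0:
            return False
        pos = idx + len(part)  # the next part must start after this match
    return True


def like_fast_path(text: str, pattern: str) -> Optional[bool]:
    """Evaluate LIKE using string operations only (hypothetical helper).

    Assumes no escape character. Returns None when the pattern is not
    one of the kinds handled by this sketch.
    """
    # kExactlyN / kAtLeastN: the pattern consists of wildcards only.
    if all(c in "_%" for c in pattern):
        n = pattern.count("_")
        return len(text) >= n if "%" in pattern else len(text) == n

    # The kRelaxed* kinds mix '_' with normal chars; not covered here.
    if "_" in pattern:
        return None

    leading = pattern.startswith("%")
    trailing = pattern.endswith("%")
    core = pattern.strip("%")

    if "%" in core:
        # kSubstrings: '%{c0}%{c1}%' needs '%' on both ends.
        if leading and trailing:
            parts = [p for p in core.split("%") if p]
            return find_in_sequence(text, parts)
        return None

    if leading and trailing:
        return core in text           # kSubstring
    if trailing:
        return text.startswith(core)  # kPrefix
    if leading:
        return text.endswith(core)    # kSuffix
    return text == core               # kFixed
```

For instance, ``like_fast_path("hello velox", "%hello%velox%")`` searches for "hello" and then for "velox" after it, while ``like_fast_path("presto", "_pr_es_to_")`` returns ``None`` because the relaxed kinds are left out of the sketch.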

.. function:: regexp_extract(string, pattern) -> varchar
