Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESQL: review FunctionInfo for all functions #106346

Closed
astefan opened this issue Mar 14, 2024 · 4 comments · Fixed by #106454
Closed

ESQL: review FunctionInfo for all functions #106346

astefan opened this issue Mar 14, 2024 · 4 comments · Fixed by #106454
Assignees
Labels
:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)

Comments

@astefan
Copy link
Contributor

astefan commented Mar 14, 2024

Description

#106222 revealed the fact that the information outputted by show functions is inconsistent in at least 4 functions.

show functions | where name == \"coalesce\" or name == \"case\" or name == \"greatest\" or name == \"least\" or name == \"concat\" | keep name, variadic, optionalArgs, arg*

     name      |   variadic    | optionalArgs  |        argNames         |                                                 argTypes                                                  |                   argDescriptions                    
---------------+---------------+---------------+-------------------------+-----------------------------------------------------------------------------------------------------------+------------------------------------------------------
case           |true           |[false, false] |[condition, rest]        |[boolean, boolean|cartesian_point|date|double|geo_point|integer|ip|keyword|long|text|unsigned_long|version]|[, ]                                                  
coalesce       |true           |[false, false] |[expression, expressionX]|[boolean|text|integer|keyword|long, boolean|text|integer|keyword|long]                                     |[Expression to evaluate, Other expression to evaluate]
concat         |true           |[false, false] |[first, rest]            |[keyword|text, keyword|text]                                                                               |[, ]                                                  
greatest       |true           |[false, false] |[first, rest]            |[integer|long|double|boolean|keyword|text|ip|version, integer|long|double|boolean|keyword|text|ip|version] |[, ]                                                  
least          |true           |[false, false] |[first, rest]            |[integer|long|double|boolean|keyword|text|ip|version, integer|long|double|boolean|keyword|text|ip|version] |[, ]                                                  
  1. case has at least two required arguments. So, optionalArgs is correctish. But argNames is not: [condition, rest], it should be [condition, value, rest]. If argNames has three values ("rest" representing the "variadic" part of the arguments) then optionalArgs should be [false, false, true].
  2. concat has one required arguments. optionalArgs is wrong, as well: if it shows two arguments, then the second one should be true (the variadic, optional part).
  3. coalesce has one required argument. This means that [false, false] is wrong as well. It should either be [false] or [false, true] (to follow the logic from the two statements above). The same wrong output is for greatest and least.
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Mar 14, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@fang-xing-esql
Copy link
Member

Coalesce, greatest, least and concat are more straightforward, the second input parameter can be made optional, if it is not provided, the function will return the first.

CASE is different and probably worth a bit more discussion, as it requires at least two input parameters - condition and value, the second input cannot be empty. The result of show functions today is not incorrect, it is not clear enough. There are two options I can think of for CASE:

  1. Keep the number of input parameters as is - two and both are required, rename the second input parameter to valueAndRest, or a better name to indicate at least one item needs to be specified in there.
  2. Make CASE take three input parameters - required, required, optional. Today EsqlFunctionRegistry relies on the def methods in FunctionRegistry in ql to define a function, we can add a new def method there to accept two mandatory inputs followed by a veridic list.

As #106222 (comment) does not complain about the inconsistence for CASE, both options are OK to me. @astefan Could you let me know whether you prefer option 1 or 2? Thank you!

@dej611
Copy link
Contributor

dej611 commented Mar 18, 2024

I think the best approach for case there is to provide a better description with many examples. We can work on that, but it would be great to have the description coming from show/meta functions.

@astefan
Copy link
Contributor Author

astefan commented Mar 18, 2024

@fang-xing-esql I think the show functions output should look like this:

     name      |   variadic    | optionalArgs  |        argNames         |                                                 argTypes                                                  
---------------+---------------+---------------+-------------------------+-----------------------------------------------------------------------------------------------------------
case           |true           |[false, false] |[condition, value]       |[boolean, boolean|cartesian_point|date|double|geo_point|integer|ip|keyword|long|text|unsigned_long|version]
coalesce       |true           |[false]        |[expression]             |[boolean|text|integer|keyword|long]                                     
concat         |true           |[false, false] |[string1, string2]       |[keyword|text, keyword|text]                                                                               
greatest       |true           |[false]        |[first]                  |[integer|long|double|boolean|keyword|text|ip|version] 
least          |true           |[false]        |[first]                  |[integer|long|double|boolean|keyword|text|ip|version] 
ltrim          |false          |false          |str                      |keyword|text                                                                                               
round          |false          |[false, true]  |[value, decimals]        |[double, integer]                                                                                          
rtrim          |false          |false          |str                      |keyword|text                                                                                               
trim           |false          |false          |str                      |keyword|text

Meaning:

  • optionalArgs should contain only the non-variadic arguments. The value of variadic as true implies that there can be other arguments for that specific function.
  • argNames and argTypes should be in line with optionalArgs
  • the argument names should be, as much as possible, in line with those from the documentation. For example, in my table above for concat I used string1 and string2 (taken from documentation).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo)
Projects
None yet
4 participants