Skip to content
This repository has been archived by the owner on Aug 2, 2022. It is now read-only.

Wildcard syntax problem #368

Closed
aetter opened this issue Feb 17, 2020 · 8 comments
Closed

Wildcard syntax problem #368

aetter opened this issue Feb 17, 2020 · 8 comments
Assignees
Labels
bug Something isn't working SQL

Comments

@aetter
Copy link
Contributor

aetter commented Feb 17, 2020

See opendistro/for-elasticsearch-docs#177

With the normal elasticsearch docker install (not ODFE), the following returns me what I want (phrases starting with 'graphic')
query="SELECT * FROM company_1 WHERE name LIKE 'graphic%'"
example results: graphic company 1, graphic llc

With ODFE, the same query returns me any words containing 'graphic'. It just seems the like statement isn't performing the sql like it should.
example results: alpha graphic inc, california graphics corporation

@abbashus
Copy link
Contributor

Thanks @aetter , we will have a look.

@dai-chen dai-chen added the question Further information is requested label Feb 17, 2020
@dai-chen
Copy link
Member

Hi @mlrlite, could you post your index mapping and several documents so we can reproduce the issue from our side?

@aetter
Copy link
Contributor Author

aetter commented Feb 17, 2020

Hi @dai-chen, try indexing these:

curl -XPOST https://localhost:9200/company_1/_doc -H 'Content-Type: application/json' -d '{"name": "graphic llc"}' -u admin:admin -k
curl -XPOST https://localhost:9200/company_1/_doc -H 'Content-Type: application/json' -d '{"name": "graphic company 1"}' -u admin:admin -k
curl -XPOST https://localhost:9200/company_1/_doc -H 'Content-Type: application/json' -d '{"name": "alpha graphic inc"}' -u admin:admin -k
curl -XPOST https://localhost:9200/company_1/_doc -H 'Content-Type: application/json' -d '{"name": "california graphics corporation"}' -u admin:admin -k

Then run:

POST https://localhost:9200/_opendistro/_sql
{
  "query": "SELECT * FROM company_1 WHERE name LIKE 'graphic%'"
}

curl is doing something weird on the encoding, so I'm just using the shorthand, but I assume you could also get it going with curl.

@dai-chen
Copy link
Member

@aetter Thanks! Will try it out.

@dai-chen
Copy link
Member

dai-chen commented Feb 17, 2020

@aetter and @mlrlite, I can reproduce from my side now. For your context, there is a rewriter in our codebase to rewrite field name in your query if it is a multi-field (ex. keyword inside text). So your query should be rewritten to WHERE name.keyword LIKE 'graphic%' which works as you expect. Unfortunately, we didn't cover this case so your original query is executed as it is.

Code related:

).anyMatch(operator -> operator == ((SQLBinaryOpExpr) expr).getOperator());

We will fix the issue and for now you can do SELECT * FROM company_1 WHERE name.keyword LIKE 'graphic%'. Let me know if this doesn't work or yours is actually different case.

Here is how I reproduced the issue with our latest code in Kibana:

PUT company_1/_doc/1
{
  "name": "graphic llc"
}

PUT company_1/_doc/2
{
  "name": "graphic company 1"
}

PUT company_1/_doc/3
{
  "name": "alpha graphic inc"
}

PUT company_1/_doc/4
{
  "name": "california graphics corporation"
}
GET company_1/_mappings

{
  "company_1" : {
    "mappings" : {
      "properties" : {
        "name" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        }
      }
    }
  }
}
POST _opendistro/_sql
{
  "query": "SELECT * FROM company_1 WHERE name LIKE 'graphic%'"
}
{
  "schema": [{
    "name": "name",
    "type": "text"
  }],
  "total": 4,
  "datarows": [
    ["california graphics corporation"],
    ["graphic company 1"],
    ["alpha graphic inc"],
    ["graphic llc"]
  ],
  "size": 4,
  "status": 200
}
POST _opendistro/_sql
{
  "query": "SELECT * FROM company_1 WHERE name.keyword LIKE 'graphic%'"
}
{
  "schema": [{
    "name": "name",
    "type": "text"
  }],
  "total": 2,
  "datarows": [
    ["graphic company 1"],
    ["graphic llc"]
  ],
  "size": 2,
  "status": 200
}

@dai-chen dai-chen added bug Something isn't working good first issue Good for newcomers and removed question Further information is requested labels Feb 17, 2020
@mlrlite
Copy link

mlrlite commented Feb 17, 2020

@dai-chen thank you, I will try this and get back with you, thanks for looking into this!

@penghuo
Copy link
Contributor

penghuo commented Apr 3, 2020

Another similar issue.

PUT issue/_doc/1
{
  "type":"issue_snapshots:SSLError"
}

GET issue/_mapping
{
  "issue" : {
    "mappings" : {
      "properties" : {
        "type" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }}}}}}}

Query on type.keyword success

GET /issue/_search
{
  "from": 0,
  "size": 20,
  "query": {
      "wildcard": {
          "type.keyword": {
              "value": "issue_snapshots:S*",
              "boost": 1.0
          }
    }
  }
}

{
  "took" : 14,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "issue",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "type" : "issue_snapshots:SSLError"
        }
      }
    ]
  }
}

Query on type return empty result

GET /issue/_search
{
  "from": 0,
  "size": 20,
  "query": {
      "wildcard": {
          "type": {
              "value": "issue_snapshots:S*",
              "boost": 1.0
          }
      }
  }
}

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}

Why wildcard query on text field return empty result?

The text field query based on token. the "issue_snapshots:SSLError" has been analyzed as "issue_snapshots:sslerror". so any expected string "issue_snapshots:S*" doesn't match it.

GET /_analyze
{
  "analyzer" : "standard",
  "text" : "issue_snapshots:SSLError"
}

{
  "tokens" : [
    {
      "token" : "issue_snapshots:sslerror",
      "start_offset" : 0,
      "end_offset" : 24,
      "type" : "<ALPHANUM>",
      "position" : 0
    }
  ]
}

@dai-chen
Copy link
Member

This is already fixed in new query engine. However, the new engine is disabled by default for now. You can enable it by following the instruction here: https://github.com/opendistro-for-elasticsearch/sql/blob/master/docs/user/admin/settings.rst#opendistro-sql-engine-new-enabled

@dai-chen dai-chen assigned dai-chen and unassigned zhongnansu Aug 19, 2020
@dai-chen dai-chen added SQL and removed good first issue Good for newcomers labels Aug 24, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working SQL
Projects
None yet
Development

No branches or pull requests

6 participants