Speed of Lucene index is lost by expand function #7880

Closed
rdelangh opened this issue Nov 12, 2017 · 4 comments
@rdelangh

OrientDB Version: 2.2.27

Java Version: n/a

OS: Ubuntu

Expected behavior

Using a Lucene index provides very fast lookup of the rowids. Retrieving useful information from the record columns based on these rowids should require only slightly more time.

Actual behavior

  1. the query from the index goes lightspeed fast:
    orientdb {db=cdrarch}> SELECT count(rid) FROM INDEX:idx_myindex WHERE key LUCENE ' columnA:2219 AND columnB:[2017110608 TO 2017110610] '

    +----+-----+
    |#   |count|
    +----+-----+
    |0   |24830|
    +----+-----+

1 item(s) found. Query executed in 0.468 sec(s).

  2. trying to get useful information from the actual records, whose rowids were found via the index, takes ages:
    orientdb {db=cdrarch}> select count(*) FROM (SELECT expand(rid) FROM INDEX:idx_myindex WHERE key LUCENE ' columnA:2219 AND columnB:[2017110608 TO 2017110610] ')
    +----+-----+
    |#   |count|
    +----+-----+
    |0   |24830|
    +----+-----+

1 item(s) found. Query executed in 95.189 sec(s).

Steps to reproduce

@wolf4ood
Member

Hi @rdelangh

This is because the query

 SELECT count(rid) FROM INDEX:idx_myindex WHERE key LUCENE ' columnA:2219 AND columnB:[2017110608 TO 2017110610] '

does not load the records from disk; it only uses the Lucene index for counting.

When you use this query:

select count(*) FROM (SELECT expand(rid) FROM INDEX:idx_myindex WHERE key LUCENE ' columnA:2219 AND columnB:[2017110608 TO 2017110610] ')

all 24830 records are loaded.
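The difference between the two queries can be pictured with a small toy model. This is only an illustrative sketch, not OrientDB's implementation: `RecordStore`, `lookup_rids`, `count_from_index` and `count_with_expand` are hypothetical names, and the record ids are made up. It shows that counting on the index never touches record storage, while counting after `expand(rid)` reads one record per match.

```python
# Toy model only: RecordStore and the functions below are hypothetical names,
# not OrientDB APIs. The rids and record contents are made up for illustration.
class RecordStore:
    """Stands in for on-disk record storage and counts how often it is read."""

    def __init__(self, records):
        self._records = records
        self.loads = 0

    def load(self, rid):
        self.loads += 1  # each call models one record read from disk
        return self._records[rid]


def lookup_rids(index, key):
    """Index-only lookup: returns matching rids without touching the store."""
    return index.get(key, [])


def count_from_index(index, key):
    # Analogous to SELECT count(rid) FROM INDEX:...
    return len(lookup_rids(index, key))


def count_with_expand(index, store, key):
    # Analogous to SELECT count(*) FROM (SELECT expand(rid) FROM INDEX:...):
    # every matching record is loaded before it is counted.
    records = [store.load(rid) for rid in lookup_rids(index, key)]
    return len(records)


records = {f"#12:{i}": {"columnA": 2219, "n": i} for i in range(24830)}
index = {2219: list(records)}
store = RecordStore(records)

fast = count_from_index(index, 2219)
loads_after_fast = store.loads
slow = count_with_expand(index, store, 2219)
loads_after_slow = store.loads
print(fast, loads_after_fast, slow, loads_after_slow)
```

Both counts agree, but only the second path performs record loads, one per matching rid.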

wolf4ood self-assigned this Nov 13, 2017
@rdelangh
Author

hi Enrico,
my example was only meant to show the difference in speed when "expand()" must be used to access extra fields from the records (which is a typical requirement, I guess), as opposed to only counting the results (which indeed can be done simply on the index).

@wolf4ood
Member

Hi @rdelangh

Can you attach here the EXPLAIN output of the following query?

select count(*) FROM (SELECT expand(rid) FROM INDEX:idx_myindex WHERE key LUCENE ' columnA:2219 AND columnB:[2017110608 TO 2017110610] ')

@rdelangh
Author

hello @maggiolo00
here you go:
`orientdb {db=mydb}> explain SELECT count(*) FROM ( SELECT expand(rid) FROM INDEX:idx_myindex WHERE key LUCENE ' columnA:2219 AND columnB:[2017103008 TO 2017103010] ')

Profiled command '{projectionElapsed:0,optimizationElapsed:148849,expandElapsed:25,user:#5:0,tips:[1],elapsed:148848.98,resultType:collection,resultSize:1}' in 148.865005 sec(s):
{"@type":"d","@version":0,"projectionElapsed":0,"optimizationElapsed":148849,"expandElapsed":25,"user":"#5:0","tips":["Query 'SELECT expand(rid) FROM INDEX:idx_myindex WHERE key LUCENE ' columnA:2219 AND columnB:[2017103008 TO 2017103010] '' returned a result set with more than 10000 records. Check if you really need all these records, or reduce the resultset by using a LIMIT to improve both performance and used RAM"],"elapsed":148848.98,"resultType":"collection","resultSize":1,"@fieldTypes":"projectionElapsed=l,optimizationElapsed=l,expandElapsed=l,user=x,elapsed=f"}`
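To see which phase accounts for the time in the profile above, the numeric fields can be compared directly. The sketch below copies only those fields from the output (the `tips`, `user` and `@fieldTypes` entries are left out). It assumes the `*Elapsed` values are in milliseconds, which matches the reported 148.865 sec(s) but is not stated in the output itself, and what OrientDB counts under `optimizationElapsed` is not explained in this thread.

```python
import json

# Numeric fields copied from the profile output above.
# Assumption: the values are milliseconds (148848.98 ms is about 148.8 s).
profile = json.loads(
    '{"projectionElapsed": 0, "optimizationElapsed": 148849, '
    '"expandElapsed": 25, "elapsed": 148848.98, "resultSize": 1}'
)

# Keep only the per-phase timers; "elapsed" (lowercase) is the overall total.
phases = {k: v for k, v in profile.items() if k.endswith("Elapsed")}
dominant = max(phases, key=phases.get)
share = phases[dominant] / profile["elapsed"]
print(dominant, round(share, 3))
```

By these numbers almost the entire elapsed time is attributed to `optimizationElapsed`, while `expandElapsed` is only 25.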
