Allocate docIds in a way that is monotonic with some fast field #29

Closed
fulmicoton opened this issue Sep 20, 2016 · 5 comments

Comments

@fulmicoton
Collaborator

Sorting documents by a given field can open the door to various optimizations:

  • sort by score related metric
  • group by/collapse by a sorted field
  • locality
  • range queries
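
One possible reading of the "range queries" bullet, as a minimal sketch: if docIds are allocated in increasing order of a fast field, the documents matching a range on that field form one contiguous docId interval, which two binary searches can locate. The function name `doc_id_range` and the plain-slice representation of the fast field are assumptions made for illustration; this is not tantivy's API.

```rust
use std::ops::Range;

/// Hypothetical helper (not a tantivy API).
/// `sorted_values[doc_id]` is the fast field value of `doc_id`, and is assumed
/// to be non-decreasing because docIds were allocated in fast-field order.
/// Returns the contiguous docId interval whose values lie in `[lo, hi]`.
fn doc_id_range(sorted_values: &[u64], lo: u64, hi: u64) -> Range<usize> {
    // First docId whose value is >= lo.
    let start = sorted_values.partition_point(|&v| v < lo);
    // First docId whose value is > hi.
    let end = sorted_values.partition_point(|&v| v <= hi);
    start..end
}
```

With insertion-order docIds the same query has to test every candidate document's value, so the gain here comes entirely from the ordering assumption.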
@fulmicoton fulmicoton added this to the 0.2.0 milestone Oct 18, 2016
@fulmicoton fulmicoton modified the milestones: 0.2.1, 0.2.0 Dec 12, 2016
@fulmicoton fulmicoton modified the milestones: 0.4, 0.2.1 Apr 8, 2017
@fulmicoton fulmicoton removed this from the 0.4.0 milestone Jul 14, 2017
@fulmicoton fulmicoton changed the title from "Sort document by a given fast field" to "Sort documents by a given fast field" Aug 14, 2018
@druzn3k

druzn3k commented Oct 8, 2018

I want to work on this (as soon as the other PR is closed, of course). Could you give me some pointers on where to start?

@hwchen

hwchen commented Feb 8, 2019

@fulmicoton can you explain a little how this would differ from the top docs by field collector?

https://docs.rs/tantivy/0.8.1/tantivy/collector/struct.TopDocsByField.html

@fulmicoton
Collaborator Author

@hwchen Sorry my previous message was misleading. I removed it.

Right now documents get an internal document id. This document id is currently defined by the order in which documents are added to the segment.

There are a bunch of optimizations that can be done if someone chooses a better order for these.

For instance, ordering by decreasing page rank makes it possible to get a nice SERP while scoring only the first Q docs.
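
A minimal sketch of that idea, under the assumption that each document carries a single integer rank (standing in for page rank) and that posting lists are plain sorted vectors of docIds. All function names below are hypothetical and are not part of tantivy. Once docIds are allocated by decreasing rank, walking a posting list in docId order visits the best-ranked matches first, so a top-K query can stop after K hits.

```rust
/// Hypothetical helper (not a tantivy API): given one rank per document in
/// insertion order, returns a table mapping old docId -> new docId such that
/// new docIds are allocated by decreasing rank.
fn allocate_doc_ids_by_decreasing_rank(ranks: &[u64]) -> Vec<u32> {
    let mut by_rank: Vec<usize> = (0..ranks.len()).collect();
    // Stable sort: documents with equal rank keep their insertion order.
    by_rank.sort_by(|&a, &b| ranks[b].cmp(&ranks[a]));
    let mut old_to_new = vec![0u32; ranks.len()];
    for (new_id, &old_id) in by_rank.iter().enumerate() {
        old_to_new[old_id] = new_id as u32;
    }
    old_to_new
}

/// Rewrites a posting list expressed in old docIds into new docIds,
/// sorted in increasing order as a posting list normally is.
fn remap_posting_list(old_ids: &[u32], old_to_new: &[u32]) -> Vec<u32> {
    let mut new_ids: Vec<u32> = old_ids
        .iter()
        .map(|&old| old_to_new[old as usize])
        .collect();
    new_ids.sort_unstable();
    new_ids
}

/// Increasing docId now means decreasing rank, so the top `k` matches
/// are simply the first `k` entries; the rest are never visited.
fn top_k_by_rank(posting_list: &[u32], k: usize) -> &[u32] {
    &posting_list[..k.min(posting_list.len())]
}
```

For ranks `[10, 50, 30, 40]`, the document inserted second (rank 50) receives docId 0 and the one inserted first (rank 10) receives docId 3.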

This is a very interesting feature for many businesses, but let's not consider it until somebody actually has the need and wants to use tantivy.

@fulmicoton fulmicoton changed the title from "Sort documents by a given fast field" to "Allocate docIds in a way that is monotonic with some fast field" Feb 9, 2021
@fulmicoton fulmicoton added qw and removed low priority labels Feb 9, 2021
@shikhar
Collaborator

shikhar commented Jun 28, 2021

Is this a dupe of #1014, which got closed with #1026? cc @PSeitz

@fulmicoton
Collaborator Author

Correct. Thank you @shikhar
