Refactor segmentInfos from IndexReader into its subclasses [LUCENE-986] #2062
Comments
Chris M. Hostetter (@hossman) (migrated from JIRA) one aspect of this that should be considered: It may not make sense for MultiReader to extend MultiSegmentReader ... as Michael says, only subclasses that own the index directory should have segmentInfos, and a MultiReader (as defined on the trunk now) can never own its own directory. I haven't worked through all the implications, but perhaps the most logical refactoring would be...
there would likely be some utility functionality that could be reused between MultiReader and MultiSegmentReader ... possibly as static methods in IndexReader (or a new util class) |
Michael Busch (migrated from JIRA) This is good stuff, Hoss. I like the DirectoryIndexReader idea. I'll work through the code to understand all consequences. |
Michael Busch (migrated from JIRA) What do you think about this alternative approach:
The advantage here is that only one class (MultiSegmentReader) is If we go the DirectoryIndexReader way (where SegmentReader and So I'm not sure which approach is the better one. I'm hoping to get some |
Chris M. Hostetter (@hossman) (migrated from JIRA) i rarely remember a week later what i was thinking when i wrote something, but i suspect that when i suggested the DirectoryIndexReader i was assuming it would have everything directory/lock related that currently exists in the IndexReader base class (including the directoryOwner boolean) ... in cases where there is a single Segment in a directory, there will be a SegmentReader with directoryOwner==true ... in the multi segment cases, the MultiSegmentReader will have directoryOwner==true, and its sub SegmentReaders will all have directoryOwner==false. ... ...the key point of DirectoryIndexReader being that any subclass can own a directory (and automatically inherits all the code for dealing with locks properly when it needs/wants to) but doesn't always have to own the directory. meanwhile MultiReader (and ParallelIndexReader and FilteredIndexReader) make no attempt at owning a directory, and inherit no code for doing so (or for dealing with the locking of such nonexistent directories) I don't really know enough about the performance characteristics of SegmentReader vs a MultiSegmentReader of one segment to have a sense of how possible/practical it would be to eliminate the need for SegmentReader and replace it completely with MultiSegmentReader ... one hitch might be that SegmentReader.get is public, and in order to keep supporting it, SegmentReader still needs to have/inherit the same segment info and directory owning/locking code that we want to move out of IndexReader (so just putting it in MultiSegmentReader won't fly unless we kill that public method) |
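For readers following the thread, the hierarchy described in the comment above can be sketched roughly as follows. This is only an illustration, not the attached patch or the real Lucene source: `Directory` and `SegmentInfos` are reduced to placeholders, the wrapper class `ReaderHierarchySketch` and the methods `ownsDirectory()` and `getSubReaders()` are hypothetical names, and `numDocs()` merely stands in for the real reader API. Lock handling is left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-ins for illustration; NOT the real Lucene classes. */
public class ReaderHierarchySketch {

    /** Placeholder for Lucene's Directory. */
    public static class Directory {
        public final String name;
        public Directory(String name) { this.name = name; }
    }

    /** Placeholder for SegmentInfos: here just a list of segment names. */
    public static class SegmentInfos {
        public final List<String> segments = new ArrayList<>();
    }

    /** Base class: knows nothing about directories or segmentInfos. */
    public abstract static class IndexReader {
        public abstract int numDocs();
    }

    /** Carries the directory-related state; a subclass may or may not own the directory. */
    public abstract static class DirectoryIndexReader extends IndexReader {
        protected final Directory directory;
        protected final SegmentInfos segmentInfos;
        protected final boolean directoryOwner;

        protected DirectoryIndexReader(Directory directory, SegmentInfos segmentInfos,
                                       boolean directoryOwner) {
            this.directory = directory;
            this.segmentInfos = segmentInfos;
            this.directoryOwner = directoryOwner;
        }

        /** Hypothetical accessor, added only for this sketch. */
        public boolean ownsDirectory() { return directoryOwner; }
    }

    /** Reads one segment; owns the directory only when used on its own. */
    public static class SegmentReader extends DirectoryIndexReader {
        private final int docs;

        public SegmentReader(Directory directory, SegmentInfos segmentInfos,
                             boolean directoryOwner, int docs) {
            super(directory, segmentInfos, directoryOwner);
            this.docs = docs;
        }

        @Override public int numDocs() { return docs; }
    }

    /** Owns the directory; its sub SegmentReaders do not. */
    public static class MultiSegmentReader extends DirectoryIndexReader {
        private final List<SegmentReader> subReaders = new ArrayList<>();

        public MultiSegmentReader(Directory directory, SegmentInfos segmentInfos,
                                  int docsPerSegment) {
            super(directory, segmentInfos, true);
            for (int i = 0; i < segmentInfos.segments.size(); i++) {
                subReaders.add(new SegmentReader(directory, segmentInfos, false, docsPerSegment));
            }
        }

        public List<SegmentReader> getSubReaders() { return subReaders; }

        @Override public int numDocs() {
            int total = 0;
            for (SegmentReader r : subReaders) total += r.numDocs();
            return total;
        }
    }

    /** Combines arbitrary readers; extends IndexReader directly and never owns a directory. */
    public static class MultiReader extends IndexReader {
        private final IndexReader[] subReaders;

        public MultiReader(IndexReader... subReaders) { this.subReaders = subReaders.clone(); }

        @Override public int numDocs() {
            int total = 0;
            for (IndexReader r : subReaders) total += r.numDocs();
            return total;
        }
    }

    public static void main(String[] args) {
        SegmentInfos infos = new SegmentInfos();
        infos.segments.add("_0");
        infos.segments.add("_1");
        MultiSegmentReader msr = new MultiSegmentReader(new Directory("index"), infos, 10);
        SegmentReader single =
            new SegmentReader(new Directory("other"), new SegmentInfos(), true, 5);
        MultiReader multi = new MultiReader(msr, single);

        System.out.println(msr.ownsDirectory());
        System.out.println(msr.getSubReaders().get(0).ownsDirectory());
        System.out.println(multi.numDocs());
    }
}
```

In this sketch MultiReader has no directory, segmentInfos, or directoryOwner fields at all, which is the property the comment argues for; whether SegmentReader could be dropped in favour of a one-segment MultiSegmentReader is a separate question the sketch does not address.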
Michael Busch (migrated from JIRA) Here is the patch with the DirectoryIndexReader approach. It moves SegmentInfos and Directory from IndexReader into MultiSegmentReader and SegmentReader extend DirectoryIndexReader. I added the method acquireWriteLock() to IndexReader that does IndexReader is very abstract now and almost all logic moved into All unit tests pass with this patch. |
Michael Busch (migrated from JIRA) > one hitch might be that SegmentReader.get is public, and in order to keep OK, I implemented the DirectoryIndexReader approach. Also because I'm not sure I'd like to commit this rather soon. A review of the patch would be highly |
Chris M. Hostetter (@hossman) (migrated from JIRA) Michael: I've been meaning to look at this, but haven't had the time ... your recent update has goaded me :) just to clarify: the patch you added on September 12th is your latest patch right? ... it's not clear from your comment on the 17th if you intended to attach an update and something went wrong. I ask because i'm having trouble applying the patch from the 12th ... i must be tired because i can't understand why, it doesn't look like the files have changed since you posted the patch, so i'm not sure what it's complaining about ... visually everything seems to match up... hossman@coaster:~/lucene/lucene$ svn update |
Chris M. Hostetter (@hossman) (migrated from JIRA) I got the patch to apply cleanly (see mailing list for details) On the whole it looks really good, i'm attaching an updated version with some minor improvements (mainly javadocs), but first a few questions...
here's the list of tweaks I made...
|
Michael Busch (migrated from JIRA)
> I got the patch to apply cleanly (see mailing list for details)
Thanks, Hoss! I'm using TortoiseSVN, I have to check how to set those
> * just to clarify: IndexReader(Directory) is only around for
Yes, correct.
> * IndexReader() says it can be removed once the other constructor is
Yes, just a suggested simplification. Keeping the constructor wouldn't hurt
> * since one of the goals is that IndexReaders which don't own their
Sounds good, will do...
> * the way TestMultiReader is setup with the "mode" is a bit confusing
Yes, that's cleaner, will make that change as well.
> here's the list of tweaks I made...
The improvements look good to me. |
Michael Busch (migrated from JIRA) In addition to Hoss' changes this patch:
I'm planning to commit this in a day or so if nobody objects. |
Michael Busch (migrated from JIRA) Committed Rev. 577596 |
References to segmentInfos in IndexReader cause different kinds of problems
for subclasses of IndexReader, e.g. MultiReader.
Only subclasses of IndexReader that own the index directory, namely
SegmentReader and MultiSegmentReader, should have a SegmentInfos object
and be able to access it.
Further information:
http://www.gossamer-threads.com/lists/lucene/java-dev/51808
http://www.gossamer-threads.com/lists/lucene/java-user/52460
A part of the refactoring work was already done in #1856
Migrated from LUCENE-986 by Michael Busch, 1 vote, resolved Sep 20 2007
Attachments: lucene-986.patch (versions: 3)
Linked issues: