[PATCH] improve searching under high concurrency [LUCENE-1337] #2414

asfimport · 2008-07-16T17:23:58Z

I was trying to load test my web server and kept running into a condition where the web server would become unresponsive even though the load was below one. It turns out Lucene has synchronized blocks around reading the index. It appears this was only necessary to synchronize access to a descriptor, which contains a RandomAccessFile and information about the state of that file. My solution was to use a pool of descriptors so that they can be reused on subsequent reads. During periods of low contention only one or a few Descriptors are created, but under heavy load many Descriptors can be created to avoid synchronization. After creating and applying my patch, I was able to triple my search throughput and fully utilize the machine's resources, with the CPUs becoming the new bottleneck. My patch modifies FSDirectory directly, but I'm not entirely sure that's the proper implementation. I'd like to help resolve this synchronization issue for other Lucene users, so please let me know how I can help.
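
To make the pooling idea concrete, here is a minimal sketch in modern Java. It is not the code in lucene.patch: the class name `DescriptorPool` and its methods are hypothetical, and it assumes a plain read-only file rather than Lucene's actual descriptor class. Each read borrows an idle RandomAccessFile, or opens a new one if none is idle, so concurrent readers never wait on a shared lock.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of a descriptor pool; not the patch itself.
final class DescriptorPool implements AutoCloseable {
    private final File file;
    // Idle descriptors; the lock-free queue lets threads borrow without blocking.
    private final ConcurrentLinkedQueue<RandomAccessFile> idle = new ConcurrentLinkedQueue<>();
    private final AtomicInteger opened = new AtomicInteger();

    DescriptorPool(File file) {
        this.file = file;
    }

    // Reads up to length bytes starting at the given file position.
    int read(long position, byte[] dst, int offset, int length) throws IOException {
        RandomAccessFile raf = idle.poll();
        if (raf == null) {
            // Nothing idle: open another descriptor instead of waiting.
            raf = new RandomAccessFile(file, "r");
            opened.incrementAndGet();
        }
        try {
            raf.seek(position);
            return raf.read(dst, offset, length);
        } finally {
            // Return the descriptor so a later read can reuse it.
            idle.offer(raf);
        }
    }

    // Number of descriptors opened so far (grows only under contention).
    int openedCount() {
        return opened.get();
    }

    @Override
    public void close() throws IOException {
        RandomAccessFile raf;
        while ((raf = idle.poll()) != null) {
            raf.close();
        }
    }
}
```

This sketch places no upper bound on the number of open descriptors, so a real implementation would likely need a cap to stay within the operating system's file descriptor limit.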

Migrated from LUCENE-1337 by Brian Gardner, resolved Jul 17 2008
Environment: Linux

Attachments: lucene.patch

asfimport · 2008-07-16T17:32:06Z

Brian Gardner (migrated from JIRA)

This patch applies to version 2.3.1

asfimport · 2008-07-16T17:37:29Z

Yonik Seeley (@yonik) (migrated from JIRA)

Thanks Brian, also see #1828 for more history and a bunch of options.

asfimport · 2008-07-17T17:40:19Z

Michael McCandless (@mikemccand) (migrated from JIRA)

Duplicate of #1828.

asfimport · 2008-07-17T18:13:17Z

Jason Rutherglen (migrated from JIRA)

The problem is the same, but the solution is not. Does each need a separate patch that lists more specifically how it solves the problem? Each solution has pluses and minuses: NIOFSDirectory doesn't work on Windows, and DescriptorsFSDirectory will quickly max out the file descriptors on many Lucene installations.

I would like to see both committed to trunk. MMapDirectory is in the trunk and has limitations as well, mainly that (at least as I understand it) it loads all the files into RAM.

asfimport · 2008-07-17T18:41:02Z

Michael McCandless (@mikemccand) (migrated from JIRA)

Jason, are you thinking of #1492 (NIOFSDirectory)?

asfimport · 2008-07-17T20:09:41Z

Jason Rutherglen (migrated from JIRA)

Yonik checked in a modification of FSDirectory into #1828. I took that code and made NIOFSDirectory which is standalone so that it can be committed. It is checked into #1828 as lucene-753.patch.

asfimport · 2008-07-19T09:58:23Z

Michael McCandless (@mikemccand) (migrated from JIRA)

> Yonik checked in a modification of FSDirectory into #1828. I took that code and made NIOFSDirectory which is standalone so that it can be committed. It is checked into #1828 as lucene-753.patch.

OK. I think it's a good idea to separately offer an FSDirectory implementation that uses positional reads (via FileChannel) to avoid synchronization.
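
For readers unfamiliar with positional reads, a minimal sketch follows. It is not the NIOFSDirectory code from lucene-753.patch: the class name `PositionalReader` is hypothetical, and it uses the modern `FileChannel.open` API for brevity. The point is that `FileChannel.read(ByteBuffer, long)` takes an absolute position and leaves the channel's own position untouched, so threads can share one channel without a seek-then-read critical section.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration of lock-free positional reads on a shared channel.
final class PositionalReader implements AutoCloseable {
    private final FileChannel channel;

    PositionalReader(Path path) throws IOException {
        this.channel = FileChannel.open(path, StandardOpenOption.READ);
    }

    // Fills dst[offset, offset + length) starting at the absolute file position.
    // No synchronized block: the read does not touch shared position state.
    void readFully(long position, byte[] dst, int offset, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.wrap(dst, offset, length);
        long pos = position;
        while (buf.hasRemaining()) {
            int n = channel.read(buf, pos);
            if (n < 0) {
                throw new EOFException("read past EOF at position " + pos);
            }
            pos += n;
        }
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```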

I'd also like to somehow make that implementation the default on those platforms (all except Windows?) where there are clear concurrency gains. I.e., maybe change FSDirectory.getDirectory to return NIOFSDirectory if it's not on Windows, but also offer a getDirectory that takes the IMPL so you can force it to pick a different IMPL. In general I think Lucene should default to good out-of-the-box performance, i.e., without requiring special knowledge or tuning on the user's part, so long as there's no difficult tradeoff.
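
A rough sketch of that selection logic, purely as an illustration: the names `DirectoryChooser`, `Impl`, `defaultFor`, and `choose` are hypothetical and are not part of FSDirectory, and the check on the `os.name` system property is an assumption about how "not on Windows" might be detected.

```java
import java.util.Locale;

// Hypothetical illustration of a platform-based default with an explicit override.
final class DirectoryChooser {
    // Stand-ins for the existing synchronized implementation and the NIO one.
    enum Impl { SIMPLE, NIO }

    // Picks the default for a given OS name: NIO everywhere except Windows.
    static Impl defaultFor(String osName) {
        boolean windows = osName.toLowerCase(Locale.ROOT).startsWith("windows");
        return windows ? Impl.SIMPLE : Impl.NIO;
    }

    // A non-null argument forces that implementation; null means "use the default".
    static Impl choose(Impl forced) {
        return forced != null ? forced : defaultFor(System.getProperty("os.name", ""));
    }
}
```

With this shape, callers who know their workload can still pass an explicit implementation, while everyone else gets the platform default.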

We should probably change the name to something less generic than "nio", though I can't think of an alternative offhand.

But one question: it looks like NIOFSIndexInput copies most of BufferedIndexInput source rather than subclassing – why was that? Can we change that back to a subclass, perhaps opening up members of BufferedIndexInput a bit if necessary?

asfimport closed this as completed Jul 17, 2008

asfimport mentioned this issue Aug 24, 2022

Use NIO positional read to avoid synchronization in FSIndexInput [LUCENE-753] #1828

Closed
