lazy-init LZ4-HC hashtable in blocktreewriter [LUCENE-9816] #10855

Closed
asfimport opened this issue Feb 28, 2021 · 7 comments

Based upon the data for a field, blocktree may compress with LZ4-HC (or with simple lowercase compression, or none at all).

But we currently eagerly initialize the HC hashtable (132k) for each field, regardless of whether it will even be "tried". This shows up as a top CPU and heap hotspot when profiling tests, and it creates unnecessary overhead for small flushes.


Migrated from LUCENE-9816 by Robert Muir (@rmuir), resolved Feb 28 2021
Attachments: LUCENE-9816.patch
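
For illustration, here is a minimal sketch of the lazy-init pattern being proposed. The class and method names are hypothetical, not the actual BlockTreeTermsWriter code; `LZ4.HighCompressionHashTable` and `LZ4.compress` are assumed to come from Lucene's `org.apache.lucene.util.compress` package and their exact signatures may differ by version.

```java
import java.io.IOException;

import org.apache.lucene.store.DataOutput;
import org.apache.lucene.util.compress.LZ4;

// Hypothetical writer-side helper, for illustration only.
class SuffixCompressor {

  // Eager version: the table is allocated (and filled with sentinel values)
  // once per field, even when LZ4-HC is never attempted:
  //   private final LZ4.HighCompressionHashTable ht = new LZ4.HighCompressionHashTable();

  // Lazy version: stays null until LZ4-HC is actually tried for this field.
  private LZ4.HighCompressionHashTable ht;

  void compressWithLZ4HC(byte[] suffixes, int len, DataOutput out) throws IOException {
    if (ht == null) {
      // Pay the allocation/fill cost only on first real use.
      ht = new LZ4.HighCompressionHashTable();
    }
    LZ4.compress(suffixes, 0, len, out, ht);
  }
}
```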

Robert Muir (@rmuir) (migrated from JIRA)

Attached a patch; I don't see the LZ4-HC stuff when profiling tests anymore. I think the logic @jpountz added here is enough to avoid even trying to use this compression when flushing if it stands no chance of helping.

Michael McCandless (@mikemccand) (migrated from JIRA)

Whoa, thank you JFR!  I love how simple the patch is – just moving to lazy init.  +1

And it is disappointing this LZ4 hash table allocates 128 KB up front like that!?

Adrien Grand (@jpountz) (migrated from JIRA)

+1 to the patch

@mikemccand This is due to how the algorithm looks for duplicates: it stores a large hash table that maps 4-byte sequences to offsets in the input.

The high-speed variant uses packed ints to lower memory usage on short inputs, but maybe we could be smarter with the high-compression variant: since it records not only the last offset for every hash (like the high-speed variant does) but also the previous ones within a 64k window, a strong hash function only makes compression faster, not more effective. So maybe we could adapt the number of bits of the hash function to the size of the input, reducing memory usage without hurting compression ratios and hopefully without hurting compression speed.
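
As a rough sketch of that sizing idea (the constants and the hash function here are made up for illustration and do not mirror Lucene's actual LZ4 code), the number of hash bits could be derived from the input length so that short inputs get a proportionally small table:

```java
import java.util.Arrays;

// Illustrative only: derive the hash table size from the input length.
final class SizedHashTable {
  private static final int MIN_HASH_BITS = 10; // 1K-entry floor (assumption)
  private static final int MAX_HASH_BITS = 15; // 32K entries, i.e. a "full size" table (assumption)

  final int hashBits;
  final int[] head;    // most recent offset seen for each hash bucket
  final short[] chain; // earlier occurrences, one slot per position in the (capped) 64k window

  SizedHashTable(int inputLen) {
    int bits = 32 - Integer.numberOfLeadingZeros(Math.max(1, inputLen - 1)); // ~ceil(log2(inputLen))
    this.hashBits = Math.max(MIN_HASH_BITS, Math.min(MAX_HASH_BITS, bits));
    this.head = new int[1 << hashBits];
    this.chain = new short[1 << Math.min(16, bits)];
    Arrays.fill(head, -1); // sentinel: bucket not seen yet
  }

  int hash(int fourByteSequence) {
    // Multiplicative hash keeping the top hashBits bits; with chaining, a
    // weaker hash only costs speed (longer chains), not compression ratio.
    return (fourByteSequence * 0x9E3779B1) >>> (32 - hashBits);
  }
}
```

In this sketch, a 4 KB input would get roughly 16 KB of head plus 8 KB of chain instead of the current ~256 KB, at the cost of longer chain traversals on hash collisions.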

Robert Muir (@rmuir) (migrated from JIRA)

The patch takes care of the test issue (and maybe some similarly extreme use-cases). The tests (and maybe some users) just like to index a lot of fields, and to my knowledge this is the only place in the default codec where we use LZ4-HC.

Due to the way we write (one field at a time), maybe it doesn't help to super-optimize the hashtable's memory usage? For example, it does not show up in luceneutil benchmarks at all.

ASF subversion and git services (migrated from JIRA)

Commit dade99c in lucene-solr's branch refs/heads/master from Robert Muir
https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=dade99c

LUCENE-9816: lazy-init LZ4-HC hashtable in BlockTreeTermsWriter

LZ4-HC hashtable is heavy (128kb int[] + 128kb short[]) and must be
filled with special values on initialization. This is a lot of overhead
for fields that might not use the compression at all.

Don't initialize this for a field until we see hints that the data might
be compressible and need to use the table in order to test it out.

Michael McCandless (@mikemccand) (migrated from JIRA)

> @mikemccand This is due to how the algorithm looks for duplicates: it stores a large hash table that maps 4-byte sequences to offsets in the input.

+1, thanks for the explanation and musings about how we might further optimize it @jpountz!

Adrien Grand (@jpountz) (migrated from JIRA)

Closing after the 9.0.0 release
