Lucene Change Log
For more information on past and future Lucene versions, please see:
http://s.apache.org/luceneversions
======================= Lucene 10.0.0 =======================
API Changes
---------------------
* LUCENE-12092: Remove deprecated UTF8TaxonomyWriterCache. Please use LruTaxonomyWriterCache
instead. (Vigya Sharma)
* LUCENE-10010: AutomatonQuery, CompiledAutomaton, RunAutomaton, RegExp
classes no longer determinize NFAs. Instead it is the responsibility
of the caller to determinize. (Robert Muir)
* LUCENE-10368: IntTaxonomyFacets has been made pkg-private and serves only as an internal
implementation detail of taxonomy-faceting. (Greg Miller)
* LUCENE-10400: Remove deprecated dictionary constructors in Kuromoji and Nori (Tomoko Uchida)
* LUCENE-10440: TaxonomyFacets and FloatTaxonomyFacets have been made pkg-private and only serve
as internal implementation details of taxonomy-faceting. (Greg Miller)
* LUCENE-10431: MultiTermQuery.setRewriteMethod() has been removed. (Alan Woodward)
* LUCENE-10436: Remove deprecated DocValuesFieldExistsQuery, NormsFieldExistsQuery and
KnnVectorFieldExistsQuery. (Zach Chen, Adrien Grand)
* LUCENE-10561: Reduce class/member visibility of all normalizer and stemmer classes. (Rushabh Shah)
* LUCENE-10266: Move nearest-neighbor search on points to core. (Rushabh Shah)
* LUCENE-10603: Remove SortedSetDocValues#NO_MORE_ORDS definition. (Greg Miller)
* GITHUB#11813: Remove Operations.isFinite: the recursive implementation could be problematic
for large automatons (WildcardQuery, PrefixQuery, RegExpQuery, etc). (taroplus, Robert Muir)
* GITHUB#11840: Query rewrite now takes an IndexSearcher instead of IndexReader to enable concurrent
rewriting. (Patrick Zhai)
* GITHUB#11933: Remove IOContext from Directory#openChecksumInput. (Zach Chen)
* GITHUB#11814: Support deletions in IndexRearranger. (Stefan Vodita)
* GITHUB#12107: Remove deprecated KnnVectorField, KnnVectorQuery, VectorValues and
LeafReader#getVectorValues. (Luca Cavanna)
* GITHUB#12296: Make IndexReader and IndexReaderContext classes explicitly sealed.
They have already been runtime-checked to only be implemented by the specific classes
so this is effectively a non-breaking change.
* GITHUB#12276: Rename DaciukMihovAutomatonBuilder to StringsToAutomaton
* GITHUB#12321: Reduced visibility of StringsToAutomaton. Please use Automata#makeStringUnion instead. (Greg Miller)
New Features
---------------------
* LUCENE-10010: Introduce NFARunAutomaton to run NFA directly. (Patrick Zhai)
* LUCENE-10626 Hunspell: add tools to aid dictionary editing:
analysis introspection, stem expansion and stem/flag suggestion (Peter Gromov)
Improvements
---------------------
* LUCENE-10416: Update Korean Dictionary to mecab-ko-dic-2.1.1-20180720 for Nori.
(Uihyun Kim)
* LUCENE-10614: Properly support getTopChildren in RangeFacetCounts. (Yuting Gan)
* LUCENE-10652: Add a top-n range faceting example to RangeFacetsExample. (Yuting Gan)
Optimizations
---------------------
* GITHUB#11857, GITHUB#11859, GITHUB#11893, GITHUB#11909: Hunspell: improved suggestion performance (Peter Gromov)
Bug Fixes
---------------------
* LUCENE-10599: LogMergePolicy is more likely to keep merging segments until
they reach the maximum merge size. (Adrien Grand)
* GITHUB#12220: Hunspell: disallow hidden title-case entries from compound middle/end
Other
---------------------
* LUCENE-10376: Roll up the loop in VInt/VLong in DataInput. (Guo Feng)
* LUCENE-10283: The minimum required Java version was bumped from 11 to 17.
(Adrien Grand, Uwe Schindler, Dawid Weiss, Robert Muir)
* LUCENE-10253: The @BadApple annotation has been removed from the test
framework. (Adrien Grand)
* LUCENE-10393: Unify binary dictionary and dictionary writer in Kuromoji and Nori.
(Tomoko Uchida, Robert Muir)
* LUCENE-10475: Merge dictionary builders in `util` package into `dict` package in Kuromoji and Nori.
All classes in `org.apache.lucene.analysis.[ja|ko].util` were moved to `org.apache.lucene.analysis.[ja|ko].dict`.
(Tomoko Uchida)
* LUCENE-10493: Factor out Viterbi algorithm in Kuromoji and Nori to analysis-common. (Tomoko Uchida)
* GITHUB#977, LUCENE-9500: Remove the deflater hack introduced because of JDK-8252739 (Uwe Schindler)
* GITHUB#11960: Hunspell: supported empty dictionaries (Peter Gromov)
* GITHUB#12239: Hunspell: reduced suggestion set dependency on the hash table order (Peter Gromov)
======================== Lucene 9.7.0 =======================
API Changes
---------------------
* GITHUB#11840, GITHUB#12304: Query rewrite now takes an IndexSearcher instead of
IndexReader to enable concurrent rewriting. Please note: This is implemented in
a backwards compatible way. A query overriding either of the two rewrite methods is
supported. To implement this backwards-compatibility layer in Lucene 9.x, the
RuntimePermission "accessDeclaredMembers" is needed in applications using
SecurityManager. (Patrick Zhai, Ben Trent, Uwe Schindler)
* GITHUB#12321: DaciukMihovAutomatonBuilder has been marked deprecated in preparation for reducing its visibility in
a future release. (Greg Miller)
* GITHUB#12268: Add BitSet.clear() without parameters for clearing the entire set
(Jonathan Ellis)
* GITHUB#12346: Add new IndexWriter#updateDocuments(Query, Iterable<Document>) API
to update documents atomically, with respect to refresh and commit, using a query. (Patrick Zhai)
New Features
---------------------
* GITHUB#12257: Create OnHeapHnswGraphSearcher to let OnHeapHnswGraph be searched in a thread-safe manner. (Patrick Zhai)
* GITHUB#12302, GITHUB#12311: Add vectorized implementations of VectorUtil.dotProduct(),
squareDistance(), cosine() with Java 20 jdk.incubator.vector APIs. Applications started
with command line parameter "java --add-modules jdk.incubator.vector" on exactly Java 20
will automatically use the new vectorized implementations if running on a supported platform
(x86 AVX2 or later, ARM NEON). This is an opt-in feature and requires an explicit Java
command line flag! When enabled, Lucene logs a notice using java.util.logging. Please test
thoroughly and report bugs/slowness to Lucene's mailing list.
(Chris Hegarty, Robert Muir, Uwe Schindler)
Improvements
---------------------
* GITHUB#12245: Add support for Score Mode to `ToParentBlockJoinQuery` explain. (Marcus Eagan via Mikhail Khludnev)
* GITHUB#12305: Minor cleanup and improvements to DaciukMihovAutomatonBuilder. (Greg Miller)
* GITHUB#12325: Parallelize AbstractKnnVectorQuery rewrite across slices rather than segments. (Luca Cavanna)
* GITHUB#12333: NumericLeafComparator#competitiveIterator makes better use of a "search after" value when paginating.
(Chaitanya Gohel)
* GITHUB#12290: Make memory fence in ByteBufferGuard explicit using `VarHandle.fullFence()`
* GITHUB#12320: Add "direct to binary" option for DaciukMihovAutomatonBuilder and use it in TermInSetQuery#visit.
(Greg Miller)
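As an illustration of the explicit fence mentioned in GITHUB#12290, here is a minimal sketch of a guard that publishes an "invalidated" flag and then issues `VarHandle.fullFence()`. The class `GuardSketch` and its methods are hypothetical names for this note and are not the actual ByteBufferGuard code; the only real API used is the JDK's `java.lang.invoke.VarHandle.fullFence()`.

```java
import java.lang.invoke.VarHandle;

/** Hypothetical sketch, not Lucene's ByteBufferGuard. */
final class GuardSketch {
  // Deliberately not volatile so that the hot-path check stays a plain read.
  private boolean invalidated;

  /** Marks the guarded resource as closed, then issues an explicit full fence. */
  void invalidate() {
    invalidated = true;
    // The fence keeps the store above from being reordered with memory
    // operations that follow it (for example, releasing the resource).
    VarHandle.fullFence();
  }

  /** Best-effort check that the resource is still usable. */
  void ensureValid() {
    if (invalidated) {
      throw new IllegalStateException("already closed");
    }
  }

  public static void main(String[] args) {
    GuardSketch guard = new GuardSketch();
    guard.ensureValid(); // passes while the guard is still valid
    guard.invalidate();
    try {
      guard.ensureValid();
    } catch (IllegalStateException e) {
      System.out.println("rejected after invalidate: " + e.getMessage());
    }
  }
}
```

Because the flag is read without synchronization, a check like this is best effort only for other threads; the sketch shows where the fence sits, not a full safety argument.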
Optimizations
---------------------
* GITHUB#12270: Don't generate stacktrace in CollectionTerminatedException. (Armin Braun)
* GITHUB#12160: Concurrent rewrite for AbstractKnnVectorQuery. (Kaival Parikh)
* GITHUB#12286: Toposort uses an iterator to avoid stack overflow. (Tang Donghai)
* GITHUB#12235: Optimize HNSW diversity calculation. (Patrick Zhai)
* GITHUB#12328: Optimize ConjunctionDISI.createConjunction (Armin Braun)
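The idea behind GITHUB#12286 (replacing recursion with explicit iterators so deep graphs cannot overflow the call stack) can be sketched as follows. This is a generic depth-first topological sort over an adjacency list, assuming an acyclic graph; `IterativeTopoSort` and its `sort` method are hypothetical names and do not mirror Lucene's automaton code.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical helper, not a Lucene class: DFS topological sort without recursion. */
final class IterativeTopoSort {

  /**
   * Returns the nodes reachable from {@code start} in topological order.
   * {@code adj.get(n)} lists the successors of node n; the graph is assumed acyclic.
   */
  static List<Integer> sort(List<List<Integer>> adj, int start) {
    boolean[] visited = new boolean[adj.size()];
    List<Integer> postOrder = new ArrayList<>();
    // A node plus an iterator over its remaining successors stands in for
    // one frame of the recursive call stack.
    Deque<Integer> nodes = new ArrayDeque<>();
    Deque<Iterator<Integer>> iterators = new ArrayDeque<>();
    visited[start] = true;
    nodes.push(start);
    iterators.push(adj.get(start).iterator());
    while (!nodes.isEmpty()) {
      Iterator<Integer> it = iterators.peek();
      if (it.hasNext()) {
        int next = it.next();
        if (!visited[next]) {
          visited[next] = true;
          nodes.push(next);
          iterators.push(adj.get(next).iterator());
        }
      } else {
        // Every successor has been handled: emit the node in post-order.
        postOrder.add(nodes.pop());
        iterators.pop();
      }
    }
    // Reversed post-order of a DFS is a topological order.
    Collections.reverse(postOrder);
    return postOrder;
  }

  public static void main(String[] args) {
    List<List<Integer>> adj =
        List.of(List.of(1, 2), List.of(3), List.of(3), List.<Integer>of());
    System.out.println(sort(adj, 0));
  }
}
```

The heap-allocated deques grow with the depth of the graph, so a chain of several hundred thousand states is limited by heap size rather than by thread stack size.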
Bug Fixes
---------------------
* GITHUB#12291: Skip blank lines from stopwords list. (Jerry Chin)
Other
---------------------
(No changes)
======================== Lucene 9.6.0 =======================
API Changes
---------------------
* GITHUB#12116: Introduce IndexableField#storedValue() to expose the value that
should be stored to IndexingChain without needing to guess the field's type.
(Adrien Grand, Robert Muir)
* GITHUB#12129: Move DocValuesTermsQuery from sandbox to SortedDocValuesField#newSlowSetQuery
and SortedSetDocValuesField#newSlowSetQuery. (Robert Muir)
* GITHUB#12173: TermInSetQuery#getTermData has been deprecated. This exposes internal implementation details that we
may want to change in the future, and users shouldn't rely on the encoding directly. (Greg Miller)
* GITHUB#11746: Deprecate LongValueFacetCounts#getTopChildrenSortByCount. (Greg Miller)
New Features
---------------------
* GITHUB#12054: Introduce a new KeywordField for simple and efficient
filtering, sorting and faceting. (Adrien Grand)
* GITHUB#12188: Add support for Java 20 foreign memory API. If exactly Java 19
or 20 is used, MMapDirectory will mmap Lucene indexes in chunks of 16 GiB
(instead of 1 GiB) and indexes closed while queries are running can no longer
crash the JVM. To disable this feature, pass the following sysprop on Java command line:
"-Dorg.apache.lucene.store.MMapDirectory.enableMemorySegments=false" (Uwe Schindler)
* GITHUB#12169: Introduce a new token filter to expand synonyms based on Word2Vec DL4j models. (Daniele Antuzi, Ilaria Petreti, Alessandro Benedetti)
Improvements
---------------------
* GITHUB#12055: MultiTermQuery#CONSTANT_SCORE_BLENDED_REWRITE rewrite method introduced and used as the new default
for multi-term queries with a FILTER rewrite (PrefixQuery, WildcardQuery, TermRangeQuery). This introduces better
skipping support for common use-cases. (Adrien Grand, Greg Miller)
* GITHUB#12156: TermInSetQuery now extends MultiTermQuery instead of providing its own custom implementation (which
was essentially a clone of MultiTermQuery#CONSTANT_SCORE_REWRITE). It uses the new CONSTANT_SCORE_BLENDED_REWRITE
by default, but can be overridden through the constructor. (Greg Miller)
* GITHUB#12175: Remove SortedSetDocValuesSetQuery in favor of TermInSetQuery with DocValuesRewriteMethod. (Greg Miller)
* GITHUB#12166: Remove the now unused class pointInPolygon. (Marcus Eagan via Christine Poerschke and Nick Knize)
* GITHUB#12126: Refactor part of IndexFileDeleter and ReplicaFileDeleter into a public common utility class
FileDeleter. (Patrick Zhai)
Optimizations
---------------------
* GITHUB#11900: BloomFilteringPostingsFormat now uses multiple hash functions
in order to achieve the same false positive probability with less memory.
(Jean-François Boeuf)
* GITHUB#12118: Optimize FeatureQuery to TermQuery & weight when scoring is not required. (Ben Trent, Robert Muir)
* GITHUB#12128, GITHUB#12133: Speed up docvalues set query by making use of sortedness. (Robert Muir, Uwe Schindler)
* GITHUB#12050: Reuse HNSW graph for initialization during merge (Jack Mazanec)
* GITHUB#12155: Speed up DocValuesRewriteMethod by making use of sortedness. (Greg Miller)
* GITHUB#12139: Faster indexing of string fields. (Adrien Grand)
* GITHUB#12179: Better PostingsEnum reuse in MultiTermQueryConstantScoreBlendedWrapper. (Greg Miller)
* GITHUB#12198, GITHUB#12199: Reduced contention when indexing with many threads. (Adrien Grand)
* GITHUB#12241: Add ordering of files in compound files. (Christoph Büscher)
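To illustrate the multiple-hash-function idea from GITHUB#11900, here is a minimal, generic Bloom filter sketch. It derives k bit positions per term from two base hashes (double hashing), which is a common textbook construction swapped in for illustration and not necessarily what BloomFilteringPostingsFormat does. The class name, the seeds and the FNV-1a-style hash are all assumptions of this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Hypothetical sketch of a Bloom filter with several hash functions; not Lucene code. */
final class MultiHashBloomFilter {
  private final BitSet bits;
  private final int numBits;
  private final int numHashes;

  MultiHashBloomFilter(int numBits, int numHashes) {
    if (numBits <= 0 || numHashes <= 0) {
      throw new IllegalArgumentException("numBits and numHashes must be positive");
    }
    this.bits = new BitSet(numBits);
    this.numBits = numBits;
    this.numHashes = numHashes;
  }

  /** Sets one bit per hash function for the given term. */
  void add(String term) {
    byte[] data = term.getBytes(StandardCharsets.UTF_8);
    int h1 = hash(data, 0x9747b28c);
    int h2 = hash(data, 0x5bd1e995) | 1; // odd step so the probes differ
    for (int i = 0; i < numHashes; i++) {
      bits.set(index(h1, h2, i));
    }
  }

  /** Returns false only if the term was definitely never added. */
  boolean mightContain(String term) {
    byte[] data = term.getBytes(StandardCharsets.UTF_8);
    int h1 = hash(data, 0x9747b28c);
    int h2 = hash(data, 0x5bd1e995) | 1;
    for (int i = 0; i < numHashes; i++) {
      if (!bits.get(index(h1, h2, i))) {
        return false;
      }
    }
    return true;
  }

  // The i-th hash is h1 + i * h2, reduced to a valid bit index.
  private int index(int h1, int h2, int i) {
    return Math.floorMod(h1 + i * h2, numBits);
  }

  // Seeded FNV-1a-style hash, chosen here only because it is short.
  private static int hash(byte[] data, int seed) {
    int h = 0x811c9dc5 ^ seed;
    for (byte b : data) {
      h ^= (b & 0xff);
      h *= 0x01000193;
    }
    return h;
  }

  public static void main(String[] args) {
    MultiHashBloomFilter filter = new MultiHashBloomFilter(1 << 16, 4);
    filter.add("lucene");
    System.out.println(filter.mightContain("lucene"));
  }
}
```

A lookup only reports a possible match when all k bits are set, which is why, for a suitable k, a filter of the same size yields fewer false positives than one using a single hash.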
Bug Fixes
---------------------
* GITHUB#12158: KeywordField#newSetQuery should clone input BytesRef[] to avoid modifying provided array. (Greg Miller)
* GITHUB#12196: Fix MultiFieldQueryParser to handle both query boost and phrase slop at the same time. (Jasir KT)
* GITHUB#12202: Fix MultiFieldQueryParser to apply boosts to regexp, wildcard, prefix, range, fuzzy queries. (Jasir KT)
* GITHUB#12178: Add explanations for TermAutomatonQuery (Marcus Eagan via Patrick Zhai, Mike McCandless, Robert Muir, Mikhail Khludnev)
* GITHUB#12214: Fix ordered intervals query to avoid skipping some of the results over interleaved terms. (Hongyu Yan)
* GITHUB#12212: Bug fix for a DrillSideways issue where matching hits could occasionally be missed. (Frederic Thevenet)
* GITHUB#12220: Hunspell: disallow hidden title-case entries from compound middle/end (Peter Gromov)
* GITHUB#12260: Fix SynonymQuery equals implementation to take the targeted field name into account (Luca Cavanna)
Build
---------------------
* GITHUB#12131: Generate gradle.properties from gradlew, if absent (Colvin Cowie, Uwe Schindler)
* GITHUB#12188: Building the lucene-core MR-JAR file is now possible without installing
additionally required Java versions (Java 19, Java 20,...). For compilation, a special
JAR file with Panama-foreign API signatures of each supported Java version was added to
the source tree. Those can be regenerated on demand with "gradlew :lucene:core:regenerate".
(Uwe Schindler)
* GITHUB#12215: Upgrade forbiddenapis to version 3.5. This tones down some verbose warnings
printed while checking Java 19 and Java 20 sourcesets for the MR-JAR. (Uwe Schindler)
Documentation
---------------------
* GITHUB#10633: Update javadocs in TestBackwardsCompatibility to use gradle and not ant. (Usman Shaikh)
Other
---------------------
* GITHUB#11868: Add a FilterIndexInput and FilterIndexOutput class to more easily and safely create delegate
IndexInput and IndexOutput classes (Marc D'Mello)
* GITHUB#12239: Hunspell: reduced suggestion set dependency on the hash table order (Peter Gromov)
======================== Lucene 9.5.0 =======================
API Changes
---------------------
* GITHUB#12093: Deprecate support for UTF8TaxonomyWriterCache and changed default to LruTaxonomyWriterCache.
Please use LruTaxonomyWriterCache instead. (Vigya Sharma)
* GITHUB#11998: Add new stored fields and termvectors interfaces: IndexReader.storedFields()
and IndexReader.termVectors(). Deprecate IndexReader.document() and IndexReader.getTermVector().
The new APIs do not rely upon ThreadLocal storage for each index segment, which can greatly
reduce RAM requirements when there are many threads and/or segments.
(Adrien Grand, Robert Muir)
* GITHUB#11742: MatchingFacetSetsCounts#getTopChildren now properly returns "top" children instead
of all children. (Greg Miller)
* GITHUB#11772: Removed native subproject and WindowsDirectory implementation from lucene.misc. Recommendation:
use MMapDirectory implementation on Windows. (Robert Muir, Uwe Schindler, Dawid Weiss)
* GITHUB#11804: FacetsCollector#collect is no longer final, allowing extension. (Greg Miller)
* GITHUB#11761: TieredMergePolicy now allows a maximum allowable deletes percentage as low as 5%, and the default
maximum allowable deletes percentage is changed from 33% to 20%. (Marc D'Mello)
* GITHUB#11822: Configure replicator PrimaryNode replica shutdown timeout. (Steven Schlansker)
* GITHUB#11930: Added IOContext#LOAD for files that are a small fraction of the
total index size and heavily accessed with a random access pattern. Some
Directory implementations may choose to load files that use this IOContext in
memory to provide stronger guarantees on query latency.
(Adrien Grand, Uwe Schindler)
* GITHUB#11941: QueryBuilder#add and #newSynonymQuery methods now take a `field` parameter,
to avoid possible exceptions when building queries from an empty term list. The helper
TermAndBoost class now holds a BytesRef rather than a Term. (Alan Woodward)
* GITHUB#11961: VectorValues#EMPTY was removed as this instance was not
necessary and also illegal as it reported a number of dimensions equal to
zero. (Adrien Grand)
* GITHUB#11962: VectorValues#cost() now delegates to VectorValues#size().
(Adrien Grand)
* GITHUB#11984: Improved TimeLimitBulkScorer to check the timeout at an exponential rate.
(Costin Leau)
* GITHUB#12004: Add new KnnByteVectorQuery for querying vector fields that are encoded as BYTE. Removes the ability to
use KnnVectorQuery against fields encoded as BYTE (Ben Trent)
* GITHUB#11997: Introduce IntField, LongField, FloatField and DoubleField.
These new fields index both 1D points and sorted numeric doc values and
provide best performance for filtering and sorting.
(Francisco Fernández Castaño, Adrien Grand)
* GITHUB#12066: Retire/deprecate instance method MMapDirectory#setUseUnmap().
Like the new setting for MemorySegments, this feature is enabled by default and
can only be disabled globally by passing the following sysprop on Java command line:
"-Dorg.apache.lucene.store.MMapDirectory.enableUnmapHack=false" (Uwe Schindler)
* GITHUB#12038: Deprecate non-NRT replication support.
Please migrate to org.apache.lucene.replicator.nrt instead. (Robert Muir)
* GITHUB#12087: Move DocValuesNumbersQuery from sandbox to NumericDocValuesField#newSlowSetQuery
and SortedNumericDocValuesField#newSlowSetQuery. IntField, LongField, FloatField, and DoubleField
implement newSetQuery with best-practice use of IndexOrDocValuesQuery. (Robert Muir)
* GITHUB#12064: Create new KnnByteVectorField, ByteVectorValues and KnnVectorsReader#getByteVectorValues(String)
that are specialized for byte-sized vectors, and clarify the public API by making a clear distinction
between classes that produce and read float vectors and those that produce and read byte vectors. (Ben Trent)
* GITHUB#12101: Remove VectorValues#binaryValue(). Vectors should only be
accessed through their high-level representation, via
VectorValues#vectorValue(). (Adrien Grand)
* GITHUB#12105: Deprecate KnnVectorField in favour of KnnFloatVectorField,
KnnVectorQuery in favour of KnnFloatVectorQuery, and LeafReader#getVectorValues
in favour of LeafReader#getFloatVectorValues. (Luca Cavanna)
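The exponential timeout checking described in GITHUB#11984 above can be pictured with the following sketch: work is done in windows, the timeout is checked once per window, and each window is 50% larger than the previous one. The class name, the 50% growth factor and the initial window size are assumptions made for illustration; they are not taken from the TimeLimitBulkScorer source.

```java
import java.util.function.BooleanSupplier;
import java.util.function.IntConsumer;

/** Hypothetical sketch of checking a timeout at an exponentially decreasing rate. */
final class ExponentialTimeoutCheck {

  /**
   * Processes ids in [min, max) and returns the first id that was not processed
   * (equal to {@code max} when the whole range completed before the timeout).
   */
  static int process(
      int min, int max, int initialWindow, BooleanSupplier timedOut, IntConsumer work) {
    if (initialWindow < 2) {
      throw new IllegalArgumentException("initialWindow must be >= 2 so that it can grow");
    }
    int window = initialWindow;
    int next = min;
    while (next < max) {
      // One timeout check per window instead of one per id.
      if (timedOut.getAsBoolean()) {
        return next;
      }
      int end = (int) Math.min((long) next + window, max);
      for (int id = next; id < end; id++) {
        work.accept(id);
      }
      next = end;
      // Grow the window by 50%, using long math to avoid int overflow.
      window = (int) Math.min((long) window + (window >> 1), Integer.MAX_VALUE);
    }
    return max;
  }

  public static void main(String[] args) {
    int[] checks = {0};
    int stoppedAt = process(0, 10_000, 100, () -> { checks[0]++; return false; }, id -> {});
    System.out.println("stopped at " + stoppedAt + " after " + checks[0] + " timeout checks");
  }
}
```

The trade-off is that a timeout can be detected up to one window late, in exchange for far fewer clock reads on long-running ranges.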
New Features
---------------------
* GITHUB#11795: Add ByteWritesTrackingDirectoryWrapper to expose metrics for bytes merged, flushed, and overall
write amplification factor. (Marc D'Mello)
* GITHUB#11929: MMapDirectory gives more granular control on which files to
preload. (Adrien Grand, Uwe Schindler)
* GITHUB#11999: MemoryIndex now supports stored fields. (Alan Woodward)
* GITHUB#11997: Add IntField, LongField, FloatField and DoubleField: easy to
use numeric fields that perform well both for filtering and sorting.
(Francisco Fernández Castaño)
* GITHUB#12033: Java 19 foreign memory support is now enabled by default,
no need to pass "--enable-preview" on the command line. If exactly Java 19 is used,
MMapDirectory will mmap Lucene indexes in chunks of 16 GiB (instead of 1 GiB) and
indexes closed while queries are running can no longer crash the JVM.
To disable this feature, pass the following sysprop on Java command line:
"-Dorg.apache.lucene.store.MMapDirectory.enableMemorySegments=false" (Uwe Schindler)
* GITHUB#11869: RangeOnRangeFacetCounts added, supporting numeric range "relationship" faceting over docvalue-stored
ranges. (Marc D'Mello)
* LUCENE-10626 Hunspell: add tools to aid dictionary editing:
analysis introspection, stem expansion and stem/flag suggestion (Peter Gromov)
Improvements
---------------------
* GITHUB#11785: Improve Tessellator performance by delaying calls to the method
#isIntersectingPolygon (Ignacio Vera)
* GITHUB#687: speed up IndexSortSortedNumericDocValuesRangeQuery#BoundedDocIdSetIterator
construction using bkd binary search. (Jianping Weng)
* GITHUB#11985: ExitableTerms to override Terms#getMin and Terms#getMax in order to avoid
iterating through the terms when the wrapped implementation caches such values. (Luca Cavanna)
* GITHUB#11860: Improve storage efficiency of connections in the HNSW graph that Lucene uses for
vector search. (Ben Trent)
* GITHUB#12008: Clean up LongRange#verifyAndEncode logic to remove unnecessary NaN checks. (Greg Miller)
* GITHUB#12003: Minor cleanup/improvements to IndexSortSortedNumericDocValuesRangeQuery. (Greg Miller)
* GITHUB#12016: Upgrade lucene/expressions to use antlr 4.11.1 (Andriy Redko)
* GITHUB#12034: Remove null check in IndexReaderContext#leaves() usages (Erik Pellizzon)
* GITHUB#12070: Compound file creation is no longer subject to merge throttling.
(Adrien Grand)
Bug Fixes
---------------------
* GITHUB#11726: Indexing term vectors on large documents could fail due to
trying to apply a dictionary whose size is greater than the maximum supported
window size for LZ4. (Adrien Grand)
* GITHUB#11768: Taxonomy and SSDV faceting now correctly break ties by preferring smaller ordinal
values. (Greg Miller)
* GITHUB#11907: Fix latent casting bugs in BKDWriter. (Ben Trent)
* GITHUB#11954: Remove QueryTimeout#isTimeoutEnabled method and move check to caller. (Shubham Chaudhary)
* GITHUB#11950: Fix NPE in BinaryRangeFieldRangeQuery variants when the queried field doesn't exist
in a segment or is of the wrong type. (Greg Miller)
* GITHUB#11990: PassageSelector now has a larger minimum size for its priority queue,
so that subsequent passage merges don't mean that we return too few passages in
total. (Alan Woodward, Dawid Weiss)
* GITHUB#11986: Fix algorithm that chooses the bridge between a polygon and a hole when there is
a common vertex. (Ignacio Vera)
* GITHUB#12020: Fixes bug whereby very flat polygons can incorrectly contain intersecting geometries. (Craig Taverner)
* GITHUB#12058: Fix detection of Hotspot in TestRamUsageEstimator so it works with OpenJ9.
(Uwe Schindler)
* GITHUB#12046: Fix out-of-bounds error in CombinedFieldQuery#addTerm. (Lu Xugang)
* GITHUB#12072: Fix exponential runtime for nested BooleanQuery#rewrite when a
BooleanClause is non-scoring. (Ben Trent)
* GITHUB#11807: Don't rewrite queries in unified highlighter. (Alan Woodward)
* GITHUB#12088: WeightedSpanTermExtractor should not throw UnsupportedOperationException
when it encounters a FieldExistsQuery. (Alan Woodward)
* GITHUB#12084: Same bound with fallbackQuery. (Lu Xugang)
* GITHUB#12077: WordBreakSpellChecker now correctly respects maxEvaluations (hossman)
Optimizations
---------------------
* GITHUB#11738: Optimize MultiTermQueryConstantScoreWrapper when a term is present that matches all
docs in a segment. (Greg Miller)
* GITHUB#11735: KeywordRepeatFilter + OpenNLPLemmatizer always drops last token of a stream.
(Luke Kot-Zaniewski)
* GITHUB#11771: KeywordRepeatFilter + OpenNLPLemmatizer sometimes arbitrarily exits token stream.
(Luke Kot-Zaniewski)
* GITHUB#11803: DrillSidewaysScorer has been improved to leverage "advance" instead of "next" where
possible, and splits out first and second phase checks to delay match confirmation. (Greg Miller)
* GITHUB#11828: Tweak TermInSetQuery "dense" optimization to only require all terms present in a
given field to match a term (rather than all docs in a segment). This is consistent with
MultiTermQueryConstantScoreWrapper. (Greg Miller)
* GITHUB#11876: Use ByteArrayComparator to speed up PointInSetQuery in single dimension case.
(Guo Feng)
* GITHUB#11880: Use ByteArrayComparator to speed up BinaryRangeFieldRangeQuery, RangeFieldQuery
LatLonPointDistanceFeatureQuery and CheckIndex. (Guo Feng)
* GITHUB#11881: Further optimize drill-sideways scoring by specializing the single dimension case
and borrowing some concepts from "min should match" scoring. (Greg Miller)
* GITHUB#11884: Simplify the logic of matchAll() in IndexSortSortedNumericDocValuesRangeQuery. (Lu Xugang)
* GITHUB#11895: count() in BooleanQuery can now terminate early. (Lu Xugang)
* GITHUB#11972: `IndexSortSortedNumericDocValuesRangeQuery` can now also
optimize query execution with points for descending sorts. (Adrien Grand)
* GITHUB#12006: Compare ints instead of using ArrayUtil#compareUnsigned4 in LatlonPointQueries. (Guo Feng)
* GITHUB#12011: Minor speedup to flushing long postings lists when an index
sort is configured. (Adrien Grand)
* GITHUB#12017: Aggressive count in BooleanWeight. (Lu Xugang)
* GITHUB#12079: Faster merging of 1D points. (Adrien Grand)
* GITHUB#12081: Small merging speedup on sorted indexes. (Adrien Grand)
* GITHUB#12078: Enhance XXXField#newRangeQuery. (Lu Xugang)
* GITHUB#11857, GITHUB#11859, GITHUB#11893, GITHUB#11909: Hunspell: improved suggestion performance (Peter Gromov)
Other
---------------------
* GITHUB#11856: Fix nanos to millis conversion for tests (Marios Trivyzas)
* LUCENE-10423: Remove usages of System.currentTimeMillis() from tests. (Marios Trivyzas)
* GITHUB#11811: Upgrade google java format to 1.15.0 (Dawid Weiss)
* GITHUB#11834: Upgrade forbiddenapis to version 3.4. (Uwe Schindler)
* LUCENE-10635: Ensure test coverage for WANDScorer by using a test query. (Zach Chen, Adrien Grand)
* GITHUB#11752: Added interface to relate a LatLonShape with another shape represented as Component2D. (Navneet Verma)
* GITHUB#11983: Make constructors for OffsetFromPositions and OffsetsFromMatchIterator
public. (Alan Woodward)
* LUCENE-10546: Update Faceting user guide. (Egor Potemkin)
* GITHUB#12099: Introduce support in KnnVectorQuery for getters. (Alessandro Benedetti)
Build
---------------------
* GITHUB#11886: Upgrade to gradle 7.5.1 (Dawid Weiss)
======================== Lucene 9.4.2 =======================
Bug Fixes
---------------------
* GITHUB#11905: Fix integer overflow when seeking the vector index for connections in a single segment.
This addresses a bug that was introduced in 9.2.0 where having many vectors is not handled well
in the vector connections reader.
* GITHUB#11939: Fix incorrect cost calculation in DocIdSetBuilder after upgradeToBitSet when doc list is growing.
This addresses a bug where the cost of TermRangeQuery/TermInSetQuery and some other queries will be highly underestimated.
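The class of bug fixed by GITHUB#11905 (integer overflow when computing a file offset) follows a common pattern that can be sketched in a few lines. The names `ordinal` and `bytesPerEntry` are hypothetical and are not taken from the vector connections reader; the sketch only shows why the multiplication has to be widened to `long` before it happens.

```java
/** Hypothetical sketch of an int-overflow bug in offset arithmetic; not Lucene code. */
final class OffsetOverflow {

  /** Buggy variant: the product is computed in 32 bits and wraps before widening. */
  static long offsetWithIntMath(int ordinal, int bytesPerEntry) {
    return ordinal * bytesPerEntry;
  }

  /** Fixed variant: one operand is widened first, so the product is computed in 64 bits. */
  static long offsetWithLongMath(int ordinal, int bytesPerEntry) {
    return (long) ordinal * bytesPerEntry;
  }

  public static void main(String[] args) {
    int ordinal = 20_000_000; // many entries in a single segment
    int bytesPerEntry = 128;
    System.out.println("int math:  " + offsetWithIntMath(ordinal, bytesPerEntry));
    System.out.println("long math: " + offsetWithLongMath(ordinal, bytesPerEntry));
  }
}
```

With 20,000,000 entries of 128 bytes each, the true offset of 2,560,000,000 exceeds `Integer.MAX_VALUE`, so the 32-bit product wraps to a negative value while the 64-bit product stays correct.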
Improvements
---------------------
* GITHUB#11912, GITHUB#11918: Port generic exception handling from MemorySegmentIndexInput
to ByteBufferIndexInput. This also adds the invalid position while seeking or reading
to the exception message. Allows better debugging and analysis of bugs like GITHUB#11905.
(Uwe Schindler, Robert Muir)
* GITHUB#11916: Improve CheckIndex to be more thorough for vectors. (Ben Trent)
======================== Lucene 9.4.1 =======================
Bug Fixes
---------------------
* GITHUB#11858: Fix kNN vectors format validation on large segments. This
addresses a regression in 9.4.0 where validation could fail, preventing
further writes or searches on the index. (Julie Tibshirani)
======================== Lucene 9.4.0 =======================
API Changes
---------------------
* LUCENE-10577: Add VectorEncoding to enable byte-encoded HNSW vectors (Michael Sokolov, Julie Tibshirani)
New Features
---------------------
* LUCENE-10654: Add new ShapeDocValuesField for LatLonShape and XYShape. (Nick Knize)
* LUCENE-10629: Support match set filtering with a query in MatchingFacetSetCounts. (Stefan Vodita, Shai Erera)
* LUCENE-10633: SortField#setOptimizeSortWithIndexedData and
SortField#getOptimizeSortWithIndexedData were introduced to provide
an option to disable sort optimization for various sort fields. (Mayya Sharipova)
* GITHUB#912: Java 19 foreign memory support was added. Applications started
with command line parameter "java --enable-preview" will automatically use the new
foreign memory API of Java 19 to access indexes on disk with MMapDirectory. This is
an opt-in feature and requires an explicit Java command line flag! When enabled, Lucene logs
a notice using java.util.logging. Please test thoroughly and report bugs/slowness to Lucene's
mailing list. When the new API is used, MMapDirectory will mmap Lucene indexes in chunks of
16 GiB (instead of 1 GiB) and indexes closed while queries are running can no longer crash
the JVM. (Uwe Schindler)
Improvements
---------------------
* LUCENE-10592: Build HNSW Graph on indexing. (Mayya Sharipova, Adrien Grand, Julie Tibshirani)
* LUCENE-10207: TermInSetQuery can now provide a ScoreSupplier with cost estimation, making it
usable in IndexOrDocValuesQuery. (Greg Miller)
* LUCENE-10216: Use MergePolicy to define and MergeScheduler to trigger the reader merges
required by addIndexes(CodecReader[]) API. (Vigya Sharma, Michael McCandless)
* GITHUB#11715: Add Integer awareness to RamUsageEstimator.sizeOf (Mike Drob)
Optimizations
---------------------
* LUCENE-10661: Reduce memory copy in BytesStore. (luyuncheng)
* GITHUB#1020: Support #scoreSupplier and small optimizations to DocValuesRewriteMethod. (Greg Miller)
* LUCENE-10633: Added support for dynamic pruning to queries sorted by a string
field that is indexed with terms and SORTED or SORTED_SET doc values.
(Adrien Grand)
* LUCENE-10627: Use ByteBuffersDataInput to reduce memory copies when compressing data. (luyuncheng)
* GITHUB#1062: Optimize TermInSetQuery when a term is present that matches all docs in a segment.
(Greg Miller)
Bug Fixes
---------------------
* LUCENE-10663: Fix KnnVectorQuery explain with multiple segments. (Shiming Li)
* LUCENE-10673: Improve check of equality for latitudes for spatial3d GeoBoundingBox (Ignacio Vera)
* LUCENE-10678: Fix potential overflow when building a BKD tree with more than 4 billion points. The overflow
occurs when computing the partition point. (Ignacio Vera)
* LUCENE-10644: Facets#getAllChildren testing should ignore child order. (Yuting Gan)
* LUCENE-10665, GITHUB#11701: Fix classloading deadlock in analysis factories / AnalysisSPILoader
initialization. (Uwe Schindler)
* LUCENE-10674: Ensure BitSetConjDISI returns NO_MORE_DOCS when sub-iterator exhausts. (Jack Mazanec)
* GITHUB#11794: Guard FieldExistsQuery against null pointers (Luca Cavanna)
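LUCENE-10678 above concerns an arithmetic overflow. The sketch below is not the BKD writer's code; it is a generic illustration, with hypothetical names and values, of how multiplying two int counts wraps around before the result is widened to long, and how widening one operand first avoids that.

```java
// Hypothetical illustration, not Lucene code: int multiplication overflow
// when a point count exceeds what fits in 32 bits.
final class PartitionPointOverflow {

  /** Buggy variant: the product is computed in 32 bits, then widened. */
  static long pointCountOverflowing(int leafCount, int pointsPerLeaf) {
    return leafCount * pointsPerLeaf;
  }

  /** Fixed variant: one operand is widened first, so the product is 64-bit. */
  static long pointCount(int leafCount, int pointsPerLeaf) {
    return (long) leafCount * pointsPerLeaf;
  }

  public static void main(String[] args) {
    int leafCount = 1 << 23;  // assumed number of leaves on one side
    int pointsPerLeaf = 512;  // assumed leaf capacity
    System.out.println(pointCountOverflowing(leafCount, pointsPerLeaf)); // prints 0
    System.out.println(pointCount(leafCount, pointsPerLeaf));            // prints 4294967296
  }
}
```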
Build
---------------------
* GITHUB#11720: Upgrade randomizedtesting to 2.8.1 (potential fix for odd wall-clock-related
timeout failures). (Dawid Weiss)
* LUCENE-10669: The build should be more helpful when generated resources are touched (Dawid Weiss)
Other
---------------------
* LUCENE-10559: Add Prefilter Option to KnnGraphTester (Kaival Parikh)
======================= Lucene 9.3.0 =======================
API Changes
---------------------
* LUCENE-10603: SortedSetDocValues#NO_MORE_ORDS marked @deprecated in favor of iterating with
SortedSetDocValues#docValueCount(). (Greg Miller)
* GITHUB#978: Deprecate (remove in Lucene 10) obsolete constants in oal.util.Constants; remove
code which is no longer executed after Java 9. (Uwe Schindler)
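The iteration change in LUCENE-10603 can be sketched as follows. To keep the example self-contained, OrdSource is a hypothetical stand-in that mimics only the two methods involved, nextOrd() and docValueCount(). It is not the Lucene class, and real code would first position a SortedSetDocValues instance on a document.

```java
// Self-contained sketch; OrdSource is a hypothetical stand-in, not Lucene's class.
final class DocValueCountIteration {

  /** Mimics the per-document ordinals of a sorted-set doc values field. */
  static final class OrdSource {
    static final long NO_MORE_ORDS = -1; // sentinel used by the older idiom
    private final long[] ords;
    private int pos;

    OrdSource(long... ords) {
      this.ords = ords.clone();
    }

    int docValueCount() {
      return ords.length;
    }

    long nextOrd() {
      return pos < ords.length ? ords[pos++] : NO_MORE_ORDS;
    }
  }

  /** Older idiom: read until the sentinel is returned. */
  static long sumWithSentinel(OrdSource dv) {
    long sum = 0;
    for (long ord = dv.nextOrd(); ord != OrdSource.NO_MORE_ORDS; ord = dv.nextOrd()) {
      sum += ord;
    }
    return sum;
  }

  /** Newer idiom: ask for the count once, then read exactly that many ordinals. */
  static long sumWithCount(OrdSource dv) {
    long sum = 0;
    int count = dv.docValueCount();
    for (int i = 0; i < count; i++) {
      sum += dv.nextOrd();
    }
    return sum;
  }

  public static void main(String[] args) {
    System.out.println(sumWithSentinel(new OrdSource(2, 5, 9)));
    System.out.println(sumWithCount(new OrdSource(2, 5, 9)));
  }
}
```

Both loops visit the same ordinals; the count-based form avoids one extra nextOrd() call and a sentinel comparison per document.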
New Features
---------------------
* LUCENE-10550: Add getAllChildren functionality to facets (Yuting Gan)
* LUCENE-10274: Added facetsets module for high dimensional (hyper-rectangle) faceting
(Shai Erera, Marc D'Mello, Greg Miller)
* LUCENE-10151: Enable timeout support in IndexSearcher. (Deepika Sharma)
Improvements
---------------------
* LUCENE-10078: Merge on full flush is now enabled by default with a timeout of
500ms. (Adrien Grand)
* LUCENE-10585: Facet module code cleanup (copy/paste scrubbing, simplification and some very minor
optimization tweaks). (Greg Miller)
* LUCENE-10603: Update SortedSetDocValues iteration to use SortedSetDocValues#docValueCount().
(Greg Miller, Stefan Vodita)
* LUCENE-10619: Optimize the writeBytes in TermsHashPerField. (Tang Donghai)
* GITHUB#983: AbstractSortedSetDocValueFacetCounts internal code cleanup/refactoring. (Greg Miller)
Optimizations
---------------------
* LUCENE-8519: MultiDocValues.getNormValues should not call getMergedFieldInfos (Rushabh Shah)
* GITHUB#961: BooleanQuery can return quick counts for simple boolean queries.
(Adrien Grand)
* LUCENE-10618: Implement BooleanQuery rewrite rules based on minimumShouldMatch. (Fang Hou)
* LUCENE-10480: Implement Block-Max-Maxscore scorer for 2 clauses disjunction. (Zach Chen, Adrien Grand)
* LUCENE-10606: For KnnVectorQuery, optimize case where filter is backed by BitSetIterator (Kaival Parikh)
* LUCENE-10593: Vector similarity function and NeighborQueue reverse removal. (Alessandro Benedetti)
* GITHUB#984: Use primitive type data structures in FloatTaxonomyFacets and IntTaxonomyFacets
#getAllChildren() internal implementation to avoid some garbage creation. (Greg Miller)
* GITHUB#1010: Specialize ordinal encoding for common case in SortedSetDocValues. (Greg Miller)
* LUCENE-10657: CopyBytes now saves one memory copy on ByteBuffersDataOutput. (luyuncheng)
* GITHUB#1007: Optimize IntersectVisitor#visit implementations for certain bulk-add cases.
(Greg Miller)
* LUCENE-10653: BlockMaxMaxscoreScorer uses heapify instead of individual adds. (Greg Miller)
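LUCENE-10653 above replaces individual adds with a heapify step. The sketch below is a generic min-heap over an int array, not BlockMaxMaxscoreScorer's code: bottom-up heapify builds the heap in O(n), whereas n individual inserts cost O(n log n) in the worst case. All names are illustrative.

```java
// Generic illustration of bottom-up heapify; not taken from Lucene.
final class HeapifySketch {

  /** Rearranges the array in place so that it satisfies the min-heap property. */
  static void heapify(int[] a) {
    // Start at the last node that has a child and sift each node down.
    for (int i = a.length / 2 - 1; i >= 0; i--) {
      siftDown(a, i, a.length);
    }
  }

  private static void siftDown(int[] a, int i, int size) {
    while (true) {
      int left = 2 * i + 1;
      if (left >= size) {
        return; // node i is a leaf
      }
      int right = left + 1;
      int smallest = (right < size && a[right] < a[left]) ? right : left;
      if (a[i] <= a[smallest]) {
        return; // heap property already holds here
      }
      int tmp = a[i];
      a[i] = a[smallest];
      a[smallest] = tmp;
      i = smallest;
    }
  }

  /** Checks that every node is >= its parent. */
  static boolean isMinHeap(int[] a) {
    for (int i = 1; i < a.length; i++) {
      if (a[(i - 1) / 2] > a[i]) {
        return false;
      }
    }
    return true;
  }

  public static void main(String[] args) {
    int[] values = {9, 4, 7, 1, 8, 2};
    heapify(values);
    System.out.println(isMinHeap(values) + " min=" + values[0]);
  }
}
```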
Changes in runtime behavior
---------------------
* GITHUB#978: IndexWriter diagnostics written to index only contain Java's runtime version
and vendor. (Uwe Schindler)
Bug Fixes
---------------------
* LUCENE-10574: Prevent pathological O(N^2) merging. (Adrien Grand)
* LUCENE-10584: Properly support #getSpecificValue for hierarchical dims in SSDV faceting.
(Greg Miller)
* LUCENE-10582: Fix merging of overridden CollectionStatistics in CombinedFieldQuery (Yannick Welsch)
* LUCENE-10563: Fix failure to tessellate complex polygon (Craig Taverner)
* LUCENE-10605: Fix error in 32-bit JVM object alignment gap calculation (Sun Wuqiang)
* GITHUB#956: Make sure KnnVectorQuery applies search boost. (Julie Tibshirani)
* LUCENE-10598: SortedSetDocValues#docValueCount() should be always greater than zero. (Lu Xugang)
* LUCENE-10600: SortedSetDocValues#docValueCount should be an int, not long (Lu Xugang)
* LUCENE-10611: Fix failure when KnnVectorQuery has very selective filter (Kaival Parikh)
* LUCENE-10607: Fix potential integer overflow in maxArcs computations (Tang Donghai)
* GITHUB#986: Fix FieldExistsQuery rewrite when all docs have vectors. (Julie Tibshirani)
* LUCENE-10623: Fix incorrect implementation of docValueCount for SortingSortedSetDocValues (Lu Xugang)
* GITHUB#1028: Fix error in TieredMergePolicy (Lin Jian)
Other
---------------------
* GITHUB#991: Update randomizedtesting to 2.8.0, hppc to 0.9.1, morfologik to 2.1.9. (Dawid Weiss)
* LUCENE-10370: pass proper classpath/module arguments for forking jvms from within tests. (Dawid Weiss)
* LUCENE-10604: Improve ability to test and debug triangulation algorithm in Tessellator.
(Craig Taverner)
* GITHUB#922: Remove unused and confusing FacetField indexing options (Gautam Worah)
Build
---------------------
* GITHUB#976: Exclude Lucene's own JAR files from classpath entries in Eclipse config.
(Uwe Schindler)
======================= Lucene 9.2.0 =======================
API Changes
---------------------
* LUCENE-10325: Facets API extended to support getTopFacets. (Yuting Gan)
* LUCENE-10482: Allow users to create their own DirectoryTaxonomyReaders with empty taxoArrays instead of letting the
taxoEpoch decide. Add a test case that demonstrates the inconsistencies caused when you reuse taxoArrays on older
checkpoints. (Gautam Worah)
* LUCENE-10558: Add new constructors to Kuromoji and Nori dictionary classes to support classpath /
module system usage. It is now possible to use JDK's Class/ClassLoader/Module#getResource(...) apis
and pass their returned URL to dictionary constructors to load resources from Classpath or Module
resources. (Uwe Schindler, Tomoko Uchida, Mike Sokolov)
New Features
---------------------
* LUCENE-10312: Add PersianStemmer based on the Arabic stemmer. (Ramin Alirezaee)
* LUCENE-10539: Return a stream of completions from FSTCompletion. (Dawid Weiss)
* LUCENE-10385: Implement Weight#count on IndexSortSortedNumericDocValuesRangeQuery
to speed up computing the number of hits when possible. (Lu Xugang, Luca Cavanna, Adrien Grand)
* LUCENE-10422: Monitor Improvements: `Monitor` can use a custom `Directory`
implementation. `Monitor` can be created with a readonly `QueryIndex` in order to
have readonly `Monitor` instances. (Niko Usai)
* LUCENE-10456: Implement rewrite and Weight#count for MultiRangeQuery
by merging overlapping ranges. (Jianping Weng)
* LUCENE-10444: Support alternate aggregation functions in association facets. (Greg Miller)
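The range merging mentioned in LUCENE-10456 above can be pictured with a one-dimensional simplification. MultiRangeQuery itself works on multi-dimensional encoded points; the sketch below is only the classic sort-and-sweep over closed long intervals, with hypothetical names, and is not the query's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// One-dimensional illustration of merging overlapping ranges; not Lucene code.
final class RangeMergeSketch {

  /**
   * Merges closed intervals given as {min, max} pairs. Intervals that overlap
   * or touch at an endpoint are combined. The input array is not modified.
   */
  static List<long[]> mergeOverlapping(long[][] ranges) {
    long[][] sorted = new long[ranges.length][];
    for (int i = 0; i < ranges.length; i++) {
      sorted[i] = ranges[i].clone();
    }
    // Sort by lower bound so that overlapping ranges become adjacent.
    Arrays.sort(sorted, (a, b) -> Long.compare(a[0], b[0]));

    List<long[]> merged = new ArrayList<>();
    for (long[] range : sorted) {
      if (!merged.isEmpty() && range[0] <= merged.get(merged.size() - 1)[1]) {
        // Overlaps the previous range: extend its upper bound if needed.
        long[] last = merged.get(merged.size() - 1);
        last[1] = Math.max(last[1], range[1]);
      } else {
        merged.add(range);
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    long[][] ranges = {{1, 5}, {10, 12}, {4, 8}, {12, 15}};
    for (long[] r : mergeOverlapping(ranges)) {
      System.out.println(Arrays.toString(r));
    }
  }
}
```

With fewer, disjoint ranges, counting or matching documents needs fewer comparisons per value.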
Improvements
---------------------
* LUCENE-10229: return -1 for unknown offsets in ExtendedIntervalsSource. Modify highlighting to
work properly with or without offsets. (Dawid Weiss)
* LUCENE-10494: Implement method to bulk add all collection elements to a PriorityQueue.
(Bauyrzhan Sakhariyev)
* LUCENE-10484: Add support for concurrent random sampling by calling
RandomSamplingFacetsCollector#createManager. (Luca Cavanna)
* LUCENE-10467: Throws IllegalArgumentException for Facets#getAllDims and Facets#getTopChildren
if topN <= 0. (Yuting Gan)
* LUCENE-9848: Correctly sort HNSW graph neighbors when applying diversity criterion (Mayya
Sharipova, Michael Sokolov)
* LUCENE-10527: Use 2*maxConn for the last layer in HNSW (Mayya Sharipova)
Optimizations
---------------------
* LUCENE-10555: avoid NumericLeafComparator#iteratorCost repeated initialization
when NumericLeafComparator#setScorer is called. (Jianping Weng)
* LUCENE-10452: Hunspell: call checkCanceled less frequently to reduce the overhead (Peter Gromov)
* LUCENE-10451: Hunspell: don't perform potentially expensive spellchecking after timeout (Peter Gromov)
* LUCENE-10418: More `Query#rewrite` optimizations for the non-scoring case.
(Adrien Grand)
* LUCENE-10436: Deprecate DocValuesFieldExistsQuery, NormsFieldExistsQuery and KnnVectorFieldExistsQuery
in favor of FieldExistsQuery. (Zach Chen, Michael McCandless, Adrien Grand)
* LUCENE-10481: FacetsCollector will not request scores if it does not use them. (Mike Drob)
* LUCENE-10503: Potential speedup for pure disjunctions whose clauses produce
scores that are very close to each other. (Adrien Grand)
* LUCENE-10315: Use SIMD instructions to decode BKD doc IDs. (Guo Feng, Adrien Grand, Ignacio Vera)
* LUCENE-8836: Speed up calls to TermsEnum#lookupOrd on doc values terms enums
and sequences of increasing ords. (Bruno Roustant, Adrien Grand)
* LUCENE-10536: Doc values terms dictionaries now use the first (uncompressed)
term of each block as a dictionary when compressing suffixes of the other 63
terms of the block. (Adrien Grand)
* LUCENE-10411: Add nearest neighbors vectors support to ExitableDirectoryReader.
(Zach Chen, Adrien Grand, Julie Tibshirani, Tomoko Uchida)
* LUCENE-10542: FieldSource exists implementations can avoid value retrieval (Kevin Risden)
* LUCENE-10534: MinFloatFunction / MaxFloatFunction exists check can be slow (Kevin Risden)
* LUCENE-10496: Queries sorted by field now better handle the degenerate case
when the search order and the index order are in opposite directions.
(Jianping Weng)
* LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle
ordToDoc in HNSW vectors (Lu Xugang)
* LUCENE-10488: Facets#getTopDims optimized for taxonomy faceting and
ConcurrentSortedSetDocValuesFacetCounts. (Yuting Gan)
Bug Fixes
---------------------
* LUCENE-10477: Highlighter: WeightedSpanTermExtractor.extractWeightedSpanTerms now calls Query#rewrite
multiple times if necessary. (Christine Poerschke, Adrien Grand)
* LUCENE-10491: A correctness bug in the way scores are provided within TaxonomyFacetSumValueSource
was fixed. (Michael McCandless, Greg Miller)
* LUCENE-10466: Ensure IndexSortSortedNumericDocValuesRangeQuery handles sort field
types besides LONG (Andriy Redko)
* LUCENE-10292: Suggest: Fix AnalyzingInfixSuggester / BlendedInfixSuggester to correctly return
existing lookup() results during concurrent build(). Fix other FST based suggesters so that
getCount() returned results consistent with lookup() during concurrent build(). (hossman)
* LUCENE-10508: Fixes some edge cases where GeoArea were built in a way that vertical planes
could not evaluate their sign, either because the planes were the same or the center between those
planes was lying in one of the planes. (Ignacio Vera)
* LUCENE-10495: Fix return statement of siblingsLoaded() in TaxonomyFacets. (Yuting Gan)
* LUCENE-10533: SpellChecker.formGrams is missing bounds check (Kevin Risden)
* LUCENE-10529: Properly handle when TestTaxonomyFacetAssociations test case randomly indexes
no documents instead of throwing an NPE. (Greg Miller)
* LUCENE-10470: Check if polygon has been successfully tessellated before we fail (we are failing some valid
tessellations) and allow filtering edges that fold on top of the previous one. (Ignacio Vera)
* LUCENE-10530: Avoid floating point precision test case bug in TestTaxonomyFacetAssociations.
(Greg Miller)
* LUCENE-10552: KnnVectorQuery has incorrect equals/hashCode. (Lu Xugang)
* LUCENE-10558: Restore behaviour of deprecated Kuromoji and Nori dictionary constructors for
custom dictionary support. Please also use new URL-based constructors for classpath/module
system resources. (Uwe Schindler, Tomoko Uchida, Mike Sokolov)
* LUCENE-10564: Make sure SparseFixedBitSet#or updates ramBytesUsed. (Julie Tibshirani)
Build
---------------------
* GITHUB#768: Upgrade forbiddenapis to version 3.3. (Uwe Schindler)
* GITHUB#890: Detect CI builds on Github or Jenkins and enable errorprone. (Uwe Schindler, Dawid Weiss)
* LUCENE-10532: Remove LuceneTestCase.Slow annotation. All tests can be fast. (Robert Muir)
Other
---------------------
* LUCENE-10526: Test-framework: Add FilterFileSystemProvider.wrapPath(Path) method for mock filesystems
to override if they need to extend the Path implementation. (Gautam Worah, Robert Muir)
* LUCENE-10525: Test-framework: Add detection of illegal windows filenames to WindowsFS. (Gautam Worah)
* LUCENE-10541: Test-framework: limit the default length of MockTokenizer tokens to 255.
(Robert Muir, Uwe Schindler, Tomoko Uchida, Dawid Weiss)
* GITHUB#854: Allow linking to GitHub pull requests from CHANGES. (Tomoko Uchida, Jan Høydahl)
======================= Lucene 9.1.0 =======================
API Changes
---------------------
* LUCENE-10244: MultiCollector::getCollectors is now public, allowing users to access the wrapped
collectors. (Andriy Redko)
* LUCENE-10197: UnifiedHighlighter now has a Builder to construct it. The UH's setters are now
deprecated. (Animesh Pandey, David Smiley)
* LUCENE-10301: the test framework is now a module. All the classes have been moved from
org.apache.lucene.* to org.apache.lucene.tests.* to avoid package name conflicts with the
core module. (Dawid Weiss)
* LUCENE-10183: KnnVectorsWriter#writeField to take KnnVectorsReader instead of VectorValues.
(Zach Chen, Michael Sokolov, Julie Tibshirani, Adrien Grand)
* LUCENE-10335: Deprecate helper methods for resource loading in IOUtils and StopwordAnalyzerBase
that are not compatible with module system (Class#getResourceAsStream() and Class#getResource()
are caller sensitive in Java 11). Instead add utility method IOUtils#requireResourceNonNull(T)
to test existence of resource based on null return value. (Uwe Schindler, Dawid Weiss)
* LUCENE-10349: WordListLoader methods now return unmodifiable CharArraySets. (Uwe Schindler)
* LUCENE-10377: SortField.getComparator() has changed signature. The second parameter is now
a boolean indicating whether or not skipping should be enabled on the comparator.
(Alan Woodward)
* LUCENE-10381: Require users to provide FacetsConfig for SSDV faceting. (Greg Miller)
* LUCENE-10368: IntTaxonomyFacets has been deprecated and is no longer a supported extension point
for user-created faceting implementations. (Greg Miller)
* LUCENE-10400: Add constructors that take external resource Paths to dictionary classes in Kuromoji and Nori:
ConnectionCosts, TokenInfoDictionary, and UnknownDictionary. Old constructors that take resource scheme and
resource path in those classes are deprecated; These are replaced with the new constructors and planned to be
removed in a future release. (Tomoko Uchida, Uwe Schindler, Mike Sokolov)
* LUCENE-10050: Deprecate DrillSideways#search(Query, Collector) in favor of
DrillSideways#search(Query, CollectorManager). This reflects the change (LUCENE-10002) being made in
IndexSearcher#search that trends towards using CollectorManagers over Collectors. (Gautam Worah)
* LUCENE-10420: Move functional interfaces in IOUtils to top-level interfaces.
(David Smiley, Uwe Schindler, Dawid Weiss, Tomoko Uchida)
* LUCENE-10398: Add static method for getting Terms from LeafReader. (Spike Liu)
* LUCENE-10440: TaxonomyFacets and FloatTaxonomyFacets have been deprecated and are no longer
supported extension points for user-created faceting implementations. (Greg Miller)
* LUCENE-10431: MultiTermQuery.setRewriteMethod() has been deprecated, and constructor
parameters for the various implementations added. (Alan Woodward)