-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
After update to Lucene 9.4 use --enable-preview
on Java==19 (exact) to allow mmap use new JDK APIs
#90526
Comments
Pinging @elastic/es-delivery (Team:Delivery) |
Pinging @elastic/es-core-infra (Team:Core/Infra) |
@rdnm Is this something the core/infra team could look into, given this would probably be implemented in |
Hi @mark-vieira
Possibly, but I don't know how this works. The JVM options must be set before the main server is started as those are not system properties. It is real JVM options. If JvmErgonomics is called from a tool that runs before the main server with the same JVM, it could just do |
Right, that's where it would happen. The purpose of the jvm ergonomics code is to "compute" JVM arguments based on whatever environmental conditions we decide. |
As this can impact performance, we probably want an ES-level option to enable this for 8.6 (default probably off, as this is a preview feature). I also note this is a general option that enables all preview APIs - although the other previews are opt-in or compiler only (record patterns, switch patterns, virtual threads). Given this, we probably want this for 19 only, we can then re-evaluate the previews for 20 when that is released. |
JUL redirection is covered by #90547 |
We probably don't want this "on by default", at least not initially - it seems less than idea to rely on Preview features in production. And as @thecoop says, enabling is a JVM-wide property. Initial steps can be to setup a CI job with Java 19 that passes Folks can also opt-in by the same mechanism, passing |
@uschindler Is it enough to specify this as a direct JVM option for now? |
Actually you misunderstood what Actually the problem here is: I would really like to see people use it because this is now 2.5 years already tested daily on Lucene benchmarks and the performance is same (as of JDK 18 incubation and now also with JDK 19). The only open point is possibly a slowdown on many close/reopen of indexes. But this will also be there in 20 and 21 as it is a feature by design and theres no way to prevent those safepoints. That said, in Java 21 there will be no way back, Lucene will switch to that code without any way back (for safety reason), although it might slightly raise the level of safepoints or slowdown code heavily closing index files. So please enable this feature by default if JDK 19 is used, otherwise theres no way to see results. You can add an option to DISABLE it. I also have customers in the chain who fail with this max_map_count sysctl. Those will be very happy. |
I'm not sure that we've even tested with --enable-preview yet, but regardless we should. Also, and bigger, we'll need to look at the performance characteristics of the Panama Memory segment MMapDirectory implementation - which is yet to be done. (thanks for highlighting the use of thread local handshakes when closing) I somewhat dislike that enabling this in Lucene will amount to enabling all preview features at runtime (from the perspective of the JVM and the Java Platform). But since we don't compile with preview features enabled, then maybe this is not such an issue, but still... leaves a small window open for plugins, etc. To clarify (from a runtime perspective) - preview features are more than just Preview APIs. Setting --enable-preview enables additional class file support in the JVM. |
From the runtime perspective, that is the only thing it does. It allows to load classes with minor version number 65535 when the major version is exactly matching the runtime's bytecode version. Basically javac "marks" all class files using preview APIs or preview features of Javac with that minor version as "flag": "This class is using new features". The instanceof switch statement for example or the records are/were such features. |
Funny detail: If Lucene's build would have removed the 65535 minor version from the class files (I tried this out), the classes load also without preview flag and the new impl in lucene would work out of box without any preview flags. But the risk for Lucene is too large: if APIs change in 20 (and they possibly will), it will produce linkage errors. But it was tempting :-) |
In the case of what we're discussing here, the Lucene Panama Memory segment MMapDirectory implementation, I agree - there is little or no risk. More generally, allowing classes with minor version 65535 is just the start of what is enabled by preview features. As a specific example, when I worked on the records preview, the JVM would only parse the Record_attribute if preview was enabled - it would otherwise fail to parse that attribute if found in another class file (this is a bit of a silly example). I'm not sure of the specifics of Java 19 preview features, so I dunno what could be lurking there (if anything), and how dangerous it could be if ES or a plugin depended upon one. Though these days, most features are implemented through indy and standard runtime bootstraps, rather than new byte codes. But still, enabling preview affects code paths in the JVM and runtime. |
If nothing uses it a JDK with and without preview enabled should exactly the same. If a plugin depends on preview features it would require preview enabled, too, so why should there be any plugin with this? I clearly said in the title: "use --enable-preview with Lucene in Java ==19 (exact)". Any other combination does not need preview enabled. Anyways: if a plugin would like to use those features, it can just use reflection to lookup public APIs, so there's no security problem and not even the good old security manager can prevent you from doing this inside the plugin. The module system is also outside. I think you may mix up incubating and preview. Incubation features are really problematic. Preview APIs are always available (read the spec!) to any code if you use MethodHandles/Reflection/whatever, they are just not available to statically compiled code: So any code can use for example MemorySegment with Java 19 - by reflection. If I would have known about your problems with it, I would have stripped the MR-JAR classes in Lucene off their preview bit (a few lines of our JAR packaging in Gradle) and would have written the code to enable it with a simple if/then/else based on Java version. But do whatever you think is good, but please don't complain when JDK 21 comes out and no productive tests were done. As the current code is a frequent issue for crashes and requires direct access to internal APIs, passing a Uwe P.S.: Damn that I did not strip the preview bit! |
I think we agree here. We've been testing on Java 19 since the EA release builds were out but we haven't done any testing with preview features enabled, primarily because we didn't have any code that leveraged them. So before we would make any real decisions here we'll need to actually test this alternative implementation.
I think what @ChrisHegarty is potentially pushing back against is making this behavior the default in Elasticsearch. There is nothing stopping users from enabling this themselves on a Java 19 runtime.
I assume this is hyperbole, but no one would argue for such a long incubation period. In fact, I think Elasticsearch has done a pretty good job of staying ahead of the pace here. We made the decision to baseline Elasticsearch 8.0 on Java 17 shortly after it went GA and we have an open PR to bump the bundled runtime to Java 19 already as well. So I don't think there's any fundamental reluctance to taking advantage of new language and runtime features. The difference here is that historically we've at least waited for these features to go GA first before enabling them for all users. Perhaps as you say, a feature being "preview" is just a matter of semantics, but the point remains that we don't have an precedence here. |
of course this was more pointing into the direction of OpenJDK not Elasticsearch. :-)
The semantics here are clearly focused on allowing the APIs to change and not like "risky". The code for Panama is already largely inside the code modules of Java since already Java 16. Also the current code using MappedByteBuffer goes through the same code paths over and over (because the ScopedMemoryAccess which was the enabler for the new unmapping semantics was also added to MappedByteBuffer and DirectByteBuffer in Java 16, so you already use it). The |
I've add a CI job which runs against Java 19 with |
That one did not work because Gradle itsself won't run with Java 19. In Lucene we only run tests with JDK 18 or JDK 19, not the build itsself. |
Yep, I got it backwards. I've fixed this so that we set the runtime Java home to JDK 19 rather than the compiler. |
Do you have any benchmarks already for comparison. Just for interest to verify my own investigations? |
Thank you @mark-vieira - to confirm explicitly in this thread; in the logs of the aforementioned CI job, I see messages from Lucene indicating that it is using the Panama Memory segment MMapDirectory implementation: |
Are those logs public? Because the generated files like log files can't be seen from Jenkins. |
FTR - Being intimately involved in JEP 11, JEP 12, the Panama Foreign Memory API, and the core of the JDK, I am fully aware of the impact of non-final features on the JDK and the Java Platform. @uschindler I am largely "with you" when it comes to the very well constrained use in this particular case - the Lucene Memory segment IndexInput implementation. Where I am raising a concern is with the broad impact of enabling preview features JVM-wide by default (this is where we are not in agreement). We now have a CI job that exercises the new implementation - great.
I am not aware of any, but I will do what I can to get some perf numbers. I'm very interested in seeing this.
I'm not sure if this comment is serious or a joke (or somewhere in between). If the former, then please raise it over on the Lucene project. |
I don't think that the individual test-specific result logs are public, just the overall console output from the Gradle run - which is relatively silent for successful test runs. This is the reason I posted an explicit message here confirming that the job is testing what we expect it to. I'm not sure what (if anything) we can do about this, but happy to make enquiries. |
I agree with @ChrisHegarty. Speaking more broadly, do we want ES to, by default, use preview features in the JVM? Preview features are, by definition, not finalised:
The feature may be code-complete, but that doesn't mean it's ready to use in production settings yet. If it were, it would not be a preview feature. There may be obscure bugs still around, or usability problems that need to be updated in a breaking way in the next JVM version - so if users were to update the JVM underneath ES, it could randomly break. We are absolutely not against users having the option to use the new APIs, with all the benefits they bring, and if they wish to do so, we are happy to provide functionality so they can turn it on on their own systems. But ES is used in many thousands of deployments, processing terabytes of data that many companies critically depend on. Should we, by default (and so making the decision on our customers behalf), use features that are not yet finalised in the JVM in those production systems? We would need a very, very strong reason to do so, and I don't think we've met that bar here. |
Hi, I would suggest to do the following:
Another idea would be to enable only that option if you run Elastcisearch in Develope and not Production mode!?
Thanks!
That was a joke :-) But it is tempting to strip the preview bit! Because we use reflection/methodhandles to initialize the correct provider class that creates the IndexInputs anyways, we could also force to use the new impl when exactly Java 19 is detected. But in Lucene 9.4 I wanted to have it opt-in which perfectly fits the When discussing about this with Robert Muir in summer we discussed all those options :-) Uwe |
The requirement to have an opt-in is good, but why does it have to be with |
The problem is that the user has to enable preview anyways, because we don't patch away the classfile minor version. If we would compile the code with java 19, then remove preview bit, then we could do this. |
Right. That is my question - why do you not do this (given your arguments so far in this thread)? Why impose the system-wide enablement of all preview features in all contexts of the running JVM, if you just want to enable one small code path. |
We can think about this for next version (Java 20). I have to still verify if there are any other problems with it. Of course the Lucene bootstrapping code needs to be changed. |
I checked the preview features available in 19:
From that list: Other than virtual threads theres nothing outsde of panama that gets enabled. |
While I'm supportive of testing and benchmarking this new Panama-based directory, I agree with Chris, Mark and Simon on not enabling it in production by default. |
See elastic/elasticsearch#90526 for context
See elastic/elasticsearch#90526 for context
See elastic/elasticsearch#90526 for context
FYI, I forgot to mention this here. Enabling preview will be obsolete with Lucene 9.5: apache/lucene#12033 So Panama will be used automatically when Java 19 (and soon Java 20, see apache/lucene#12042) was detected. Whatever comes first, Lucene 9.5 or Java 20, you will get both variants with Lucene 9.5. So I think we can close this issue? |
We might want to document, or otherwise note, the presence of the |
Neither tests nor benchmarks have noticed a difference after this change, so I'm keen on not documenting this (super expert) system property. Closing. Thanks @uschindler! |
By the way, I would suggest to change the code in this other PR where it applies some regex to the system.out/err output to suppress the messages. Instead, I'd suggest to use log4j-jul module (https://logging.apache.org/log4j/log4j-2.17.0/log4j-jul/index.html) so all Lucene logging goes through Log4J. Then its easy to suppress the warnings and info messages. (I was a bit shocked about this strange filtering of system out) |
FYI, I wanted to show you this commit, unfortunately in this other forked repository. They use log4j-jul module and configure it in LogConfigurator class: You can then simply set loglevel org.apache.lucene.store package to "WARN"/"ERROR"/"NONE" in your logging config and users can configure it as usual. It is generally recommended to enable log4j-jul because also other parts of Lucene now seldomly log messages on failures/error or misconfiguration. Sorry for posting this here. |
Thanks for the additional note @uschindler. I captured the suggestion in #94613 |
Description
Apache Lucene 9.4 will have support for Java 19 Panama APIs to mmap index files (using a MR-JAR). See apache/lucene#912 for more information.
As those APIs are not yet enabled by default in the JDK, we have to still use some opt-in approach, controlled by Java's command line:
MappedByteBuffer
and several hacks which may also risk crushing the JDK if an open index is closed from another thread while a search is running (this is well known). If Java 19 is detected, Lucene will log a warning through JUL whenMMapDirectory
is initialized (see below).--enable-preview
to the Java command line (next to heap settings), it will enable preview APIs in JDK (https://openjdk.org/jeps/12). Lucene detects this and switchesMMapDirectory
and uses a new implementationMemorySegmentIndexInput
for the inputs to use those new APIs (at moment it will also log this as "info" message to JUL). The new APIs are safe and can no longer crush the JVM. But most importantly, all index files are now mapped in portions of 16 GiB instead of 1 GiB into memory. In fact, unless an index is force-merged to one segment, all index files will then consist only of one memory mapping spawning the whole file! This will help Hotspot to further optimize reading as only one implementation onMemorySegmentIndexInput
ist used. In addition, because the number of mappings is dramatically reduced (approximately 5 times less mappings, because the maximum segment size is 5 Gigabytes by default and all such segments now use one instead of 5 mappings). This may allow users to no longer change sysctl (see https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html) and go with defaults of OS. On the other hand users may host more indexes with many more segments on one node.Some TODOs:
java.util.logging
(JUL) to its own log file, so they do not land in console. This can be done with log4j by adding the log4j-jul adapter and install it using a system property in the Bootstrap classes. I have not checked if this is already done. The reason for this is that Apache Lucene now logs some events using java.util.logging since Lucene 9.0. Some of those events areMMapDirectory
messages (e.g., when unmapping was not working) or few others like some module system settings are incorrect. Logging is very seldom, but for this feature it will definitely log using JUL, so it would be good to make sure Elasticsearch redirects JUL logging correctly to its own loggers. This could be a separate issue!--enable-preview
as command line flag if exactly Java 19 is used to start up Elasticsearch. If this is not done, a warning gets logged (see above).Important: Lucene 9.4 only supports this on Java 19 (exactly), because the APIs are in flux. If you start with Java 20, it falls back to the classical
MMapDirectory
. We will add support for Java 20 in a later release. The reason for this is that the class files of new implementation are marked by some special version numbers that make them ONLY compatible to Java 19, not earlier or later, to allow the JDK to apply changes to the API before final release in Java 21. But passing--enable-preview
to later versions won't hurt, so maybe enable it on all versions >= 19.A last note: The downside of this new code is that closing and unmapping an index file gets more heavy (it will trigger an safepoint in the JVM). We have not yet found out how much this impacts servers opening/closing index files a lot. Because of this we would really like Elastic / Elasticsearch to do benchmarking on this, ideally if their users and customers could optionally enable it. But benchmarking should be done now, because with hopefully Java 21, Lucene will use the new implementation by default. Java 20 will be the second and last preview round.
The text was updated successfully, but these errors were encountered: