-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-5524: [C++] Turn off PARQUET_BUILD_ENCRYPTION in CMake if OpenSSL not found #4494
ARROW-5524: [C++] Turn off PARQUET_BUILD_ENCRYPTION in CMake if OpenSSL not found #4494
Conversation
@@ -731,6 +733,12 @@ if(ARROW_WITH_BROTLI) | |||
include_directories(SYSTEM ${BROTLI_INCLUDE_DIR}) | |||
endif() | |||
|
|||
if(PARQUET_BUILD_ENCRYPTION) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think invoking find_package
twice (again on line 743) is redundant. Why not modify the later one to QUIET and disable both PARQUET_BUILD_ENCRYPTION
and ARROW_WITH_GRPC
if OPENSSL_FOUND
is false
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a question of what is the appropriate default build configuration.
- If -DARROW_PARQUET=ON, and OpenSSL is available, build with encryption, otherwise OFF
- If user opts in to building encryption (is there an option for this now? consistently named and documented?) and OpenSSL is not available, fail with error
- If user passes ARROW_FLIGHT=ON, then OpenSSL is required so not finding it should trigger an error also
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that was more my thinking. I didn't think the same behavior was acceptable for ARROW_WITH_GRPC
.
I'll take a stab at implementing the logic that Wes outlined. That sounds right to me.
Ok, I've reworked it along these lines:
I.e., the (recently added) default behavior of having Parquet encryption on by default is preserved, but we degrade gracefully if OpenSSL isn't found. To require Parquet encryption and fail if OpenSSL isn't found, turn on I manually tested all of the branches of this logic locally. |
I can fix the lint error, but I'm not clear on how I would have caused the other Travis failure, https://travis-ci.org/apache/arrow/jobs/542945692#L1348-L1364 |
@@ -233,6 +233,8 @@ Note that this requires linking Boost statically" OFF) | |||
|
|||
define_option(ARROW_WITH_ZLIB "Build with zlib compression" ON) | |||
|
|||
define_option(ARROW_REQUIRE_ENCRYPTION "Fail if OpenSSL is not found" OFF) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be PARQUET_REQUIRE_ENCRYPTION
and be added to the PARQUET option list?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
never mind, I see that this is used for ARROW_FLIGHT
as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use the previous define_option(PARQUET_BUILD_ENCRYPTION "Build Parquet with encryption support" ON)
. We already have the other flag ARROW_FLIGHT
where encryption is required.
The build logic will be:
- Turn OFF PARQUET_BUILD_ENCRYPTION if ARROW_PARQUET is OFF
- Error if either
PARQUET_BUILD_ENCRYPTION
orARROW_FLIGHT
is ON and OpenSSL is not found. - By default enable PARQUET_BUILD_ENCRYPTION if OpenSSL is available and ARROW_PARQUET is ON
if (PARQUET_BUILD_ENCRYPTION AND NOT ARROW_PARQUET)
set(PARQUET_BUILD_ENCRYPTION OFF)
endif()
if(PARQUET_BUILD_ENCRYPTION OR ARROW_FLIGHT)
# This must work
find_package(OpenSSL REQUIRED)
set(ARROW_USE_OPENSSL ON)
elseif(ARROW_PARQUET)
# Enable Parquet encryption if OpenSSL is there, but don't fail if it's not
find_package(OpenSSL QUIET)
CMAKE_DEPENDENT_OPTION(ARROW_USE_OPENSSL "Build with OpenSSL support" ON
"OPENSSL_FOUND" OFF)
# use this flag to include the relevant C++ encryption support files and headers for build
CMAKE_DEPENDENT_OPTION(PARQUET_BUILD_ENCRYPTION "Build Parquet with encryption support" ON
"OPENSSL_FOUND" OFF)
endif()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC what you're proposing is equivalent to the status quo because the default of PARQUET_BUILD_ENCRYPTION
is ON
. So if I build with Parquet support but OpenSSL isn't found, the build fails. That's the behavior I was trying to remedy here, and why I introduced a new flag with REQUIRE
in the name to make explicit that if you turn it on, it must work (but it's OFF
by default).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! yes! How about setting the default PARQUET_BUILD_ENCRYPTION
to OFF
in my proposal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My intention here is to have a CMake variable with a name that describes the parquet encryption support.
The other Travis Failure was just environmental. I relaunched that test and it passes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good on the principle, just a comment.
elseif(ARROW_PARQUET) | ||
# Enable Parquet encryption if OpenSSL is there, but don't fail if it's not | ||
find_package(OpenSSL QUIET) | ||
CMAKE_DEPENDENT_OPTION(ARROW_USE_OPENSSL "Build with OpenSSL support" ON |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is the appropriate function. The doc says: """This macro presents an option to the user only if a set of other conditions are true."""
But we are always exposing ARROW_USE_OPENSSL
to the user. I think you simply want to set ARROW_USE_OPENSSL
conditionally instead.
@nealrichardson is there anything else blocking you on this? I want to get this in early since I plan to base other OpenSSL checks off this. I can help with the suggested changes if you are tied up with something else. Please let me know. Thanks! |
@majetideepak Thanks for the nudge. Can we compromise on calling the flag |
@majetideepak just pushed along those lines, LMK what you think. I haven't yet gone back and re-tested all of the branches of this logic as I did before but I can if you agree with this approach. @pitrou I believe I've addressed your point too, thanks for catching that. |
@ursabot crossbow package wheel conda linux |
AMD64 Conda Crossbow (#19778) builder has been succeeded. Revision: d279123 Submitted crossbow builds: ursa-labs/crossbow @ ursabot-11 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 LGTM
AMD64 Conda Python 2.7 is failing on a different issue. I will merge this by the end of today if there are no other comments. |
Heavily inspired by https://github.com/apache/thrift/blob/master/build/cmake/DefineOptions.cmake