Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-5524: [C++] Turn off PARQUET_BUILD_ENCRYPTION in CMake if OpenSSL not found #4494

Merged
merged 5 commits into from
Jun 16, 2019

Conversation

nealrichardson
Copy link
Member

@@ -731,6 +733,12 @@ if(ARROW_WITH_BROTLI)
include_directories(SYSTEM ${BROTLI_INCLUDE_DIR})
endif()

if(PARQUET_BUILD_ENCRYPTION)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think invoking find_package twice (again on line 743) is redundant. Why not modify the later one to QUIET and disable both PARQUET_BUILD_ENCRYPTION and ARROW_WITH_GRPC if OPENSSL_FOUND is false

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a question of what is the appropriate default build configuration.

  • If -DARROW_PARQUET=ON, and OpenSSL is available, build with encryption, otherwise OFF
  • If user opts in to building encryption (is there an option for this now? consistently named and documented?) and OpenSSL is not available, fail with error
  • If user passes ARROW_FLIGHT=ON, then OpenSSL is required so not finding it should trigger an error also

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that was more my thinking. I didn't think the same behavior was acceptable for ARROW_WITH_GRPC.

I'll take a stab at implementing the logic that Wes outlined. That sounds right to me.

@nealrichardson
Copy link
Member Author

nealrichardson commented Jun 7, 2019

Ok, I've reworked it along these lines:

  • New CMake flag ARROW_REQUIRE_ENCRYPTION, default OFF. If on, OpenSSL must be found and it will be used as needed (Flight and Parquet encryption).
  • If ARROW_FLIGHT is ON, ARROW_REQUIRE_ENCRYPTION is turned ON
  • If ARROW_PARQUET is ON, CMake will look for OpenSSL and use it if it is found, but it will not fail if it is not found.
  • PARQUET_BUILD_ENCRYPTION has been removed. CMake warns if you specify it but it does not break.

I.e., the (recently added) default behavior of having Parquet encryption on by default is preserved, but we degrade gracefully if OpenSSL isn't found. To require Parquet encryption and fail if OpenSSL isn't found, turn on ARROW_REQUIRE_ENCRYPTION.

I manually tested all of the branches of this logic locally.

@nealrichardson
Copy link
Member Author

I can fix the lint error, but I'm not clear on how I would have caused the other Travis failure, https://travis-ci.org/apache/arrow/jobs/542945692#L1348-L1364

@@ -233,6 +233,8 @@ Note that this requires linking Boost statically" OFF)

define_option(ARROW_WITH_ZLIB "Build with zlib compression" ON)

define_option(ARROW_REQUIRE_ENCRYPTION "Fail if OpenSSL is not found" OFF)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be PARQUET_REQUIRE_ENCRYPTION and be added to the PARQUET option list?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

never mind, I see that this is used for ARROW_FLIGHT as well.

Copy link
Contributor

@majetideepak majetideepak Jun 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should use the previous define_option(PARQUET_BUILD_ENCRYPTION "Build Parquet with encryption support" ON). We already have the other flag ARROW_FLIGHT where encryption is required.
The build logic will be:

  1. Turn OFF PARQUET_BUILD_ENCRYPTION if ARROW_PARQUET is OFF
  2. Error if either PARQUET_BUILD_ENCRYPTION or ARROW_FLIGHT is ON and OpenSSL is not found.
  3. By default enable PARQUET_BUILD_ENCRYPTION if OpenSSL is available and ARROW_PARQUET is ON
if (PARQUET_BUILD_ENCRYPTION AND NOT ARROW_PARQUET)
  set(PARQUET_BUILD_ENCRYPTION OFF)
endif()
if(PARQUET_BUILD_ENCRYPTION OR ARROW_FLIGHT)
  # This must work
  find_package(OpenSSL REQUIRED)
  set(ARROW_USE_OPENSSL ON)
elseif(ARROW_PARQUET)
  # Enable Parquet encryption if OpenSSL is there, but don't fail if it's not
  find_package(OpenSSL QUIET)
  CMAKE_DEPENDENT_OPTION(ARROW_USE_OPENSSL "Build with OpenSSL support" ON
                         "OPENSSL_FOUND" OFF)
  # use this flag to include the relevant C++ encryption support files and headers for build
  CMAKE_DEPENDENT_OPTION(PARQUET_BUILD_ENCRYPTION "Build Parquet with encryption support" ON
                         "OPENSSL_FOUND" OFF)
endif()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC what you're proposing is equivalent to the status quo because the default of PARQUET_BUILD_ENCRYPTION is ON. So if I build with Parquet support but OpenSSL isn't found, the build fails. That's the behavior I was trying to remedy here, and why I introduced a new flag with REQUIRE in the name to make explicit that if you turn it on, it must work (but it's OFF by default).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! yes! How about setting the default PARQUET_BUILD_ENCRYPTION to OFF in my proposal?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My intention here is to have a CMake variable with a name that describes the parquet encryption support.

@majetideepak
Copy link
Contributor

The other Travis Failure was just environmental. I relaunched that test and it passes.

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good on the principle, just a comment.

elseif(ARROW_PARQUET)
# Enable Parquet encryption if OpenSSL is there, but don't fail if it's not
find_package(OpenSSL QUIET)
CMAKE_DEPENDENT_OPTION(ARROW_USE_OPENSSL "Build with OpenSSL support" ON
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is the appropriate function. The doc says: """This macro presents an option to the user only if a set of other conditions are true."""
But we are always exposing ARROW_USE_OPENSSL to the user. I think you simply want to set ARROW_USE_OPENSSL conditionally instead.

@majetideepak
Copy link
Contributor

@nealrichardson is there anything else blocking you on this? I want to get this in early since I plan to base other OpenSSL checks off this. I can help with the suggested changes if you are tied up with something else. Please let me know. Thanks!

@nealrichardson
Copy link
Member Author

@majetideepak Thanks for the nudge. Can we compromise on calling the flag PARQUET_REQUIRE_ENCRYPTION or similar? I was wary of just changing the default of PARQUET_BUILD_ENCRYPTION because it's not descriptive of its behavior, i.e. in your last proposal, even with it OFF you'd still get the ON behavior if OpenSSL were available. And with REQUIRE in the name, it makes clear to me that it will fail if the requirement isn't met.

@nealrichardson
Copy link
Member Author

@majetideepak just pushed along those lines, LMK what you think. I haven't yet gone back and re-tested all of the branches of this logic as I did before but I can if you agree with this approach.

@pitrou I believe I've addressed your point too, thanks for catching that.

@nealrichardson
Copy link
Member Author

@ursabot crossbow package wheel conda linux

@ursabot
Copy link

ursabot commented Jun 14, 2019

AMD64 Conda Crossbow (#19778) builder has been succeeded.

Revision: d279123

Submitted crossbow builds: ursa-labs/crossbow @ ursabot-11

Copy link
Contributor

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 LGTM

@majetideepak
Copy link
Contributor

AMD64 Conda Python 2.7 is failing on a different issue. I will merge this by the end of today if there are no other comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants