Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Learnings from the VTK Module Upgrade to modern CMake #337

Open
bartlettroscoe opened this issue Nov 13, 2020 · 16 comments
Open

Learnings from the VTK Module Upgrade to modern CMake #337

bartlettroscoe opened this issue Nov 13, 2020 · 16 comments

Comments

@bartlettroscoe
Copy link
Member

This Issue is to capture things we can learn from the recent upgrade of the Kitware VTK Module system to use modern CMake.

@bartlettroscoe
Copy link
Member Author

Below, I give a conversation that we had about this that I want to archive here. (I have redacted some stuff that is not pertinent).


From: Ben Boeckel
Sent: Thursday, October 15, 2020 9:42 AM
To: Johnson, Seth (ORNL)
Cc: ...
Subject: [EXTERNAL] Re: Virtual meeting: "Migrating from TriBITS to Modern CMake"

On Fri, Oct 09, 2020 at 15:42:22 +0000, Johnson, Seth R. wrote:

Thanks for participating in Wednesday's discussion about TriBITS! ECP
help is still figuring out how to make the slack channel public, so
it's not yet active but I will email you when it is.

In the meantime, I offer the following observations based on our discussion:

  • TriBITS has many features, though not all TriBITS projects want
    or need to use them. It might provide a factor-of-two code/effort
    reduction for the functionality it provides, at the cost of a
    factor-of-ten effort increase for use cases not covered by its
    design. It is a prescriptive framework designed for an idealized
    multi-repository workflow.
  • Projects that use TriBITS tend to have an overlapping set of
    requirements. In my experiences, our needs and use cases have
    increasingly diverged from TriBITS, leading to an increase in
    development/maintenance cost for the build system.
  • I'm not convinced a prescriptive/one-size-fits-all approach is
    best for build systems: in the case of ForTrilinos, I used plain
    CMake when possible and local functions/macros when the CMake got
    repetitive, and at the end of the day ended up with less CMake code
    compared to the TriBITS-based system.
  • We need to identify the overlap between project needs and the
    design goals of a hypothetical TriBITS replacement/rewrite/upgrade.

I've found the same with VTK's upgrade to new CMake code. As long as your practical interface is via targets, everything should interoperate well. The VTK module system transports information for itself via properties on the targets and aren't special other than that. It can consume non-VTK-module targets just fine and non-VTK-module targets can consume VTK's module targets without having to buy-in on the module system itself. If you do, it can help with things like Python wrapping and such though (using VTK's wrapping tools).

There's one nit that can't be expressed in usage requirements (since they're form "if depends_on(A) and depends_on(B) then add_extra_flag()") for the autoinit subsystem. This is handled by the vtk_module_autoinit() function which uses the properties on dependent modules to determine what extra flags need to be generated.

Note that to truly work well, everything should be targets. External packages end up as imported targets. Note that this also means that the resulting install tree is a lot cleaner and less likely to have build-machine paths embedded.

Also, I should note that not everyone on this thread has access to the ECP Slack at all.

--Ben

@bartlettroscoe
Copy link
Member Author

bartlettroscoe commented Dec 4, 2020

From: Ben Boeckel
Sent: Tuesday, October 20, 2020 11:27 AM
To: Bartlett, Roscoe A
Subject: Re: [EXTERNAL] Re: Virtual meeting: "Migrating from TriBITS to Modern CMake"

On Thu, Oct 15, 2020 at 22:40:55 +0000, Bartlett, Roscoe A wrote:

[redacted]

One question that I have is why using targets for everything are
fundamentally better using (global) variables in CMake? For

Global variables are a problem because the namespace is constrained. One cannot easily nest two projects using tribits if they try to communicate and "fight" over which one "owns" global variables (or properties for that matter). VTK does communicate to itself via global properties, but it's an internal mechanism and any conflict is a conflict at the target level anyways (they're also namespaced so as to avoid conflicting with any user-level properties).

Using targets also means that if you have the target, you have the information. No scope jumping or similar shenanigans is necessary.

propagating certain types of information like include directories and
compiler defines, the advantage of using target and target properties
is clear. But for other types of information, it is not so clear to
me. For example, to custom properties on a target automatically get
written into a <Package>Config.cmake file when exporting targets and
loaded on those imported targets when reading in those targets with
find_package(<Package>)?

They can be. See:

VTK doesn't do this because the properties contain information needed at configure time and to do different variables for install and build configurations (primarily for header locations) this properly requires generator expressions. VTK has mechanisms to export the information manually though; it's not super complicated.

--Ben

@bartlettroscoe
Copy link
Member Author

bartlettroscoe commented Dec 4, 2020

From: Bartlett, Roscoe A
Sent: Friday, October 23, 2020 11:49 AM
To: Boeckel, Benjamin Ryan
Cc: Teranishi, Keita; Johnson, Seth (ORNL); Wilke, Jeremiah J
Subject: RE: [EXTERNAL] Re: Virtual meeting: "Migrating from TriBITS to Modern CMake"

Hello Ben,

CCing Keita, Seth, and Jeremy in on this thread ... (I don't know how to conduct a detailed conversation like this on Slack).

Would you have some time to talk with me next week?

Not really, no. The ParaView release needs me to tie some things up before the end of the month.

Would you have some time in early Nov? If not, is there another Kitware staff member with knowledge of the VTK module system to which I could ask these questions?

One question that I have is why using targets for everything are
fundamentally better using (global) variables in CMake? For

Global variables are a problem because the namespace is constrained. One cannot easily nest two projects using tribits if they try to communicate and "fight" over which one "owns" global variables (or properties for that matter).

TriBITS solves that problem by namespacing all variables by the package name such as <Package>_<SOME_VAR_NAME>. That way, as long as package names <Package> are carefully chosen, then there are never any conflicts. This requirement for unique package is spelled out at:

https://tribits.org/doc/TribitsDevelopersGuide.html#tribits-package

(which was violated by the usage of TriBITS in SCALE actually).

For that matter, don't the names of targets need to be globally unique across all packages and modules? So don't you have to prefix them with the Package name <Package> also (like <Package>::all_libs)?

So if you have to prefix targets and global variables, what is the real is the difference? Global and target properties have a disadvantage that you have to read them into a local var before you can use them in any CMake logic. Global vars don't have the problem and so the code is tighter and less verbose.

VTK does communicate to itself via global properties, but it's an internal mechanism and any conflict is a conflict at the target level anyways (they're also namespaced so as to avoid conflicting with any user-level properties).

Right, that is also what TriBITS does, but with namespaced variables.

Using targets also means that if you have the target, you have the information. No scope jumping or similar > shenanigans is necessary.

But again, the target name needs to be namespaces as well. A global var is a global var, there are no scope jumping issues.

One issue that we will have with the TriBITS refactoring to use modern CMake is that is that a "package" and a "target" are not the same thing. A "package" is an aggregation of targets, global variables, and other things. We will create aggregate targets like <Package>::all_libs as per the proposed spec here, so I guess you could make dummy targets for each TriBITS subpackage and package.

propagating certain types of information like include directories and
compiler defines, the advantage of using target and target properties
is clear. But for other types of information, it is not so clear to
me. For example, do custom properties on a target automatically get
written into a <Package>Config.cmake file when exporting targets and
loaded on those imported targets when reading in those targets with
find_package(<Package>)?

They can be. See:

https://cmake.org/cmake/help/latest/prop_tgt/EXPORT_PROPERTIES.html

VTK doesn't do this because the properties contain information needed at configure time and to do different variables or install and build configurations (primarily for header locations) this properly requires generator expressions. VTK has mechanisms to export the information manually though; it's not super complicated.

Thanks,

-Ross

@bartlettroscoe
Copy link
Member Author

From: Boeckel, Ben
Sent: Monday, November 16, 2020 10:25 AM
To: Bartlett, Roscoe A rabartl@sandia.gov
Cc: Zack Galbreath; Bill Hoffman
Subject: Re: [EXTERNAL] Re: Virtual meeting: "Migrating from TriBITS to Modern CMake"

[redacted]

[redacted] Everything after "One question that I have is why using targets for everything are" looks good to me.

--Ben

@bartlettroscoe
Copy link
Member Author

Relates to:

  • SEPW-214

@bartlettroscoe
Copy link
Member Author

CC: @fnrizzi, @marcinwrobel1986, @MikolajZuzek

FYI: Turns out that @ppebay was involved with the VTK project and knows a lot about VTK modules. He can help us distill what we can from the recent VTK Module refactoring effort.

@mathstuf
Copy link

Are there any further questions I can help answer? The author and implementer is right here too :) .

@bartlettroscoe
Copy link
Member Author

Are there any further questions I can help answer?

@mathstuf, thanks for the the offer. What is currently the best documentation describing the updated VTK Module system? Is it:

?

@mathstuf
Copy link

Yes, that's the place to start. All of the "Module" pages here should be useful though. Note that nitty-gritty details are only documented in the implementation itself, not via API docs.

@bartlettroscoe
Copy link
Member Author

@mathstuf, is there a small example project that uses VTK Modules that demonstrates how it works and allows you to play with it?

For example, TriBITS has some of these under:

@mathstuf
Copy link

A few things I see from just poking around (haven't run it locally yet):

  • I recommend lowercasing usage of the API calls. It's just easier to read (alas, CMake still has lots of uppercase anyways, but that's no reason to add even more uppercase everywhere)
  • Dependencies.cmake is CMake code. I banned this in VTK because dependencies should not be subject to arbitrary logic.
  • There still seems to be some magic going on (hello_world automatically links to hello_world_lib?).
  • Why is CMAKE_LEGACY_CYGWIN_WIN32 being messed with? That is…quite smelly to me. Cygwin is not Windows and should not be treated as such.
  • I'm curious why it is not find_package(TriBITS) rather than include(path/to/TriBITS.cmake)?

It is just fine to link a static library against an upstream shared library but the opposite is not true.

Why is this the case?

I question the composability of this right now. It seems that TriBITS completely takes over the project build rather than being able to build just a few bits with TriBITS, combine with a VTK module or two over here, pybind11 wrapping over there, and then tied together at the end. Maybe that's sufficient, but I think this looks hard to compose with other CMake APIs and such, but maybe the examples just aren't expressive enough here. Note that VTK's module system basically provides a toolbox with which to piece things together. Yes, this leaves a lot of boilerplate, but that boilerplate is really hard to get right in such a general manner. It allows VTK modules to consume non-VTK-module-driven projects, be consumed by the same, etc.

A few examples I'd like to see:

  • a TriBITS-using project consuming a TriBITS-unaware dependency (say, libpng)
  • a project that doesn't use TriBITS itself that uses a TriBITS-using dependency

@bartlettroscoe
Copy link
Member Author

@mathstuf,

Thanks for looking over this.

I recommend lowercasing usage of the API calls.

Yes, that would be nice. Upper case was the convention for the CMake build system for Trilinos in 2007 (started by Tim Shead) that TriBITS was created form and it stuck. Do you know of anyone who has written a tool that will lower-case the function, macro and command names and calls in CMake files? If one does not exist, would someone at Kitware like to write one? We have some funding for work like this. Then every project can use that tool to properly lower-case their function, macro, and command names and calls.

Dependencies.cmake is CMake code. I banned this in VTK because dependencies should not be subject to arbitrary logic.

It is actually really handy to allow arbitrary CMake code in these files if used carefully (I can shown examples even in Trilinos). Since 95% of these files just call a single macro tribits_package_define_dependenices() there is not much risk here. And it will break a lot of packages and various use cases if this was changed at this point so we can't change that (and I would not even if we could). But it is true that any mistakes in a Dependencies.cmake file will break the global configure, even if no packages get enabled. So bugs in any Dependencies.cmake file (even for packages that are never enabled) will bring down everything.

There still seems to be some magic going on (hello_world automatically links to hello_world_lib?)

Currently in TriBITS, the executables in a package are automatically linked to the libraries in that package. That is what happens 99% of the time in Trilinos, for example, so it made sense to make that the default to avoid boiler plate and mistakes. (That may lead to overlinking in some cases but that is rare.) Once TriBITS is refactored to use modern CMake for all targets (#299), this can be relaxed and allow optionally passing in the list of libraries to tribits_add_executable() (i.e. by adding an optional LIBS argument).

Why is CMAKE_LEGACY_CYGWIN_WIN32 being messed with?

That was needed for the Windows build to work at one point years ago. It may not be needed with current versions of CMake and Windows, who knows. Once we get a GitHub Actions Windows build added (as part of #363) we can revisit this.

I'm curious why it is not find_package(TriBITS) rather than include(path/to/TriBITS.cmake)?

It is not pointing to an install of TriBITS. Not sure how that model would apply to TriBITS in its source tree given the find rules of find_package() (except to just set TriBITS_ROOT to <base_dir>/tribits which is really the same thing as <Project>_TRIBITS_DIR). That can be addressed as part #114.

It is just fine to link a static library against an upstream shared library but the opposite is not true.

Why is this the case?

You can't link a shared lib to an upstream static lib unless the static lib is built with -fPIC. Where does it say that?

I question the composability of this right now. It seems that TriBITS completely takes over the project build rather than being able to build just a few bits with TriBITS, combine with a VTK module or two over here, pybind11 wrapping over there, and then tied together at the end.

That is the case with the current TriBITS but after refactoring to use modern target-based CMake, more flexibility will be allowed as per #342 (comment). The key to enabling this is adopting this proposed standard. If all CMake code adopts that standard, then all of this should work seamlessly.

But in general, trying to create big CMake projects out of CMakeLists.txt files that are developed by different teams using raw CMake is very risky and error prone unless everyone uses CMake in a very consistent way (at least that has been our past experience). Perhaps modern CMake will improve this situation but all it takes is one sloppy usage of CMAKE_CXX_FLAGS or of how MPI is used and you are screwed.

At some point, you have to give up and just wrap non-compliant CMake code in their own CMake project build and wrap them in a larger CMake project (like a super build with CMake ExternalProject does). That can be done with TriBITS even right now as demonstrated in the WrapExternal package.

By being so being so heavy handed with CMake, TriBITS has allowed single CMake projects with upwards of 500+ packages and subpackages and dozens of repositories by different teams to be integrated in to single CMake builds with one huge ninja dependency file. That has a lot of advantages. But we also want to support breaking these CMake projects into smaller CMake projects with more flexibility. That is the goal of #367.

A few examples I'd like to see:

  • a TriBITS-using project consuming a TriBITS-unaware dependency (say, libpng)

Can you elaborate on this use case?

  • a project that doesn't use TriBITS itself that uses a TriBITS-using dependency

You mean a project using raw CMake pulls in a pre-installed piece of software that uses TriBITS; like a down-stream raw CMake project that uses Trilinos? If that is the case, then sure, that happens all the time with Trilinos with the example shown at:

Note that VTK's module system basically provides a toolbox with which to piece things together. Yes, this leaves a lot of boilerplate, but that boilerplate is really hard to get right in such a general manner. It allows VTK modules to consume non-VTK-module-driven projects, be consumed by the same, etc.

Are any projects using VTK Modules other than the VTK project itself? Are there any simple examples of projects using VTK Modules?

@mathstuf
Copy link

Do you know of anyone who has written a tool that will lower-case the function, macro and command names and calls in CMake files? If one does not exist, would someone at Kitware like to write one? We have some funding for work like this. Then every project can use that tool to properly lower-case their function, macro, and command names and calls.

No, but this should at least be a 90% solution (with some find magic to figure out what files to apply it to):

awk -F\( '/^\s*[A-Z_]*\s*\(/ { $1 = tolower($1); $2 = "(" $2; print; next } { print }'

It does leave some stray whitespace around (namely cmd (, but that is probably much easier to find/replace).

It will only affect calls which are already all uppercase (ignoring things like ExternalProject_Add which are normally mixed case).

But it is true that any mistakes in a Dependencies.cmake file will break the global configure, even if no packages get enabled. So bugs in any Dependencies.cmake file (even for packages that are never enabled) will bring down everything.

It's not so much bugs that I'm worried about but things like:

if (obscure_config_condition)
  list(APPEND deps featuredep)
endif ()

tribits_package_define_dependenices(${deps})

Figuring out why featuredep is not available in a given build is now an exercise in figuring out what's going with that conditional. Instead, VTK provides OPTIONAL_DEPENDS for this kind of stuff (but then also restricts it to being a private dependency because API changing with things like that is also not friendly). APIs can differ based on platform, but that is way easier for downstream consumers to agree on.

Currently in TriBITS, the executables in a package are automatically linked to the libraries in that package. That is what happens 99% of the time in Trilinos, for example, so it made sense to make that the default to avoid boiler plate and mistakes.

OK, makes sense. VTK modules are, conceptually, each a single target, so that's where that difference lies.

Perhaps modern CMake will improve this situation but all it takes is one sloppy usage of CMAKE_CXX_FLAGS or of how MPI is used and you are screwed.

Agreed. I've found that using targets for all of this is way better. For project-wide flags, it's better to make a target that has the INTERFACE set up with everything you want and then just link to it privately from every target in the project. Futzing with the cache flags just leaves you open to other problems (such as conflicting with distro or other packaging mechanisms).

a TriBITS-using project consuming a TriBITS-unaware dependency (say, libpng)
Can you elaborate on this use case?

Yes. If I'm using TriBITS, do I need to do something to wrap every dependency I use or can I just use packages which provide CMake targets?

Are any projects using VTK Modules other than the VTK project itself? Are there any simple examples of projects using VTK Modules?

https://gitlab.kitware.com/vtk/vtk/-/blob/master/Examples/Modules/UsingVTK/CMakeLists.txt

For a more complicated project that doesn't use VTK for its main bits, but uses VTK's module system for the VTK-interacting bits:

https://gitlab.kitware.com/cmb/smtk

@bartlettroscoe
Copy link
Member Author

CC: @keitat

@mathstuf,

No, but this should at least be a 90% solution

It would be nice to also change the name of the function or macro being defined as well so if you have:

FUNCTION(SOME_FUNCTION_NAME ...)
   ...
ENDFUNCTION()

you would like it to be:

function(some_function_name ...)
   ...
endfunction()

I think a carefully written Python script could do this and preserve the desired whitespace.

But if you do this, this tool would need to be run on any branches that you might want to merge in or you will get major merge conflicts. (So the tool would need to be complete and very robust so its could be trusted to do everything correctly.)

Figuring out why featuredep is not available in a given build is now an exercise in figuring out what's going with that conditional.

Right, but if you don't allow that, then people will just take to generating the dependency files if they need this flexibility (thereby defeating the purpose of removing this flexibility).

With TriBITS, you can dump the dependencies at configure time so it is easy to see what dependencies are actually being defined if there is confusion. But this type of thing is very rare. But when it is needed, it is very useful.

I've found that using targets for all of this is way better.

But targets don't not change anything for how tests get run with MPI or CUDA or threads or avoid name conflicts for test names. You need a strong convention for how such tests are named, defined, and run. I don't trust software at all that builds if I don't see tests running and passing on the target platforms. It is possible to do that using just test support functions like tribits_add_test() and tribits_add_advanced_test(). You can use these in non-TriBITS CMake projects without the rest of the TriBITS framework stuff (and in fact I have used these functions in non-TriBITS projects several times).

Yes. If I'm using TriBITS, do I need to do something to wrap every dependency I use or can I just use packages which provide CMake targets?

With refactored TriBITS (#299), if the CMake software you are pulling in with add_subdirectory() adopts this standard for the target names then you should just be able to drop it in and go. But you would still need to define a Dependencies.cmake file so you would likely need to base TriBITS package directory with minimal base-level CMakeLists.txt and Dependencies.cmake files to make this work. But the wrapping base CMakeLists.txt file could be extremely minimal (could just be a call to add_subdirectory()).

Are any projects using VTK Modules other than the VTK project itself? Are there any simple examples of projects using VTK Modules?

https://gitlab.kitware.com/vtk/vtk/-/blob/master/Examples/Modules/UsingVTK/CMakeLists.txt

For a more complicated project that doesn't use VTK for its main bits, but uses VTK's module system for the VTK-interacting bits:

https://gitlab.kitware.com/cmb/smtk

No, I mean a simple CMake project that uses the VTK Module system but does not use VTK at all. Something small and fairly simple like:

but where TriBITS is stripped out and VTK Modules is used to control internal dependencies and what CMakeListx.txt files actually get processed in order. Does something like that exist?

My plan for #63 and #299 is to take the existing TribitsExampleProject and create several different versions using:

  • All raw CMake and manual interdependency logic to control what CMakeLists.txt files get processed (i.e. RawExampleProject) (This appears to what NGA is trying to scope out for Trilinos.)
  • Using refactored TriBITS just for the internal package interdependency logic but raw CMake for all build targets (i.e. TribitsExampleProjectRawTargets)
  • Using full-blown TriBITS (i.e. the current TribitsExampleProject with zero changes).

That will allow developers a clear apples-to-apples comparison of the different options and how well they scale (but this is still a really small example).

@bartlettroscoe
Copy link
Member Author

Dependencies.cmake is CMake code. I banned this in VTK because dependencies should not be subject to arbitrary logic.

@mathstuf, another point of reference on this but note the Spack allows the specification of dependencies to depend on runtime information from within the package's package.py file (see Conditional Dependencies). Now , Spack is trying to solve a more complex problem than either VTK Modules or TriBITS but the notion of allowing the runtime tweaking of dependencies is still relevant (to be used sparingly and in very rare cases).

@mathstuf
Copy link

Right, but if you don't allow that, then people will just take to generating the dependency files if they need this flexibility (thereby defeating the purpose of removing this flexibility).

I'm aware and that is…vaguely possible (the file would need to be generated in the source tree or the source tree copied to the binary tree for this to actually work). But at that point, someone should be asking themselves "why is this so difficult?" and discover the rationale. The main purpose of doing it in VTK is to not have the public API depend on various CMake configuration options. This makes it very hard for consumers of a module to know whether some API is available or not. It is far better to instead make a module that itself appears or disappears based on the flags. This is because if (TARGET) is way nicer for detecting "is this API available" rather than having to remember how the API is affected. This is why OPTIONAL_DEPENDS are forced to be private dependencies.

No, I mean a simple CMake project that uses the VTK Module system but does not use VTK at all

Well, the module system ships with VTK, so that's kind of difficult. But nothing stops that example project from not using any VTK:: targets and instead just doing whatever it wanted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Status: Done
Development

No branches or pull requests

2 participants