Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

onnxruntime: add with_cuda option #22557

Merged
merged 17 commits into from
Sep 4, 2024

Conversation

fdgStilla
Copy link
Contributor

Fix #22555

I started from this PR which is currently stale: #20392 and added the following changes:

Changes not directly related to cuda:

This was tested on Linux gcc11 and Windows msvc192 with and without the option.
Assuming you have a cuda package from the pre built libs provided by NVIDIA, this is the additional changes I had to do:

        # in both requirements and build_requirements
        if self.options.with_cuda:
            self.requires("cuda/11.4.2", transitive_headers=True)

        # in generate
        if self.options.with_cuda:
            tc.variables["onnxruntime_CUDA_HOME"] = self.dependencies["cuda"].package_folder.replace("\\", "/")
            tc.variables["onnxruntime_CUDNN_HOME"] = self.dependencies["cuda"].package_folder.replace("\\", "/")
            # cuda 11.4 does not support c++20
            tc.variables["CMAKE_CUDA_STANDARD"] = "17"

        # in self.cpp_info.requires
            "cuda::cuda"

@conan-center-bot

This comment has been minimized.

@ghost ghost mentioned this pull request Jan 26, 2024
3 tasks
@AbrilRBS AbrilRBS self-assigned this Jan 28, 2024
@Artalus
Copy link
Contributor

Artalus commented Jan 29, 2024

👋
We have forked the ORT recipe internally to support CUDA as well, so I will dare to leave a few comments on the topic from my experience 🙂

  • Your 1.16.1-0002-cuda-gsl.patch and 1.16.0-0002-cuda-gsl.patch are identical, so the .0 one can be applied in .1 part of conandata.yml as well. Also, I think the patch is valid for 1.16.2 and 1.16.3 too - it got proposed to upstream in fix reference to Microsoft.GSL::GSL in CMake build scripts when enabling cuda microsoft/onnxruntime#17843 and merged, but the change is not affecting the 1.16 "branch".

  • I added tc.variables["onnxruntime_USE_FLASH_ATTENTION"] = False in our fork, as by default (with it True) ORT will require the cutlass library, which is not in Conan.

  • In 1.16.3 (did not test on .1 and .2) we had a bunch of weird compilation errors in CUDA part of ORT sources:

/.conan/data/onnxruntime/1.16.3-kudan1/KdConanRecipes/pr-273/build/a790a3bbbda052f5d698683048ad7c0c4c286823/src/onnxruntime/contrib_ops/cuda/bert/packed_multihead_attention.cc:300:27: error: use 'template' keyword to treat 'GetScratchBuffer' as a dependent template name
  auto work_space = this->GetScratchBuffer<void>(workSpaceSize, context->GetComputeStream());
                          ^
                          template

On 1.15.1 with CUDA enabled we did not have this error. Might be a compiler mismatch of sorts; we use Clang-11, which was released in 2021. Would be nice if someone could verify this; maybe CCI should have patch for it as well.

  • For 1.15.1 I had to add this patch to get it building:
--- a/cmake/onnxruntime_providers.cmake
+++ b/cmake/onnxruntime_providers.cmake
@@ -479,7 +479,8 @@ if (onnxruntime_USE_CUDA)
     target_compile_options(onnxruntime_providers_cuda PRIVATE "$<$<COMPILE_LANGUAGE:CUDA>:SHELL:-Xcompiler /wd4127>")
   endif()

-  onnxruntime_add_include_to_target(onnxruntime_providers_cuda onnxruntime_common onnxruntime_framework onnx onnx_proto ${PROTOBUF_LIB} flatbuffers::flatbuffers)
+  onnxruntime_add_include_to_target(onnxruntime_providers_cuda onnxruntime_common onnxruntime_framework onnx_proto)
+  target_link_libraries(onnxruntime_providers_cuda PUBLIC onnx onnx_proto protobuf::protobuf flatbuffers::flatbuffers Eigen3::Eigen3)
   if (onnxruntime_ENABLE_TRAINING_OPS)
     onnxruntime_add_include_to_target(onnxruntime_providers_cuda onnxruntime_training)
     if (onnxruntime_ENABLE_TRAINING)
--- a/onnxruntime/contrib_ops/cuda/bert/attention.cc
+++ b/onnxruntime/contrib_ops/cuda/bert/attention.cc
@@ -164,7 +164,6 @@ Status Attention<T>::ComputeInternal(OpKernelContext* context) const {
                                         has_memory_efficient_attention(sm, sizeof(T) == 2);
 #else
   constexpr bool use_memory_efficient_attention = false;
-  ORT_UNUSED_VARIABLE(is_mask_1d_key_seq_len_start);
 #endif

   cublasHandle_t cublas = GetCublasHandle(context);

On Protobuf+Flatbuffers, I do not remember the exact errors, but I think their onnxruntime_add_include_to_target glitched with Conan targets somehow, and the include propagation was done by target_link_libraries() anyway.
On ORT_UNUSED_VARIABLE, it was a macro that got renamed at some point, and the "undefined name" error only manifests with the above FLASH_ATTENTION=False define - which is not default. This got fixed in 1.16 at https://github.com/microsoft/onnxruntime/pull/15983/files#diff-34ecf46f429b24ac811c0b67f45e8950f44c5ccd58719fd5dd82bbfc3a7d0cb5L167.

  • From what I understand, to be built with CUDA support, ORT requires the cuDNN library as well. It is rather tricky to get, as you require Nvidia Developer account to download the prebuilt binaries. I am unsure now what the behavior was without it, but I am fairly sure I created a cudnn Conan package locally not from a goodness of my heart. Did you manage to get ORT working without it?
  • Should the recipe be handling nvcc build requirement somehow? This is what I did in our fork:
    def build_requirements(self):
        # Required by upstream https://github.com/microsoft/onnxruntime/blob/v1.14.1/cmake/CMakeLists.txt#L5
        self.tool_requires("cmake/[>=3.24 <4]")
        self.tool_requires("ninja/1.11.1")

        # CUDA: cannot make CUDA a conan package, have to rely on system pre-configured
        if self.options.with_cuda and self.settings.os == 'Linux':
            if not shutil.which("nvcc"):
                self.output.error(
                    "Need CUDA toolkit! Here is how you can install it:\n"
                    "  wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.1-1_all.deb\n"
                    "  dpkg -i cuda-keyring_1.1-1_all.deb\n"
                    "  apt update\n"
                    "  apt install cuda-toolkit-11-8"
                )
                raise RuntimeError("nvcc not found; see logs")
            from subprocess import check_output
            output = check_output(["nvcc", "--version"]).decode()
            if not "release 11" in output:
                self.output.error(f"nvcc found but wrong version (11.8 recommended):\n\n{output}")
                raise RuntimeError("invalid nvcc version; see logs")
  • Finally, I found it is somewhat easier to make shared depend on with_cuda - the provider library is always built as shared even if you build everything else as static. I also noticed that libonnxruntime_providers_shared.so gets built even for shared=False && with_cuda=False, so I added this to def package():
        if not self.options.shared:
            rm(
                conanfile=self,
                pattern="*onnxruntime_providers_shared*",
                folder=os.path.join(self.package_folder, "lib"),
            )

@fdgStilla
Copy link
Contributor Author

fdgStilla commented Jan 29, 2024

I forgot to mention that I only tested onnxruntime 1.14.1, I guess I should try with this other versions and implement you changes if I get the same errors with my compiler.

To answer some of your questions:

  • I did not have to use cutlass
  • I included cuDNN in my custom cuda conan package
  • I also included nvcc in the bin folder of my custom cuda conan package, which is found when we add this dependency in the tool_requires section.
  • Indeed the provider libraries are always built as dll as described in their documention:

The onnxruntime code will look for the provider shared libraries in the same location as the onnxruntime shared library is (or the executable statically linked to the static library version).

https://onnxruntime.ai/docs/build/eps.html#loading-the-shared-providers, and the static build + shared providers is a well supported configuration so I think we should not add a limitation.

  • libonnxruntime_providers_shared.so is indeed always built but we shall check upstream if the cmake shall be modified to remove it from the build if no provider is built.

@fdgStilla
Copy link
Contributor Author

I updated the MR to be able to build 1.15.1 with cuda. However I cannot build the 1.16.x because I have only a cuda 11.4 configuration, which apparently is too old to work with VS2022 (required for 1.16.x).

But maybe this first version is enough and further adjustments can be in another MR (they still can be built without cuda like before)?

@czoido czoido assigned czoido and unassigned AbrilRBS Sep 2, 2024
@conan-center-bot

This comment has been minimized.

@conan-center-bot

This comment has been minimized.

@conan-center-bot

This comment has been minimized.

@conan-center-bot

This comment has been minimized.

@conan-center-bot

This comment has been minimized.

@conan-center-bot
Copy link
Collaborator

Conan v1 pipeline ✔️

All green in build 6 (39e32ea14433ea5705d2fabfff0884630f48b954):

  • onnxruntime/1.16.3:
    All packages built successfully! (All logs)

  • onnxruntime/1.18.1:
    All packages built successfully! (All logs)

  • onnxruntime/1.14.1:
    All packages built successfully! (All logs)

  • onnxruntime/1.17.3:
    All packages built successfully! (All logs)

  • onnxruntime/1.15.1:
    All packages built successfully! (All logs)


Conan v2 pipeline ✔️

Note: Conan v2 builds are now mandatory. Please read our discussion about it.

All green in build 6 (39e32ea14433ea5705d2fabfff0884630f48b954):

  • onnxruntime/1.17.3:
    All packages built successfully! (All logs)

  • onnxruntime/1.18.1:
    All packages built successfully! (All logs)

  • onnxruntime/1.16.3:
    All packages built successfully! (All logs)

  • onnxruntime/1.15.1:
    All packages built successfully! (All logs)

  • onnxruntime/1.14.1:
    All packages built successfully! (All logs)

@conan-center-bot conan-center-bot merged commit db6b6f8 into conan-io:master Sep 4, 2024
28 checks passed
OMGtechy pushed a commit to OMGtechy/conan-center-index that referenced this pull request Dec 31, 2024
* Add with_cuda option

* Require static registration from onnx disabled

* Add some transitive headers

* Test CUDA in the test_package

* Add 1.15.1 patch

* wip

* add comment

* wip

* wip

* wip

* remove unused patch

* better patching

* remove unused patch

* fix

* copy dlls via cmake in v1

* check if win

---------

Co-authored-by: czoido <mrgalleta@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[onnxruntime] request to be able to use cuda [package] onnxruntime/1.14.1: This is an invalid model
5 participants