Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using sycl::info::kernel_device_specific::max_sub_group_size gives undefined references #5071

Closed
masterleinad opened this issue Dec 2, 2021 · 16 comments
Labels
bug Something isn't working confirmed

Comments

@masterleinad
Copy link
Contributor

Describe the bug
Trying to use sycl::info::kernel_device_specific::max_sub_group_size gives undefined references.

To Reproduce
Compiling

#include <CL/sycl.hpp>

template <typename Functor>
struct Dummy {
  void operator()(sycl::id<1>) const {m_functor();}
  Dummy(Functor functor) : m_functor(functor) {}
private:
  Functor m_functor;
};

struct Foo
{
  void operator()()const{}
};

int main() {
  sycl::queue q;

  auto lambda = [](){};

//  using Functor = Dummy<Foo>;
//  Functor functor(Foo{});
  using Functor = Dummy<decltype(lambda)>;
  Functor functor(lambda);

  sycl::kernel_id id = sycl::get_kernel_id<Functor>();
  auto kernel = sycl::get_kernel_bundle<sycl::bundle_state::executable>(
    q.get_context(), std::vector{id}).get_kernel(id);

  auto num_regs = kernel.template get_info<
      sycl::info::kernel_device_specific::max_sub_group_size>(q.get_device());
  std::cout << num_regs << std::endl;

  q.submit([&] (cl::sycl::handler& cgh) {
    cgh.parallel_for(10, functor);
  });
}

via icpx -fsycl bla.cc gives

/usr/bin/ld: /tmp/dummy-993ed1.o: in function `main':
dummy-02d54d.cpp:(.text+0x760): undefined reference to `cl::sycl::info::param_traits<cl::sycl::info::kernel_device_specific, (cl::sycl::info::kernel_device_specific)8243>::return_type cl::sycl::kernel::get_info<(cl::sycl::info::kernel_device_specific)8243>(cl::sycl::device const&) const'
clang++: error: linker command failed with exit code 1 (use -v to see invocation)

Environment (please complete the following information):

  • OS: Linux
  • Target device and vendor: Intel GPU
  • DPC++ version: Nightly build from 2021/20/25
@masterleinad
Copy link
Contributor Author

kokkos/kokkos#4579 is related.

@bader
Copy link
Contributor

bader commented Dec 2, 2021

Interesting... this sample compiles with the latest nightly build, but I think this sample uses non-forward declarable kernel type name and I would expect the compiler to give an error here.

  using Functor = Dummy<decltype(lambda)>;
  Functor functor(lambda);
    cgh.parallel_for(10, functor);

For this invoking method we use a function type as a kernel type name and it seems to violate forward-declarable requirement.

@erichkeane, do you expect the compiler to report an error here?

@erichkeane
Copy link
Contributor

That is an 'unnamed kernel', which allows non-forward declarable type names, right? Or am I missing something in your example?

@bader
Copy link
Contributor

bader commented Dec 3, 2021

Compiler gives me my error with -fno-sycl-unnamed-lambda option. I forgot that it's now enabled by default.

The linker error is due to the bug in the provided reproducer.

  auto num_regs = kernel.template get_info<
      sycl::info::kernel_device_specific::max_sub_group_size>(q.get_device(), /* WOGK-GROUP SIZE IS MISSING */);

This API requires work-group size as input parameter.

See https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/SubGroup/SYCL_INTEL_sub_group.asciidoc#sub-group-queries

@bader bader added the invalid This doesn't seem right label Dec 3, 2021
@TApplencourt
Copy link
Contributor

TApplencourt commented Dec 3, 2021

Oh, "funny". The spec is far less clear https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#table.kernel.devicespecificinfo that this requires an input argument.

@erichkeane
Copy link
Contributor

It would seem to me that we'd want an error in this case, right? How come we don't validate the right parameters list there? I would expect a specialization of get_info to do that (or attempt to call something variadically that would otherwise fail...).

@masterleinad
Copy link
Contributor Author

Oh, "funny". The spec is far less clear https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#table.kernel.devicespecificinfo that this requires an input argument.

Yes, I was looking at the standard for the correct syntax. Apparently, the documentation was changed a while ago to require an additional sycl::range parameter, see 010f112#diff-093059fa654be6feaf5ae221be16698947a39b5170d10e9957a31f7b1649f39f.

@Pennycook
Copy link
Contributor

I think there are two problems here:

  1. We only define an overload with sycl::range<3> as an input type, when the extension spec says that 1- and 2-dimensional ranges should also be accepted.

  2. The extension specification and SYCL 2020 don't agree on whether the additional sycl::range parameter is required.

@brianwhitney59
Copy link

brianwhitney59 commented Dec 3, 2021

Making a small change to the code above, dealing with the missing parameter and removing the q.submit

#include <CL/sycl.hpp>
  
template <typename Functor>
struct Dummy {
  void operator()(sycl::id<1>) const {m_functor();}
  Dummy(Functor functor) : m_functor(functor) {}
private:
  Functor m_functor;
};

struct Foo
{
  void operator()()const{}
};

int main() {
  sycl::queue q;

  auto lambda = [](){};

  using Functor = Dummy<decltype(lambda)>;
  Functor functor(lambda);

  sycl::kernel_id id = sycl::get_kernel_id<Functor>();
  auto kernel = sycl::get_kernel_bundle<sycl::bundle_state::executable>(
    q.get_context(), std::vector{id}).get_kernel(id);

  auto num_regs = kernel.template get_info<
      sycl::info::kernel_device_specific::max_sub_group_size>(q.get_device(), sycl::range<3>{1, 1, 32});
  std::cout << num_regs << std::endl;
}

This returns a runtime error. Putting the q.submit back it and it runs on Gen9.

+ dpcpp bundle.cpp
+ ./a.out
terminate called after throwing an instance of 'cl::sycl::runtime_error'
  what():  No kernel found with the specified name -46 (CL_INVALID_KERNEL_NAME)
./build.sh: line 11: 20719 Aborted                 (core dumped) ./a.out

@bader
Copy link
Contributor

bader commented Dec 3, 2021

Removing q.submit removes kernel from the program, so this is expected.

@TApplencourt
Copy link
Contributor

TApplencourt commented Dec 9, 2021

Sorry if it's a stupid question (my comprehension of kernel_bundle is really limited), but my understanding is that one of the design goals of SYCL is that SYCL can be implemented as a library-only implementation. If this is the case, how can commenting q.submit who is after get info can change the behavior of the get info call?

@bader
Copy link
Contributor

bader commented Dec 9, 2021

According to understanding, library-only implementation of APIs introspecting device program/kernel/kernel_bundle either must throw an exception or return some pre-defined value. They can provide meaningful information only with help from the device compiler.

BTW, oneAPI DPC++ implementation is not library-only implementation.
I suggest confirming with ComputeCPP. Tagging @AerialMantis.

@romanovvlad
Copy link
Contributor

@masterleinad Could you please tell if the issue is resolved?

@romanovvlad romanovvlad self-assigned this Feb 8, 2022
@masterleinad
Copy link
Contributor Author

With the latest release

$ icpx --version
Intel(R) oneAPI DPC++/C++ Compiler 2022.0.0 (2022.0.0.20211123)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/6da/intel/oneapi/compiler/2022.0.1/linux/bin-llvm

I still get

/tmp/bla-6e2249.o: In function `main':
bla-830f02.cpp:(.text+0x750): undefined reference to `cl::sycl::info::param_traits<cl::sycl::info::kernel_device_specific, (cl::sycl::info::kernel_device_specific)8243>::return_type cl::sycl::kernel::get_info<(cl::sycl::info::kernel_device_specific)8243>(cl::sycl::device const&) const'
clang++: error: linker command failed with exit code 1 (use -v to see invocation)

and no proper error message saying that this call requires the workgroup size as a parameter. Thus, it still seems to not conform to the SYCL standard which says in https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#_queries:

template <typename Param>
typename Param::return_type get_info(const device &dev) const;

Preconditions: The Param must be one of the info::kernel_device_specific descriptors defined in [Table 133](https://www.khronos.org/registry/SYCL/specs/sycl-2020/html/sycl-2020.html#table.kernel.devicespecificinfo), and the type alias Param::return_type must be defined in accordance with that table.

Returns: Information about the kernel that applies when the kernel is invoked on the device dev.

Throws:

    An exception with the errc::invalid error code if the kernel is not compatible with device dev (as defined by is_compatible()).

@romanovvlad romanovvlad removed their assignment Mar 23, 2022
@romanovvlad romanovvlad removed the invalid This doesn't seem right label Mar 23, 2022
@maarquitos14
Copy link
Contributor

@masterleinad could you please try again with the latest release? I have just tried myself and I can't reproduce the error.

@masterleinad
Copy link
Contributor Author

I can confirm that this is resolved. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working confirmed
Projects
None yet
Development

No branches or pull requests

8 participants