Skip to content

Commit

Permalink
Merge remote-tracking branch 'github/sycl' into ur_cmdbuf_enqueue
Browse files Browse the repository at this point in the history
  • Loading branch information
EwanC committed Feb 11, 2025
2 parents b406585 + 36ce10e commit b504a50
Show file tree
Hide file tree
Showing 109 changed files with 2,102 additions and 1,841 deletions.
5 changes: 5 additions & 0 deletions .github/workflows/sycl-post-commit.yml
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,11 @@ jobs:
runner: '["Linux", "arc"]'
extra_lit_opts: --param matrix-xmx8=True
reset_intel_gpu: true
- name: Intel Battlemage Graphics with Level Zero
runner: '["Linux", "bmg"]'
target_devices: level_zero:gpu
# The new Xe kernel driver used by BMG doesn't support resetting.
reset_intel_gpu: false
- name: AMD/HIP
runner: '["Linux", "amdgpu"]'
image_options: -u 1001 --device=/dev/dri --device=/dev/kfd
Expand Down
8 changes: 0 additions & 8 deletions clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2362,14 +2362,6 @@ Expected<SmallVector<StringRef>> linkAndWrapDeviceFiles(
HasNonSYCLOffloadKinds = true;
}

// Write any remaining device inputs to an output file.
SmallVector<StringRef> InputFiles;
for (const OffloadFile &File : Input) {
auto FileNameOrErr = writeOffloadFile(File);
if (!FileNameOrErr)
return FileNameOrErr.takeError();
InputFiles.emplace_back(*FileNameOrErr);
}
if (HasSYCLOffloadKind) {
SmallVector<StringRef> InputFiles;
// Write device inputs to an output file for the linker.
Expand Down
8 changes: 4 additions & 4 deletions devops/dependencies-igc-dev.json
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
{
"linux": {
"igc_dev": {
"github_tag": "igc-dev-61b96b3",
"version": "61b96b3",
"updated_at": "2025-01-15T17:43:30Z",
"url": "https://api.github.com/repos/intel/intel-graphics-compiler/actions/artifacts/2435370337/zip",
"github_tag": "igc-dev-4cc8dff",
"version": "4cc8dff",
"updated_at": "2025-02-10T10:27:30Z",
"url": "https://api.github.com/repos/intel/intel-graphics-compiler/actions/artifacts/2564401848/zip",
"root": "{DEPS_ROOT}/opencl/runtime/linux/oclgpu"
}
}
Expand Down
2 changes: 1 addition & 1 deletion devops/scripts/install_build_tools.sh
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,6 @@ apt update && apt install -yqq \
python3-psutil \
python-is-python3 \
python3-pip \
zstd \
ocl-icd-opencl-dev \
vim \
libffi-dev \
Expand All @@ -21,6 +20,7 @@ apt update && apt install -yqq \
zstd \
zip \
unzip \
pigz \
jq \
curl \
libhwloc-dev \
Expand Down
2 changes: 1 addition & 1 deletion devops/scripts/update_drivers.py
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ def uplift_linux_igfx_driver(config, platform_tag, igc_dev_only):
config[platform_tag]["igc_dev"]["version"] = igcdevver
config[platform_tag]["igc_dev"]["updated_at"] = igc_dev["updated_at"]
config[platform_tag]["igc_dev"]["url"] = get_artifacts_download_url(
"intel/intel-graphics-compiler", "IGC_Ubuntu22.04_llvm14_clang-" + igcdevver
"intel/intel-graphics-compiler", "IGC_Ubuntu24.04_llvm14_clang-" + igcdevver
)
return config

Expand Down
48 changes: 48 additions & 0 deletions sycl/doc/developer/KHRExtensions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
# Considerations for working on KHR extensions

SYCL specification evolves through embedding extensions developed by various
vendors, including Khronos Group itself (`khr` extensions).

In order for a KHR extension to be accepted, there must be CTS tests for it and
at least one implementation which passes them.

Considering that KHR extensions are being developed in public, we can start
prototyping them as soon as corresponding PR for an extension is published at
KhronosGroup/SYCL-Docs.

However, we shouldn't be exposing those extensions to end users until the
extension if finalised, ratified and published by Khronos - due to risk of an
extension changing during that process and lack of the officially published
version of it.

So, we can have a PR but can't merge it. Keeping PRs opened for a long time is a
bad practice, because they tend to get stale: there are merge conflicts,
potential functional issues due to the codebase changes, etc.

In order for us to avoid stale PRs, all functionality which is a public
interface of an "in-progress" KHR extension, must be hidden under
`__DPCPP_ENABLE_UNFINISHED_KHR_EXTENSIONS` macro. That way we can merge a PR to
avoid constantly maintaining it in a good shape, start automatically testing it
but at the same time avoid exposing incomplete and/or undocumented feature to
end users just yet.

"in-progress" KHR extension term used above is defined as:
- PR proposing a KHR extension has not been merged/cherry-picked to `sycl-2020`
branch of KhronosGroup/SYCL-Docs.

That only happens after all formal processes on Khronos Group side are
completed so an extension can be considered good and stable to be released by
us.

Note: merge of an extension proposal PR into `main` branch of
KhronosGroup/SYCL-Docs repo is **not** enough.
- Published (i.e. the above bullet complete) KHR extension, which hasn't been
fully implemented by us

The macro is **not** intended to be used by end users and its purpose is to
simplify our development process by allowing us to merge implementation (full
or partial) of the aforementioned extensions earlier to simplify maintenance and
enable automated testing.

Due to this reason, we are not providing a separate macro for each "in-progress"
KHR extension we may (partially) support, but just a single guard.
Original file line number Diff line number Diff line change
Expand Up @@ -101,11 +101,13 @@ in the group.
and default constructible.
* `Properties` is an instance of `sycl::ext::oneapi::experimental::properties`

_Mandates_: If `Properties` contains the `alignment` property, `InputIteratorT` must be a pointer.

_Effects_: Loads single element from `in_iter` to `out` by using the `g` group
object to identify memory location as `in_iter` + `g.get_local_linear_id()`.

Properties may provide xref:optimization_properties[assertions] which can
enable better optimizations.
Properties may provide xref:optimization_properties[assertions] or the `alignment` property
which can enable better optimizations.

==== `sycl::vec` Overload

Expand All @@ -132,6 +134,8 @@ in the group.
and default constructible.
* `Properties` is an instance of `sycl::ext::oneapi::experimental::properties`

_Mandates_: If `Properties` contains the `alignment` property, `InputIteratorT` must be a pointer.

_Effects_: Loads `N` elements from `in_iter` to `out`
using the `g` group object.
Properties may specify xref:data_placement[data placement].
Expand All @@ -140,8 +144,9 @@ Default data placement is a blocked one:
in striped case:
`out[i]` = `in_iter[g.get_local_linear_id() + g.get_local_linear_range() * i];`
for `i` between `0` and `N`.
Properties may also provide xref:optimization_properties[assertions] which can
enable better optimizations.
Properties may also provide xref:optimization_properties[assertions] or the `alignment` property
which can enable better optimizations.


==== Fixed-size Array Overload

Expand Down Expand Up @@ -169,6 +174,8 @@ work-group or sub-group.
and default constructible.
* `Properties` is an instance of `sycl::ext::oneapi::experimental::properties`

_Mandates_: If `Properties` contains the `alignment` property, `InputIteratorT` must be a pointer.

_Effects_: Loads `ElementsPerWorkItem` elements from `in_iter` to `out`
using the `g` group object.
Properties may specify xref:data_placement[data placement].
Expand All @@ -177,8 +184,9 @@ Default placement is a blocked one:
in striped case:
`out[i]` = `in_iter[g.get_local_linear_id() + g.get_local_linear_range() * i];`
for `i` between `0` and `ElementsPerWorkItem`.
Properties may also provide xref:optimization_properties[assertions] which can
enable better optimizations.
Properties may also provide xref:optimization_properties[assertions] or the `alignment` property
which can enable better optimizations.



=== Store API
Expand Down Expand Up @@ -209,11 +217,13 @@ in the group.
and default constructible.
* `Properties` is an instance of `sycl::ext::oneapi::experimental::properties`

_Mandates_: If `Properties` contains the `alignment` property, `OutputIteratorT` must be a pointer.

_Effects_: Stores single element `in` to `out_iter` by using the `g` group
object to identify memory location as `out_iter` + `g.get_local_linear_id()`

Properties may provide xref:optimization_properties[assertions] which can
enable better optimizations.
Properties may provide xref:optimization_properties[assertions] or the `alignment` property
which can enable better optimizations.


==== `sycl::vec` Overload
Expand Down Expand Up @@ -241,6 +251,8 @@ in the group.
and default constructible.
* `Properties` is an instance of `sycl::ext::oneapi::experimental::properties`

_Mandates_: If `Properties` contains the `alignment` property, `OutputIteratorT` must be a pointer.

_Effects_: Stores `N` elements from `in` vec to `out_iter`
using the `g` group object.
Properties may specify xref:data_placement[data placement].
Expand All @@ -249,8 +261,8 @@ Default placement is a blocked one:
in striped case:
`out_iter[g.get_local_linear_id() + g.get_local_linear_range() * i]` = `in[i];`
for `i` between `0` and `N`.
Properties may also provide xref:optimization_properties[assertions] which can
enable better optimizations.
Properties may also provide xref:optimization_properties[assertions] or the `alignment` property
which can enable better optimizations.


==== Fixed-size Array Overload
Expand Down Expand Up @@ -280,6 +292,8 @@ work-group or sub-group.
and default constructible.
* `Properties` is an instance of `sycl::ext::oneapi::experimental::properties`

_Mandates_: If `Properties` contains the `alignment` property, `OutputIteratorT` must be a pointer.

_Effects_: Stores `ElementsPerWorkItem` elements from `in` span to `out_iter`
using the `g` group object.

Expand All @@ -289,8 +303,9 @@ Default placement is a blocked one:
in striped case:
`out_iter[g.get_local_linear_id() + g.get_local_linear_range() * i]` = `in[i];`
for `i` between `0` and `ItemsPerWorkItem`.
Properties may also provide xref:optimization_properties[assertions] which can
enable better optimizations.
Properties may also provide xref:optimization_properties[assertions] or the `alignment` property
which can enable better optimizations.


=== Data Placement

Expand Down Expand Up @@ -442,6 +457,23 @@ so the implementation can rely on `get_max_local_range()` range size:

If partition is uneven the behavior is undefined.

== Alignment

If `InputIteratorT`/`OutputIteratorT` is a pointer then the following property can be used
to provide an alignment of the pointer. It can allow to avoid dynamic alignment check.

```c++
namespace sycl::ext::oneapi::experimental {
struct alignment_key {
template <int K>
using value_t = property_value<alignment_key, std::integral_constant<int, K>>;
};

template<int K>
inline constexpr alignment_key::value_t<K> alignment;
} // namespace sycl::ext::oneapi::experimental
```

== Usage Example

Example shows the simplest case without local memory usage of blocked load
Expand All @@ -458,8 +490,8 @@ constexpr std::size_t block_count = 2;
constexpr std::size_t size = block_count * block_size * items_per_thread;
sycl::queue q;
T* input = sycl::malloc_device<T>(size, q);
T* output = sycl::malloc_device<T>(size, q);
T* input = sycl::aligned_alloc_device<T>(16, size, q);
T* output = sycl::aligned_alloc_device<T>(16, size, q);
q.submit([&](sycl::handler& cgh) {
cgh.parallel_for(
Expand All @@ -472,7 +504,7 @@ q.submit([&](sycl::handler& cgh) {
auto offset = g.get_group_id(0) * g.get_local_range(0) *
items_per_thread;
auto props = sycl_exp::properties{sycl_exp::contiguous_memory};
auto props = sycl_exp::properties{sycl_exp::contiguous_memory, sycl_exp::alignment<16>};
sycl_exp::group_load(g, input + offset, sycl::span{ data }, props);
Expand Down
1 change: 1 addition & 0 deletions sycl/doc/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -66,3 +66,4 @@ Developer Documentation
developer/DockerBKMs
developer/ABIPolicyGuide
developer/ContributeToDPCPP
developer/KHRExtensions
73 changes: 0 additions & 73 deletions sycl/include/sycl/builtins_utils_scalar.hpp

This file was deleted.

Loading

0 comments on commit b504a50

Please sign in to comment.