thrust::reduce_by_key error for 2^31 elements #1609

seunghwak · 2022-01-27T21:22:33Z

Tested environment
GA102 (48 GB GPU memory)
CUDA 11.4
Thrust 1.15.0

The code below fails inside the 'reduce_by_key' call with a memory allocation failure: it tries to allocate 18446744073694639872 bytes (about 16 EB), which cannot succeed on a GPU with only 48 GB of memory. The requested size is just below 2^64, so this looks like a bug caused by 32-bit signed integer overflow.

cudaDeviceSynchronize();
std::cout << "test thrust reduce_by_key START" << std::endl;
constexpr size_t num_elements = std::size_t{1} << 31;
constexpr int32_t max_key = 8;
thrust::device_vector<int32_t> int_values(num_elements);
thrust::tabulate(thrust::device, int_values.begin(), int_values.end(), [max_key]__device__(auto i) {
  return static_cast<int32_t>(i % max_key);
});
thrust::sort(thrust::device, int_values.begin(), int_values.end());
thrust::device_vector<int32_t> keys(max_key);
thrust::device_vector<size_t> values(max_key);
thrust::reduce_by_key(thrust::device, int_values.begin(), int_values.end(), thrust::make_constant_iterator(size_t{1}), keys.data(), values.data());
cudaDeviceSynchronize();
std::cout << "test thrust reduce_by_key END" << std::endl;

gevtushenko · 2022-04-26T16:33:12Z

I can confirm the issue. Here's a reproducer that doesn't require a GPU with 48 GB of memory:

#include <thrust/device_vector.h>
#include <thrust/execution_policy.h>
#include <thrust/tabulate.h>
#include <thrust/reduce.h>
#include <thrust/sort.h>
#include <thrust/iterator/constant_iterator.h>

#include <cstddef>

int main() {
  constexpr size_t num_elements = std::size_t{1} << 31;
  constexpr char max_key = 8;
  thrust::device_vector<char> int_values(num_elements);
  thrust::tabulate(thrust::device, int_values.begin(), int_values.end(),
                   [max_key] __device__(auto i) {
                     return static_cast<char>(i % max_key);
                   });
  thrust::sort(thrust::device, int_values.begin(), int_values.end());
  thrust::device_vector<char> keys(max_key);
  thrust::device_vector<char> values(max_key);
  thrust::reduce_by_key(thrust::device, int_values.begin(), int_values.end(),
                        thrust::make_constant_iterator(char{1}), keys.data(),
                        values.data());
}

The issue is related to the following cast. I'll create a PR with a fix soon.

alliepiper assigned gevtushenko Jan 27, 2022

alliepiper added the type: bug: functional (Does not work as intended.) and P1: should have (Necessary, but not critical.) labels Jan 27, 2022

alliepiper added this to the 1.17.0 milestone Jan 27, 2022

alliepiper added the backend: CUDA Related to the CUDA backend label Apr 25, 2022

gevtushenko mentioned this issue Apr 27, 2022

Fix thrust::reduce_by_key for 2^31 elements #1671

Merged

alliepiper linked a pull request Apr 27, 2022 that will close this issue

Fix thrust::reduce_by_key for 2^31 elements #1671

Merged

gevtushenko closed this as completed in #1671 Apr 29, 2022
