Squashed 'thirdParty/mallocMC/' changes from e2533d1410..739236e9b4

739236e9b4 creating a single unit test executable if CUDA version is 10.2 or higher
6cc52008a3 moved test main into separate library
d15d8b8f54 preventing nvcc from swallowing catch2 header for generating main()
fef23d6787 simplified specifying workdivs in 2/3D case
b2c6a09b74 removed quotes around compile options
476d08a745 renamed devAllocatorStorage into devAllocatorBuffer, since it's a buffer
ec05ebd90a made the AlpakaAcc type a template parameter of DeviceAllocator's and AllocatorHandleImpl's methods instead of the classes themselves
bd15540238
  * dropped support for HCC
  * rearranged CUDA_ARCH and HIP macros as requested by psychocoderHPC
4aa77ae193 building tests as well
6e6accef39 fixed printf format specifiers
cd97fe83c8 Explicitly specifying return types of alpaka min/max via decltype
15218c0010
  * dropped all ReservePoolPolicies
  * reserving initial memory pool via alpaka buffer (which depends on the device with which the Allocator is constructed)
28f7bddd0d transformed dimensions.cpp tests into Catch2 unit tests
a143ca8394 removed example2 from the examples target
f0932e8040
  * fixed specification of error encountered with alpaka min/max
  * applying fix only for MSVC
5921f83da4
  * changed accelerators in examples and verify_heap_config.hpp back to CUDA
  * using Malloc<> reserve pool policy everywhere
df9ccf0cf9 improved detection for when compiling with HIP device compiler
f4b653514a modified tests to also have a grid size greater than 1
d248569246
  * fixed remaining atomic intrinsic in XMallocSIMD
  * properly detecting which GPU backends are available in alpaka
  * added accelerator name to test output
  * testing also AccCpuThreads and AccCpuOmp2Threads
  * added a fix in alpaka to make MSVC compile
96aac162f6 parameterizing tests with accelerator to use
3f1428352e fixed test thread block size and added static check for this
f5e90ec4f9 fixed a few warnings
7bfdc48e1c
  * switching to XMallocSIMD for dimensions tests
  * added Malloc<Acc> reservePoolPolicy, which uses SimpleCudaMalloc or SimpleMalloc depending on the used accelerator
1ded9068a4 changed alignment policies to work with size_t instead of uint32_t
3f62ab34ca allowing XMallocSIMD to compile without CUDA as well
40ed90e771 fixes for compiling with nvcc again
4b017be62f
  * buffer Dim does not need to be the same as Acc Dim
  * handling different dimensions of Acc
  * added a test executable
  * added simple test with 1-3D Acc
eb21e925f2
  * integrated some changes from hipifycation in alpaka
  * replaced mallocMC CUDA macros by alpaka macros, removed mallocMC_prefixes.hpp
  * replaced all CUDA kernel invocations by alpaka kernel enqueues
  * removed all code that targeted CUDA < 9
  * merged example02 into example01 since they are almost the same
  * inlined content of mallocMC_example01_config.hpp
  * ported kernel invocations to alpaka
  * replaced cuda allocation routines by alpaka
  * renamed .cu source files to .cpp
  * reworked CMakeLists.txt (removed all CUDA stuff, removed big block comments, ...)
  * added new ReservePoolPolicies SimpleMalloc, intended for running allocator in host memory
  * passing Alpaka Accelerator through almost all device functions
  * replaced all atomic operations by alpaka atomics
  * replaced all CUDA intrinsics by custom implementations in mallocMC_utils.hpp, which default to the intrinsics of the corresponding platform or a default CPU implementation
  * tried to #ifdef some CUDA thread sync primitives
  * replaced CUDA thread IDs with alpaka indices and workdivs
  * replaced __shared__ memory by alpaka shared allocVar
  * SimpleCudaMalloc and XMallocSIMD are not available, when CUDA is not available, because they are too hard to port for now
  * refactored thread indexing
  * incorporating changes from psychocoderHPC from: alpaka-group/mallocMC@dev...psychocoderHPC:topic-hip-port
  * added a target mallocMCIde to CMakeLists.txt, so developers can browse the code in IDEs
  * setting compiler warnings via a warnings target, instead of global CMAKE_CXX_FLAGS
  * setting include directories on targets instead of globally
  * removed check for CUDA compute capability, since capability 3 is required since CUDA 9
  * removed cudaSetDeviceFlags, as it's not needed
a2ed3ae927 Merge pull request ComputationalRadiationPhysics#179 from bernhardmgruber/catch
8dd82718e9 added 3rd party catch.hpp and made CMake find it
67bdc1b598 Merge pull request ComputationalRadiationPhysics#176 from bernhardmgruber/addAlpaka
5dd1d036c4 switching to C++14
2bb2f3e6d9
  * using alpaka from added git subtree
ea73178789 Merge commit '90bb1ebc63d8281718381494e1d91733ac79c405' as 'alpaka'
90bb1ebc63 Squashed 'alpaka/' content from commit a5a8277cd
e0be7743ea Merge pull request ComputationalRadiationPhysics#178 from psychocoderHPC/fix-travisCmakeUsage2
28b81447f2 fix used cmake version
be77c4c606 Merge pull request ComputationalRadiationPhysics#174 from psychocoderHPC/topic-updateCmakeTo3.15.0
90d2841972 update cmake to 3.15.0
7a3d1cebde Merge pull request ComputationalRadiationPhysics#172 from bernhardmgruber/format
0185438323 added CONTRIBUTING.md with instructions how to use clang-format
4a416a2eca formatting (after clang-tidy)
8aec2cbf3f using trailing return types
5661a80fc3 formatting
669443d558
  * setting column limit and allowing short loops
  * regrouping includes
edc10db7c4 added .clang-format file
8c5a8617d1 Merge pull request ComputationalRadiationPhysics#171 from bernhardmgruber/cleaning
95c3223f2c replaced remaining typedefs by using directives
d008ddba6b added .vs and build folders to ignores
5c16da78a6 applied clang-tidy
6404efdf61
  * added a custom target for mallocMC headers
  * added header files to projects
43754615dd Merge pull request ComputationalRadiationPhysics#169 from bernhardmgruber/cleaning
28561514df
  * requiring only C++11
  * removed a TODO
507111408b a little modernization of the CMakeLists
  * using CUDA via project language instead of deprecated find_package
  * setting CUDA standard to have C++14 inside CUDA as well
73e21de40d
  * removed check that pagesize is unsigned
  * made pagesizes signed literals again
25e0de3459 renamed variables with 2 leading underscores
240d4ea634 Suggested during review
d37e9ed21e addressed review comments
dafc9b7940
  * replaced usage of boost::mpl by static constexpr members
  * dropped dependency on boost
4962156bf4 some cleanup
  * requiring C++14
  * using cstdint instead of boost/cstdint.hpp
  * using std::tuple instead of boost::tuple
  * using nullptr
  * using static_assert
  * using constexpr
  * adding const and static where appropriate
  * removed a few empty lines
  * replaced std::endl by \n where flush was probably not intended
e383f3cd89 Merge pull request ComputationalRadiationPhysics#170 from ax3l/topic-ciBionic
42aed7eafb Travis CI: GCC 5.5.0 + CUDA 9.1.85
36cb7f9f0c Merge pull request ComputationalRadiationPhysics#165 from sbastrakov/topic-nvccComputeCapabilityGuard
d911d0cbb9 Add a guard around COMPUTE_CAPABILITY cmake variable
eff012d664 Merge pull request ComputationalRadiationPhysics#161 from sbastrakov/topic-cudaDeviceGetArrribute
efd20bce5b Merge pull request ComputationalRadiationPhysics#164 from sbastrakov/fix-nvccComputeCapability
450c73d3a7 Choose the value for the -arch nvcc flag depending on CUDA version
ce377f18e7 Use cudaDeviceGetAttribute() for querying the compute capability

git-subtree-dir: thirdParty/mallocMC
git-subtree-split: 739236e9b44efd810f2eaad0fcf1313222a4d763