Create executor in exec_map only if requested. #602

Merged
tcojean merged 4 commits into ginkgo-project:develop on Aug 7, 2020

Collaborator

@greole greole commented Jul 28, 2020

Fixes #594. Creates an executor only if it is requested, by storing lambda functions inside the exec_map instead of already constructed executors.
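The idea described above can be sketched as follows. This is a minimal illustration, not the code from this PR: `Executor`, `make_executor`, `creation_count`, `make_exec_map`, and `create_requested` are hypothetical stand-ins so that the example compiles without Ginkgo. The point is that the map values are factories (lambdas), so nothing is constructed until one entry is looked up and called.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for an executor type; this is NOT Ginkgo's class.
struct Executor {
    std::string name;
};

// Counts how many executors were actually constructed (for illustration).
inline int& creation_count()
{
    static int count = 0;
    return count;
}

// Hypothetical helper that "creates" an executor and records the creation.
std::shared_ptr<Executor> make_executor(const std::string& name)
{
    ++creation_count();
    return std::make_shared<Executor>(Executor{name});
}

using ExecFactory = std::function<std::shared_ptr<Executor>()>;

// Each entry stores a factory, so building the map constructs no executor.
std::map<std::string, ExecFactory> make_exec_map()
{
    return {
        {"omp", [] { return make_executor("omp"); }},
        {"cuda", [] { return make_executor("cuda"); }},
        {"hip", [] { return make_executor("hip"); }},
        {"reference", [] { return make_executor("reference"); }},
    };
}

// Looks up the requested name and invokes only that one factory.
// Throws std::out_of_range if the name is not in the map.
std::shared_ptr<Executor> create_requested(const std::string& name)
{
    const auto exec_map = make_exec_map();
    return exec_map.at(name)();
}
```

With a map of ready-made executors, every backend would be initialized when the map is built, even on machines that lack the corresponding device. With factories, calling `create_requested("omp")` touches only the OpenMP entry.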

@codecov

codecov bot commented Jul 28, 2020

Codecov Report

Merging #602 into develop will increase coverage by 8.85%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           develop     #602      +/-   ##
===========================================
+ Coverage    84.16%   93.01%   +8.85%     
===========================================
  Files          296      296              
  Lines        20656    20656              
===========================================
+ Hits         17385    19214    +1829     
+ Misses        3271     1442    -1829     
| Impacted Files | Coverage Δ |
|---|---|
| core/base/composition.cpp | 73.84% <0.00%> (+1.53%) ⬆️ |
| reference/solver/bicgstab_kernels.cpp | 95.08% <0.00%> (+1.63%) ⬆️ |
| core/matrix/hybrid.cpp | 100.00% <0.00%> (+2.80%) ⬆️ |
| core/test/utils/matrix_generator.hpp | 100.00% <0.00%> (+3.03%) ⬆️ |
| core/solver/bicg.cpp | 87.80% <0.00%> (+3.65%) ⬆️ |
| core/solver/gmres.cpp | 99.06% <0.00%> (+3.73%) ⬆️ |
| core/matrix/sellp.cpp | 94.16% <0.00%> (+4.16%) ⬆️ |
| core/solver/lower_trs.cpp | 92.85% <0.00%> (+4.76%) ⬆️ |
| core/solver/upper_trs.cpp | 92.85% <0.00%> (+4.76%) ⬆️ |
| core/solver/fcg.cpp | 98.41% <0.00%> (+4.76%) ⬆️ |
| ... and 86 more | |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 95e0ff5...1d0dffa.

@thoasm added the labels is:bug (Something looks wrong.), mod:cuda (This is related to the CUDA module.), reg:example (This is related to the examples.), mod:hip (This is related to the HIP module.), and 1:ST:ready-for-review (This PR is ready for review), and removed the labels mod:cuda and mod:hip, on Jul 30, 2020
Member

@thoasm thoasm left a comment

LGTM!

Member

@upsj upsj left a comment

LGTM, thanks for tackling this! I would only propose to avoid generating a second OmpExecutor for the host-side data access, and instead use the associated host-side executor of exec. A second OmpExecutor might introduce some unnecessary copies when we assign data from one OmpExecutor to another.
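The suggestion in this review comment can be illustrated with a small sketch. It does not use Ginkgo: `Exec`, `HostExec`, and `DeviceExec` are hypothetical stand-ins, and the accessor is named `get_master()` here as an assumption about how an executor might expose its associated host-side executor. The sketch only shows the design choice of asking the existing executor for its host rather than creating a fresh host executor.

```cpp
#include <memory>
#include <utility>

// Hypothetical executor interface; NOT Ginkgo's class hierarchy.
struct Exec {
    virtual ~Exec() = default;
    // Returns the host-side executor associated with this executor.
    // The name get_master() is an assumption made for this sketch.
    virtual std::shared_ptr<Exec> get_master() = 0;
};

// A host executor is its own host-side executor.
struct HostExec : Exec, std::enable_shared_from_this<HostExec> {
    std::shared_ptr<Exec> get_master() override
    {
        return shared_from_this();
    }
};

// A device executor keeps a reference to the host executor it was built with.
struct DeviceExec : Exec {
    explicit DeviceExec(std::shared_ptr<Exec> host) : host_(std::move(host)) {}

    std::shared_ptr<Exec> get_master() override { return host_; }

private:
    std::shared_ptr<Exec> host_;
};

// Host-side data access reuses the executor's own host instead of
// constructing a second, unrelated host executor.
std::shared_ptr<Exec> host_for(const std::shared_ptr<Exec>& exec)
{
    return exec->get_master();
}
```

If `exec` is already a host executor, `host_for(exec)` returns the same object, so data assigned between the two never crosses between distinct executor instances.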

Collaborator

@fritzgoebel fritzgoebel left a comment

LGTM

Member

@upsj upsj left a comment

Before we merge this, can you add yourself to the contributors.txt list?

Member

@yhmtsai yhmtsai left a comment

LGTM

@sonarqubecloud

sonarqubecloud bot commented Aug 6, 2020

SonarCloud Quality Gate failed.

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A); Security Hotspots to review: 0
Code Smells: 0 (rating A)

Coverage: 0.0%
Duplication: 100.0%

Warning: The version of Java (1.8.0_121) used to run this analysis is deprecated, and we will stop accepting it from October 2020. Please update to at least Java 11.

@upsj upsj removed the 1:ST:ready-for-review This PR is ready for review label Aug 7, 2020
@upsj upsj added the 1:ST:ready-to-merge This PR is ready to merge. label Aug 7, 2020
@upsj
Member

upsj commented Aug 7, 2020

Thanks for the contribution, you finished your first PR :) I guess we will merge @tcojean's PR first, but after that, you can merge yours (since you should already have developer permissions). Two important things about our workflow:

  1. We try to keep our Git history clean and linear. After the previous PR is merged, you should run git fetch && git rebase origin/develop to rebase your changes onto the latest develop status. (You can see how far ahead of or behind develop you are by clicking on the branch name.)

  2. In our merge commits, we aim to give a meaningful name (not the default one) and description. The description should end with the URL of the PR, so you can easily access it even when not coming from the GitHub web interface.

EDIT: I am not sure whether you actually have commit access, so maybe you just need to do the rebase, since I likely can't push to your fork.

@tcojean tcojean merged commit 78dc2f8 into ginkgo-project:develop Aug 7, 2020
@tcojean
Member

tcojean commented Aug 7, 2020

Thanks for explaining the whole process, @upsj. This time, since we have a large backlog of PRs to be merged, I decided to go forward, but I advise @greole to take a look at our merging policies when he can. In the meantime, I will make sure to give you commit access.

tcojean added a commit that referenced this pull request Aug 26, 2020
Release 1.3.0 of Ginkgo.

The Ginkgo team is proud to announce the new minor release of Ginkgo version
1.3.0. This release brings CUDA 11 support, changes the default C++ standard to
be C++14 instead of C++11, adds a new Diagonal matrix format and capacity for
diagonal extraction, significantly improves the CMake configuration output
format, adds the Ginkgo paper which got accepted into the Journal of Open Source
Software (JOSS), and fixes multiple issues.

Supported systems and requirements:
+ For all platforms, cmake 3.9+
+ Linux and MacOS
  + gcc: 5.3+, 6.3+, 7.3+, all versions after 8.1+
  + clang: 3.9+
  + Intel compiler: 2017+
  + Apple LLVM: 8.0+
  + CUDA module: CUDA 9.0+
  + HIP module: ROCm 2.8+
+ Windows
  + MinGW and Cygwin: gcc 5.3+, 6.3+, 7.3+, all versions after 8.1+
  + Microsoft Visual Studio: VS 2017 15.7+
  + CUDA module: CUDA 9.0+, Microsoft Visual Studio
  + OpenMP module: MinGW or Cygwin.


The current known issues can be found in the [known issues page](https://github.com/ginkgo-project/ginkgo/wiki/Known-Issues).


Additions:
+ Add paper for Journal of Open Source Software (JOSS). [#479](#479)
+ Add a DiagonalExtractable interface. [#563](#563)
+ Add a new diagonal Matrix Format. [#580](#580)
+ Add Cuda11 support. [#603](#603)
+ Add information output after CMake configuration. [#610](#610)
+ Add a new preconditioner export example. [#595](#595)
+ Add a new cuda-memcheck CI job. [#592](#592)

Changes:
+ Use unified memory in CUDA debug builds. [#621](#621)
+ Improve `BENCHMARKING.md` with more detailed info. [#619](#619)
+ Use C++14 standard instead of C++11. [#611](#611)
+ Update the Ampere sm information and CudaArchitectureSelector. [#588](#588)

Fixes:
+ Fix documentation warnings and errors. [#624](#624)
+ Fix warnings for diagonal matrix format. [#622](#622)
+ Fix criterion factory parameters in CUDA. [#586](#586)
+ Fix the norm-type in the examples. [#612](#612)
+ Fix the WAW race in OpenMP is_sorted_by_column_index. [#617](#617)
+ Fix the example's exec_map by creating the executor only if requested. [#602](#602)
+ Fix some CMake warnings. [#614](#614)
+ Fix Windows building documentation. [#601](#601)
+ Warn when CXX and CUDA host compiler do not match. [#607](#607)
+ Fix reduce_add, prefix_sum, and doc-build. [#593](#593)
+ Fix find_library(cublas) issue on machines installing multiple cuda. [#591](#591)
+ Fix allocator in sellp read. [#589](#589)
+ Fix the CAS with HIP and NVIDIA backends. [#585](#585)

Deletions:
+ Remove unused preconditioner parameter in LowerTrs. [#587](#587)

Related PR: #625
tcojean added a commit that referenced this pull request Aug 27, 2020
Related PR: #627
Labels
1:ST:ready-to-merge This PR is ready to merge. is:bug Something looks wrong. reg:example This is related to the examples.
Successfully merging this pull request may close these issues.

Segfaults due to executor generation in examples
7 participants