Tpetra::CrsMatrix: Overlap communication & computation in apply() #385

Closed
mhoemmen opened this issue May 23, 2016 · 3 comments
Labels: CLOSED_DUE_TO_INACTIVITY, MARKED_FOR_CLOSURE, pkg: Tpetra, story, TpetraRF

Comments

@mhoemmen
Contributor

mhoemmen commented May 23, 2016

@trilinos/tpetra

Epic: #767.

Overlap communication & computation in the apply() method of Tpetra::CrsMatrix, which implements sparse matrix-vector multiply.

This depends on #384 working for Tpetra::MultiVector, which in turn depends on #383.

If we fix #439, then we can fix this issue without needing to change CrsGraph semantics. In particular, CrsGraph could still compute its Import from the domain Map to the entire column Map (locals included), and code that relies on this Import could work unchanged. Sparse matrix-vector multiply implementations, such as those in CrsMatrix and BlockCrsMatrix (see #424), could then do coarse-grained overlap as follows:

  1. Start a nonblocking Import of the remotes
  2. Import the locals (if necessary)
  3. Do the local part of the mat-vec
  4. Finish the nonblocking Import of the remotes
  5. Do the remote part of the mat-vec (in place, in the row Map vector -- hence coarse-grained overlap)
  6. Do an analogous procedure to overlap the Export, if an Export is needed (if row Map != range Map)
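
The ordering of Steps 1 through 5 can be sketched in plain C++. This is only an illustration under stated assumptions, not Tpetra code: `Csr`, `PendingImport`, `applyAdd`, and `overlappedApply` are hypothetical names, the matrix is assumed to be already split into a local part and a remote part, and the "nonblocking Import" is a placeholder handle that performs no real communication (in an MPI code, creating it would correspond to posting nonblocking receives and `wait()` to waiting on them). Step 6 (the Export) is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Minimal CSR matrix. Column indices are local to the vector it multiplies.
struct Csr {
  std::vector<std::size_t> rowPtr;  // size numRows + 1
  std::vector<std::size_t> colInd;
  std::vector<double> val;
};

// Hypothetical handle for an in-flight Import of the remote entries.
// Here it just holds the data; no communication takes place.
class PendingImport {
public:
  explicit PendingImport(std::vector<double> incoming)
      : incoming_(std::move(incoming)) {}
  // Completes the Import and hands over the received remote entries.
  std::vector<double> wait() { return std::move(incoming_); }

private:
  std::vector<double> incoming_;
};

// y += A * x
void applyAdd(const Csr& A, const std::vector<double>& x,
              std::vector<double>& y) {
  assert(A.rowPtr.size() == y.size() + 1);
  for (std::size_t i = 0; i < y.size(); ++i) {
    for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
      y[i] += A.val[k] * x[A.colInd[k]];
    }
  }
}

// Coarse-grained overlap: the local mat-vec runs between the start and
// the completion of the Import of the remotes.
std::vector<double> overlappedApply(const Csr& Alocal, const Csr& Aremote,
                                    const std::vector<double>& xLocal,
                                    std::vector<double> remoteSource) {
  assert(!Alocal.rowPtr.empty());
  assert(Alocal.rowPtr.size() == Aremote.rowPtr.size());
  std::vector<double> y(Alocal.rowPtr.size() - 1, 0.0);

  PendingImport pending(std::move(remoteSource));  // Step 1: start Import
  // Step 2 is skipped: xLocal is assumed usable in place (no copy needed).
  applyAdd(Alocal, xLocal, y);                     // Step 3: local part
  const std::vector<double> xRemote = pending.wait();  // Step 4: finish Import
  applyAdd(Aremote, xRemote, y);                   // Step 5: remote part, in place
  return y;
}
```

The remote contribution is accumulated into the same output vector as the local contribution, which is what makes the overlap coarse-grained: no remote work can begin until the whole Import has completed.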

The "if necessary" remark on Step 2 relates to whether the domain Map is "fitted" to the column Map (see #437 for a definition of "fitted"). If so, the local entries of the input vector would not need to be copied (see #435 and #436). If not, they would need to be copied (and/or permuted), but the decision to copy could be made per process. For example, processes with the same number of local entries and no entries that need permutation would not need to make a copy: the local part of the mat-vec could just take the original input (multi)vector pointer as input. (In fact, the domain Map need not even be fitted; that's sufficient but not necessary.)
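
A per-process check of this kind might look like the sketch below. It assumes that "locally fitted" means the column Map's initial entries on this process are exactly the domain Map's entries, in the same order; #437 has the authoritative definition. The function name and the representation of each Map as a plain list of global indices are hypothetical, chosen only for illustration.

```cpp
#include <algorithm>
#include <vector>

using GlobalIndex = long long;

// Returns true if, on this process, the column Map starts with the domain
// Map's global indices in the same order. In that case the local entries of
// the input vector could be used in place, with no copy or permutation.
// Purely local: no communication with other processes is involved.
bool isLocallyFitted(const std::vector<GlobalIndex>& domainGids,
                     const std::vector<GlobalIndex>& columnGids) {
  return domainGids.size() <= columnGids.size() &&
         std::equal(domainGids.begin(), domainGids.end(), columnGids.begin());
}
```

For instance, domain indices {0, 1, 2} are fitted to column indices {0, 1, 2, 7, 9} under this assumption, but not to {1, 0, 2, 7}, where the locals would need a permutation.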

This approach has the following benefits over one that uses a different Import than the CrsGraph's domain -> column Map Import:

  1. CrsGraph's Import retains its current meaning
  2. Neither the graph nor the matrix would need to compute a new Import object just for the remotes
  3. This approach would work for any domain and column Maps, and in fact for any range and row Maps

In particular, this approach would work regardless of whether the domain Map is fitted to the column Map. The graph or matrix would not need to do any extra all-reduces to figure out if the Maps are fitted on all processes.

@mhoemmen
Contributor Author

mhoemmen commented Sep 19, 2016

Summary of dependencies:

  1. Make doImport / doExport nonblocking (Tpetra::DistObject: Expose nonblocking versions of doExport & doImport #384)
     - #383 would benefit from #334
     - #384 depends on #383
  2. Be able to Import either locals or remotes separately (Tpetra::DistObject: Add "locals only" and "remotes only" options to doImport and doExport #439)
  3. Be able to separate "locals" from "remotes" in each row (e.g., Tpetra::CrsMatrix: Store matrix in such a way as to allow overlap of communication & computation #627)
  4. Put everything together in CrsMatrix::apply.
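
One way to separate "locals" from "remotes" in each row is to split the matrix into two CSR structures, as in the sketch below. This is not the storage scheme proposed in #627, only a simple illustration. It assumes the column indices are ordered so that indices below `numLocalCols` are locals and the rest are remotes; `Csr`, `SplitCsr`, and `splitLocalRemote` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Minimal CSR matrix with local column indices.
struct Csr {
  std::vector<std::size_t> rowPtr;  // size numRows + 1
  std::vector<std::size_t> colInd;
  std::vector<double> val;
};

struct SplitCsr {
  Csr local;   // entries whose column index is < numLocalCols
  Csr remote;  // remaining entries, re-indexed to start at 0
};

// Splits each row of A into its local and remote entries. Remote column
// indices are shifted down by numLocalCols so that they index a separate
// vector holding only the imported remote entries.
SplitCsr splitLocalRemote(const Csr& A, std::size_t numLocalCols) {
  SplitCsr out;
  out.local.rowPtr.push_back(0);
  out.remote.rowPtr.push_back(0);
  const std::size_t numRows = A.rowPtr.empty() ? 0 : A.rowPtr.size() - 1;
  for (std::size_t i = 0; i < numRows; ++i) {
    for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
      const std::size_t c = A.colInd[k];
      if (c < numLocalCols) {
        out.local.colInd.push_back(c);
        out.local.val.push_back(A.val[k]);
      } else {
        out.remote.colInd.push_back(c - numLocalCols);
        out.remote.val.push_back(A.val[k]);
      }
    }
    // Close row i in both parts, even if one of them got no entries.
    out.local.rowPtr.push_back(out.local.colInd.size());
    out.remote.rowPtr.push_back(out.remote.colInd.size());
  }
  return out;
}
```

With the matrix split this way, the local part can be applied before the remotes arrive, and the remote part can read from a separate vector that holds only imported entries.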

The same approach could (and likely would) also fix #435. If we Import the remotes into a separate MultiVector, then we could just use the input domain Map MultiVector directly in the sparse matrix-vector multiply kernel, as long as the domain Map is locally fitted to the column Map on that process (see #437; this is a per-process decision).

@mhoemmen mhoemmen modified the milestones: Tpetra-backlog, Tpetra-overlap-comm-compute Nov 2, 2016
@github-actions

This issue has had no activity for 365 days and is marked for closure. It will be closed after an additional 30 days of inactivity.
If you would like to keep this issue open please add a comment and remove the MARKED_FOR_CLOSURE label.
If this issue should be kept open even with no activity beyond the time limits you can add the label DO_NOT_AUTOCLOSE.

@github-actions github-actions bot added the MARKED_FOR_CLOSURE Issue or PR is marked for auto-closure by the GitHub Actions bot. label Dec 16, 2020
@github-actions

This issue was closed due to inactivity for 395 days.

@github-actions github-actions bot added the CLOSED_DUE_TO_INACTIVITY Issue or PR has been closed by the GitHub Actions bot due to inactivity. label Jan 16, 2021
@jhux2 jhux2 added this to Tpetra Aug 12, 2024
@jhux2 jhux2 moved this to Done in Tpetra Aug 12, 2024