Asynchronous combi WIP #9
Conversation
Also changed the file to show relative error instead of whatever was there before
Uncommented TIMING definition for plots. Added a modified test_manger that can handle any dimensions
…rtion to avoid segfault in case of multiple processes per group
…th minimal number of hierarchizations and dehierarchizations
static MPI_Request *requestAsync;

/** SubspaceSizes for Async Reduce */
static std::vector<int> *subspaceSizes;
I feel like this should not be a static member of Task, because it already exists in DistributedSparseGridUniform, and I think that is where it should be.
As far as I know, this is in Task because the sparse grid gets destroyed frequently, but this information should be constant over multiple combinations.
But this is just an optimization so that the sparse grid sizes do not need to be communicated again. For the request array, I would need to investigate whether it is necessary for it to be in Task.
But one could also put subspaceSizes as a static member of the SparseGrid class, I think...
Yes, you are right, we can leave it like this here. On third-level, subspacesDataSizes_ is introduced, which makes this redundant. I had not noticed that it wasn't on master.
I am just investigating a merge into https://github.com/SGpp/DisCoTec/tree/third-level, but it seems that, unsurprisingly, some of the distributed sparse grid data management has changed for the widely distributed combination technique. This is why I thought it was the right time to leave a comment on the (in my opinion unnecessary) member. Quoting @mhurler123 in case he has an opinion on that ;)
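To make the two placements being discussed concrete, here is a minimal sketch. Only the names Task, DistributedSparseGridUniform and subspaceSizes are taken from the diff; the simplified class bodies (no templates, no other members) are illustrative and not the actual DisCoTec code.

```cpp
#include <vector>

// Placement currently in the diff: the cached sizes live on Task as a
// static member, so they survive the frequent destruction and re-creation
// of the sparse grids between combinations.
class Task {
 public:
  static std::vector<int>* subspaceSizes;  // one size vector per grid
};
std::vector<int>* Task::subspaceSizes = nullptr;

// Alternative raised in this thread: make them a static member of the
// sparse grid class instead, so the cache lives next to the data it
// describes and Task stays out of the sparse grid bookkeeping.
class DistributedSparseGridUniform {
 public:
  static std::vector<int>* subspaceSizes;  // shared across re-creations
};
std::vector<int>* DistributedSparseGridUniform::subspaceSizes = nullptr;

int main() {
  // Whichever placement is chosen, the sizes are allocated once and reused.
  DistributedSparseGridUniform::subspaceSizes = new std::vector<int>[1];
  delete[] DistributedSparseGridUniform::subspaceSizes;
  return 0;
}
```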
Stats::startEvent("combine global reduce init");

Task::bufAsync = new std::vector<CombiDataType>[numGrids];
memory leak?
This could be the case. We could add a delete before the allocation, or switch to smart pointers.
Or maybe make this a member of DistributedFullGridUniform? That is also connected to the next question...
The buffer is deleted in line 854
We could also put this as a member of the full grid directly. But I am not sure if this improves things.
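As an aside on the smart-pointer suggestion, a minimal self-contained sketch of the alternatives. Only CombiDataType, bufAsync and numGrids mirror the diff; CombiDataType is assumed to be std::complex<double> here (the actual alias may differ), and the rest is illustrative.

```cpp
#include <complex>
#include <memory>
#include <vector>

// Assumption: CombiDataType is a complex floating-point alias in DisCoTec;
// std::complex<double> is used here only to make the sketch compile.
using CombiDataType = std::complex<double>;

int main() {
  const int numGrids = 2;  // placeholder, e.g. nspecies

  // Pattern from the diff: a raw array that must be paired with delete[]
  // (reportedly done around line 854); running the init again without
  // that delete[] would leak the previous allocation.
  auto* bufAsyncRaw = new std::vector<CombiDataType>[numGrids];
  delete[] bufAsyncRaw;

  // RAII alternative: ownership is released automatically, also when the
  // init is run again or an exception is thrown in between.
  auto bufAsync = std::make_unique<std::vector<CombiDataType>[]>(numGrids);

  // Or simply a vector of vectors, which keeps the element count around too.
  std::vector<std::vector<CombiDataType>> bufAsyncVec(numGrids);

  (void)bufAsync;
  (void)bufAsyncVec;
  return 0;
}
```

Either of the owning alternatives would presumably also make the explicit delete mentioned around line 854 unnecessary.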
}

for(int g=0; g<numGrids; g++){
  CombiCom::distributedGlobalReduceAsyncInit( *combinedUniDSGVector_[g], Task::subspaceSizes[g], Task::bufAsync[g], Task::requestAsync[g]);
Actually, I don't get why this is done once for each task; a single time should suffice, right?
It is only done a single time. numGrids does not refer to the number of tasks but to the number of grids each task owns. Usually numGrids is 1, but for example with multiple species in GENE, numGrids = nspecies.
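To illustrate "one asynchronous reduce per grid, not per task" outside the DisCoTec classes, here is a minimal MPI sketch. The names numGrids, bufAsync and requestAsync mirror the diff; the sizes, datatype and communicator are placeholders, not the actual DisCoTec setup.

```cpp
#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);

  const int numGrids = 2;         // e.g. nspecies in GENE; typically 1
  const int subspaceSize = 1024;  // placeholder size per grid

  std::vector<std::vector<double>> bufAsync(numGrids,
                                            std::vector<double>(subspaceSize));
  std::vector<MPI_Request> requestAsync(numGrids);

  // One non-blocking allreduce is started per grid this process owns;
  // the number of tasks in the combination scheme does not enter here.
  for (int g = 0; g < numGrids; ++g) {
    MPI_Iallreduce(MPI_IN_PLACE, bufAsync[g].data(), subspaceSize, MPI_DOUBLE,
                   MPI_SUM, MPI_COMM_WORLD, &requestAsync[g]);
  }
  MPI_Waitall(numGrids, requestAsync.data(), MPI_STATUSES_IGNORE);

  MPI_Finalize();
  return 0;
}
```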
assert(initialized_);

int lrank;
MPI_Comm_rank(lcomm, &lrank);
=> auto lrank = theMPISystem()->getLocalRank();
Use the buffered rank instead of calling MPI_Comm_rank again.
oh, it's unused anyways ;)
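For reference, even though lrank turned out to be unused here, a minimal sketch of the "buffered rank" idea: theMPISystem()->getLocalRank() is quoted from the comment above, while the MPISystem class below is a stand-in, not the actual DisCoTec implementation.

```cpp
#include <mpi.h>

// Stand-in for the real MPI system singleton: the rank is queried from
// MPI once at construction and then returned from the cached member.
class MPISystem {
 public:
  explicit MPISystem(MPI_Comm lcomm) : lcomm_(lcomm) {
    MPI_Comm_rank(lcomm_, &localRank_);  // queried once, then reused
  }
  int getLocalRank() const { return localRank_; }

 private:
  MPI_Comm lcomm_;
  int localRank_ = -1;
};

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);
  MPISystem sys(MPI_COMM_WORLD);

  // instead of: int lrank; MPI_Comm_rank(lcomm, &lrank);
  auto lrank = sys.getLocalRank();
  (void)lrank;

  MPI_Finalize();
  return 0;
}
```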
oops, this was unintentionally closed by making the