Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Free memory from direct_iteration and direct_output #929

Merged
merged 4 commits into from
Apr 14, 2020

Conversation

bmunguia
Copy link
Member

Proposed Changes

This one's really small, but when running valgrind on some of my adaptation runs, I noticed that a lot of memory was being leaked due to not deleting direct_iteration and direct_output in CDiscAdjSinglezoneDriver.

Related Work

None that I'm aware of after a cursory look through PRs.

PR Checklist

  • I am submitting my contribution to the develop branch.
  • My contribution generates no new compiler warnings (try with the '-Wall -Wextra -Wno-unused-parameter -Wno-empty-body' compiler flags).
  • My contribution is commented and consistent with SU2 style.
  • I have added a test case that demonstrates my contribution, if necessary.
  • I have updated appropriate documentation (Tutorials, Docs Page, config_template.cpp) , if necessary.

Copy link
Member

@economon economon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

While you're at it, see any other memory that needs deleting? Or is the valgrind output leak-free otherwise?

@bmunguia
Copy link
Member Author

The output isn't leak-free otherwise. Those were just a couple of the bigger ones that were easy to track down. I think the rest are mainly CoDi/MeDi-related leaks, though. Keep in mind I ran this on feature_adap_sst (the line numbers below are for this branch though) so some of these might be resolved:

  • A leak in C2DContainers that Pedro already fixed in WIP Feature hybrid parallel and SIMD #895 so I was gonna leave it alone
  • Probably the largest remaining leak is from the calls to AD::StartExtFunc(), AD::SetExtFuncIn(), and AD::SetExtFuncOut() in CSysSolve.cpp during DirectRun(). These are all related to CoDi's ExternalFunctionHelper
  • Line 537 in adt_structure.cpp (CADTElemClass::CADTElemClass()). This seems to be related to various functions in MeDi including createLinearIndexCounts(), AMPI_Allgatherv() (might be a false positive), and ADToolImplCommon()
SU2_MPI::Allgatherv(val_coor.data(), sizeLocal, MPI_DOUBLE, coorPoints.data(),
                        recvCounts.data(), displs.data(), MPI_DOUBLE, MPI_COMM_WORLD);
  • Line 715 in CGeometry.cpp (CGeometry::PostP2PSends()). This is only called when val_reverse is false, which is duringSetDependencies() and DirectRun(). These all seem to be related to ADToolImplCommon()
SU2_MPI::Isend(&(bufD_P2PSend[offset]), count, MPI_DOUBLE,
                       dest, tag, MPI_COMM_WORLD, &(req_P2PSend[iMessage]));

I've attached the valgrind log from one of the cores below, in case anyone really wants to parse it
vg.txt

@bmunguia bmunguia requested a review from talbring April 10, 2020 02:00
@pcarruscag
Copy link
Member

We could change some of those things to unique_ptr

@bmunguia
Copy link
Member Author

@pcarruscag just to clarify, are you suggesting switching some of the containers, e.g. direct_iteration and direct_output, to unique_ptr, or some of the vars causing the remaining leaks? Because for some of the remaining leaks, I think it's more of an issue with how we interface with CoDi/MeDi, or maybe even the tools themselves.

For example, the destructor for ExternalFunctionHelper is

~ExternalFunctionHelper() {
  if(!isTapeActive) {
    delete data;
  }
}

but we only call AD::EndExtFunc() (which deletes FuncHelper) when the tape is active. Maybe there's a reason for this that @talbring can explain.
The leaks in MeDi all seem to stem from lines like

h = new AMPI_Allgatherv_AdjointHandle<SENDTYPE, RECVTYPE>();

and

WaitHandle* waitH = new WaitHandle((ReverseFunction)AMPI_Isend_b_finish<DATATYPE>,
                                           (ForwardFunction)AMPI_Isend_d<DATATYPE>, h);

in ampiFunctions.hpp, where h and waitH seem to be defined locally and are never freed (at least from what I see, but @talbring can correct me if I'm wrong).

@pcarruscag
Copy link
Member

Just the direct_* which should be a non intrusive change (I think those objects are just for internal usage of the adjoint driver) if it turns out to be too much work I would not bother.

Yeah even from MPI itself we get memory leaks, I don't think those are too bad though, if we leaked a lot of memory during normal iterations that would be a problem, but if it is just a few objects that are not destroyed when the code terminates... meh

Copy link
Member

@economon economon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the extra cleanup in SU2_SOL. I think this is good for now

@bmunguia bmunguia merged commit a2c28d8 into develop Apr 14, 2020
@bmunguia bmunguia deleted the fix_delete_direct branch April 14, 2020 21:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants