Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Close PSBT leaks found by Christian, fixes to detect them in future #4071

Merged

Conversation

rustyrussell
Copy link
Contributor

Interfacing with libwally without wrapping everything in our own struct is a pain, but this now manages it.

As an aside, there are cleanups and clarifications to the memleak interface.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This covers the obvious ones, but the later patches fix this more
seriously.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Some memory leaks in transaction and PSBT manipulate closed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
It returns a wally_tx; it's an anti-pattern not to hand in a tal context.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since it returns a wally_tx.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Copy link
Member

@cdecker cdecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 5677d0a

@@ -32,8 +32,12 @@ static struct wally_psbt *init_psbt(const tal_t *ctx, size_t num_inputs, size_t
else
wally_err = wally_psbt_init_alloc(0, num_inputs, num_outputs, 0, &psbt);
assert(wally_err == WALLY_OK);
tal_steal(ctx, psbt);
/* If both of these are zero, no sub-allocations were done */
if (num_inputs || num_outputs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any harm in calling it unconditionally?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No harm, but we have a debug check for the moment that we don't call tal_gather_wally unless there's something to gather. Found some places where I was being overzealous.

Comment on lines +90 to +96
wally_leak = tal_first(wally_tal_ctx);
if (wally_leak) {
/* Trigger valgrind to tell us about this! */
tal_free(wally_leak);
*wally_leak = 0;
errx(1, "Outstanding wally allocations");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice, this will report the location the non-collected child was allocated at in valgrind, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's how I found them!

@cdecker
Copy link
Member

cdecker commented Sep 21, 2020

After checking I can confirm that the memleaks are indeed gone:

memdump-after

However it seems like I produced a (slow) memory leak myself when precomputing the txids in the speedup PR, I'll need to look into dropping the b->txids at an opportune time. Afterwards we'll have a 115MB memory footprint, 104 of which is logs :-)

…tal ctx.

Since it allocates something, it needs a context (used in the next patch!).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This lets us reduce leaks, and ease their detection.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@rustyrussell rustyrussell force-pushed the guilt/notleak-more-precise branch 2 times, most recently from 2dd7e8f to f9721ff Compare September 22, 2020 20:40
rustyrussell and others added 7 commits September 23, 2020 11:07
…ons.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
They previously prevented any child from being detected as leaks, now
they just mark the tal allocation itself as not being a leak.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
The next patch perturbed things enough that we suddenly started
getting (with --track-origins=yes):

Valgrind error file: valgrind-errors.120470
==120470== Use of uninitialised value of size 8
==120470==    at 0x14EBD5: htable_val (htable.c:150)
==120470==    by 0x14EC3C: htable_firstval_ (htable.c:165)
==120470==    by 0x14F583: htable_del_ (htable.c:349)
==120470==    by 0x11825D: pointer_referenced (memleak.c:65)
==120470==    by 0x118485: scan_for_pointers (memleak.c:121)
==120470==    by 0x118500: memleak_remove_region (memleak.c:130)
==120470==    by 0x118A30: call_memleak_helpers (memleak.c:257)
==120470==    by 0x118A8B: call_memleak_helpers (memleak.c:262)
==120470==    by 0x118A8B: call_memleak_helpers (memleak.c:262)
==120470==    by 0x118B25: memleak_find_allocations (memleak.c:278)
==120470==    by 0x10EB12: closing_dev_memleak (closingd.c:584)
==120470==    by 0x10F3E2: main (closingd.c:783)
==120470==  Uninitialised value was created by a heap allocation
==120470==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==120470==    by 0x1604E8: allocate (tal.c:250)
==120470==    by 0x160AA9: tal_alloc_ (tal.c:428)
==120470==    by 0x119BE0: new_per_peer_state (per_peer_state.c:24)
==120470==    by 0x11A101: fromwire_per_peer_state (per_peer_state.c:95)
==120470==    by 0x10FB7C: fromwire_closingd_init (closingd_wiregen.c:103)
==120470==    by 0x10ED15: main (closingd.c:626)
==120470==

This is because there is uninitialized padding at the end of struct
peer_state.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
1. Rename memleak_enter_allocations to memleak_find_allocations.
2. Unify scanning for pointers into memleak_remove_region / memleak_remove_pointer.
3. Document the functions.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
I mistakenly assumed the block would be freed after processing completed. That
is not true since chaintopology keeps headers and stubs around for reorgs. So
we need to remove the precomputed txids along with the full_txs.
@rustyrussell rustyrussell force-pushed the guilt/notleak-more-precise branch from f9721ff to d17a3c5 Compare September 23, 2020 01:37
@rustyrussell rustyrussell merged commit 1cb527d into ElementsProject:master Sep 23, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants