Close PSBT leaks found by Christian, fixes to detect them in future #4071

rustyrussell · 2020-09-21T10:25:58Z

Interfacing with libwally without wrapping everything in our own struct is a pain, but this now manages it.

As an aside, there are cleanups and clarifications to the memleak interface.

cdecker

ACK 5677d0a

cdecker · 2020-09-21T11:25:27Z

bitcoin/psbt.c

@@ -32,8 +32,12 @@ static struct wally_psbt *init_psbt(const tal_t *ctx, size_t num_inputs, size_t
 	else
 		wally_err = wally_psbt_init_alloc(0, num_inputs, num_outputs, 0, &psbt);
 	assert(wally_err == WALLY_OK);
+	tal_steal(ctx, psbt);
+	/* If both of these are zero, no sub-allocations were done */
+	if (num_inputs || num_outputs)


Is there any harm in calling it unconditionally?

No harm, but we have a debug check for the moment that we don't call tal_gather_wally unless there's something to gather. Found some places where I was being overzealous.
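The exchange above can be illustrated with a toy model: allocations that land under a scratch parent are moved under the real owner, and a debug assertion catches calls made when there is nothing to move. This is only a sketch under stated assumptions. The `node_*` and `gather` names are hypothetical and are not the ccan/tal or libwally API; `node_steal` merely plays the role that `tal_steal` plays in the diff, and `gather` the role that the thread assigns to `tal_gather_wally`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for a hierarchical allocator. Hypothetical names; this is
 * not the ccan/tal API. Each node knows its parent and its children. */
struct node {
	struct node *parent;
	struct node *first_child;
	struct node *next;	/* next sibling under the same parent */
};

/* Link n at the head of parent's child list (parent may be NULL). */
static void node_link(struct node *parent, struct node *n)
{
	n->parent = parent;
	n->next = parent ? parent->first_child : NULL;
	if (parent)
		parent->first_child = n;
}

/* Remove n from its current parent's child list, if it has a parent. */
static void node_unlink(struct node *n)
{
	struct node **pp;

	if (!n->parent)
		return;
	for (pp = &n->parent->first_child; *pp != n; pp = &(*pp)->next)
		;
	*pp = n->next;
	n->parent = NULL;
	n->next = NULL;
}

static struct node *node_new(struct node *parent)
{
	struct node *n = calloc(1, sizeof(*n));

	assert(n);
	node_link(parent, n);
	return n;
}

/* Plays the role of tal_steal(): move n under new_parent. */
static void node_steal(struct node *new_parent, struct node *n)
{
	node_unlink(n);
	node_link(new_parent, n);
}

static size_t node_count_children(const struct node *p)
{
	const struct node *c;
	size_t count = 0;

	for (c = p->first_child; c; c = c->next)
		count++;
	return count;
}

/* Move every child of scratch under ctx and return how many moved. */
static size_t gather(struct node *scratch, struct node *ctx)
{
	size_t moved = 0;

	/* Debug check: calling this with nothing to gather is a caller bug. */
	assert(scratch->first_child);
	while (scratch->first_child) {
		node_steal(ctx, scratch->first_child);
		moved++;
	}
	return moved;
}

/* Free n and everything below it. */
static void node_free(struct node *n)
{
	while (n->first_child)
		node_free(n->first_child);
	node_unlink(n);
	free(n);
}
```

In this model, the conditional in the diff corresponds to calling `gather` only when the library is known to have made sub-allocations, so the assertion stays useful for finding overzealous callers.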

cdecker · 2020-09-21T11:36:59Z

common/daemon.c

+	wally_leak = tal_first(wally_tal_ctx);
+	if (wally_leak) {
+		/* Trigger valgrind to tell us about this! */
+		tal_free(wally_leak);
+		*wally_leak = 0;
+		errx(1, "Outstanding wally allocations");
+	}


Very nice, this will report the location the non-collected child was allocated at in valgrind, correct?

Yes, that's how I found them!
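The diff relies on valgrind: freeing the leftover allocation and then writing to it makes valgrind print where that block was allocated. As a rough illustration of the same shutdown check without valgrind, the sketch below swaps in a different technique, explicit call-site recording, so it runs without any deliberate use-after-free. All names (`tracked_new`, `tracked_free`, `first_outstanding`) are hypothetical and are not part of the PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tracker: every live allocation sits on one global list and
 * remembers the source location that created it. */
struct tracked {
	struct tracked *next;
	const char *file;
	int line;
};

static struct tracked *outstanding;

static struct tracked *tracked_new_(const char *file, int line)
{
	struct tracked *t = malloc(sizeof(*t));

	assert(t);
	t->file = file;
	t->line = line;
	t->next = outstanding;
	outstanding = t;
	return t;
}
/* Record the caller's location automatically. */
#define tracked_new() tracked_new_(__FILE__, __LINE__)

/* Unlink one allocation from the list and free it. */
static void tracked_free(struct tracked *t)
{
	struct tracked **pp;

	for (pp = &outstanding; *pp != t; pp = &(*pp)->next)
		;
	*pp = t->next;
	free(t);
}

/* Shutdown check: NULL means everything was collected; otherwise the
 * returned entry says where the first leftover allocation was made. */
static const struct tracked *first_outstanding(void)
{
	return outstanding;
}
```

A daemon built this way could call `first_outstanding()` at exit and abort with the recorded `file` and `line` if the result is not NULL, which mirrors the `errx(1, "Outstanding wally allocations")` path in the diff.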

cdecker · 2020-09-21T17:22:03Z

After checking I can confirm that the memleaks are indeed gone:

However, it seems I introduced a (slow) memory leak myself when precomputing the txids in the speedup PR; I'll need to look into dropping the b->txids at an opportune time. Afterwards we'll have a 115MB memory footprint, 104MB of which is logs :-)

The next patch perturbed things enough that we suddenly started getting (with --track-origins=yes):

    Valgrind error file: valgrind-errors.120470
    ==120470== Use of uninitialised value of size 8
    ==120470==    at 0x14EBD5: htable_val (htable.c:150)
    ==120470==    by 0x14EC3C: htable_firstval_ (htable.c:165)
    ==120470==    by 0x14F583: htable_del_ (htable.c:349)
    ==120470==    by 0x11825D: pointer_referenced (memleak.c:65)
    ==120470==    by 0x118485: scan_for_pointers (memleak.c:121)
    ==120470==    by 0x118500: memleak_remove_region (memleak.c:130)
    ==120470==    by 0x118A30: call_memleak_helpers (memleak.c:257)
    ==120470==    by 0x118A8B: call_memleak_helpers (memleak.c:262)
    ==120470==    by 0x118A8B: call_memleak_helpers (memleak.c:262)
    ==120470==    by 0x118B25: memleak_find_allocations (memleak.c:278)
    ==120470==    by 0x10EB12: closing_dev_memleak (closingd.c:584)
    ==120470==    by 0x10F3E2: main (closingd.c:783)
    ==120470==  Uninitialised value was created by a heap allocation
    ==120470==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==120470==    by 0x1604E8: allocate (tal.c:250)
    ==120470==    by 0x160AA9: tal_alloc_ (tal.c:428)
    ==120470==    by 0x119BE0: new_per_peer_state (per_peer_state.c:24)
    ==120470==    by 0x11A101: fromwire_per_peer_state (per_peer_state.c:95)
    ==120470==    by 0x10FB7C: fromwire_closingd_init (closingd_wiregen.c:103)
    ==120470==    by 0x10ED15: main (closingd.c:626)
    ==120470==

This is because there is uninitialized padding at the end of struct peer_state.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
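To make the padding remark concrete, here is a small sketch of how trailing padding arises and one common way to keep a byte-wise scanner from reading uninitialised memory. `struct example_state` and the helper names are hypothetical, and zeroing the whole allocation is an assumption made for illustration; the commit message does not say which remedy the patch uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical struct: on common 64-bit ABIs the int is followed by
 * padding so that the struct size is a multiple of the pointer size. */
struct example_state {
	void *ptr;
	int fd;
};

/* Bytes between the end of the last member and the end of the struct. */
static size_t trailing_padding(void)
{
	return sizeof(struct example_state)
		- (offsetof(struct example_state, fd) + sizeof(int));
}

/* Allocate with every byte, padding included, set to zero, so code that
 * scans the raw bytes never reads uninitialised memory. */
static struct example_state *new_example_state(void)
{
	struct example_state *s = malloc(sizeof(*s));

	assert(s);
	memset(s, 0, sizeof(*s));
	return s;
}

/* Read the object byte by byte, as a pointer scanner would. */
static int all_bytes_zero(const struct example_state *s)
{
	const unsigned char *p = (const unsigned char *)s;
	size_t i;

	for (i = 0; i < sizeof(*s); i++)
		if (p[i])
			return 0;
	return 1;
}
```

Assigning the members one by one after a plain `malloc` would leave the padding bytes untouched, which is the situation the trace above points at when the leak scanner walks the whole region.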

rustyrussell added 6 commits September 21, 2020 14:27

bitcoin/psbt: more const pointers.

3b0c246

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

bitcoin/psbt: attach destructors to wally allocations to avoid leaks.

32f035d

This covers the obvious ones, but the later patches fix this more seriously.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: Some memory leaks in transaction and PSBT manipulation closed.

bitcoin/tx: trivial cleanups.

ddcadae

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

bitcoin/psbt: psbt_txid needs a tal ctx.

46ea7cb

It returns a wally_tx; it's an anti-pattern not to hand in a tal context.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

bitcoin/psbt: wally_tx_output needs a tal ctx.

596298a

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

bitcoin/psbt: psbt_finalize needs a tal ctx.

4b7199c

Since it returns a wally_tx.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

rustyrussell added the bug label Sep 21, 2020

rustyrussell requested a review from cdecker September 21, 2020 10:25

cdecker approved these changes Sep 21, 2020

rustyrussell added 2 commits September 22, 2020 19:22

bitcoin/psbt: psbt_input_add_unknown/psbt_output_add_unknown needs a tal ctx.

ea2681e

Since it allocates something, it needs a context (used in the next patch!).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

common: add tal_gather_wally() function to reparent libwally objs.

88e3c14

This lets us reduce leaks, and ease their detection.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

rustyrussell force-pushed the guilt/notleak-more-precise branch 2 times, most recently from 2dd7e8f to f9721ff Compare September 22, 2020 20:40

rustyrussell and others added 7 commits September 23, 2020 11:07

bitcoin: use tal_gather_wally() so we don't leave unattached allocations.

40ee7fc

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

common: enforce that we have collected all wally allocations.

8a9d7d5

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

memleak: make "_notleak" names less powerful.

b8f6a3c

They previously prevented any child from being detected as a leak; now they just mark the tal allocation itself as not being a leak.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
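The change in "_notleak" semantics can be modelled with a tiny tree walk. This is a deliberately simplified sketch: the real detector scans memory for pointers, whereas here each node just carries hypothetical `notleak` and `referenced` flags, and `count_leaks` takes a switch to compare the old and new behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one allocation in a parent/child tree. */
struct alloc {
	bool notleak;		/* explicitly marked as not a leak */
	bool referenced;	/* something still points at it */
	const struct alloc *children[4];
	size_t num_children;
};

/* Count leaked allocations at and below a.
 * notleak_covers_children == true models the old behaviour, where a
 * notleak mark hid the whole subtree; false models the new behaviour,
 * where the mark exempts only the marked allocation itself. */
static size_t count_leaks(const struct alloc *a, bool notleak_covers_children)
{
	size_t i, leaks = 0;

	if (a->notleak && notleak_covers_children)
		return 0;
	if (!a->notleak && !a->referenced)
		leaks++;
	for (i = 0; i < a->num_children; i++)
		leaks += count_leaks(a->children[i], notleak_covers_children);
	return leaks;
}
```

With an unreferenced child hanging off a parent marked notleak, the old mode reports nothing while the new mode reports the child, which is the extra precision the commit message describes.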

common: don't suppress leak detection for libwally allocations.

2fa1986

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

common/memleak: simplify and document API.

20a2476

1. Rename memleak_enter_allocations to memleak_find_allocations.
2. Unify scanning for pointers into memleak_remove_region / memleak_remove_pointer.
3. Document the functions.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

topo: Do not keep txids in memory indefinitely

d17a3c5

I mistakenly assumed the block would be freed after processing completed. That is not true since chaintopology keeps headers and stubs around for reorgs. So we need to remove the precomputed txids along with the full_txs.
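A minimal sketch of the idea in this commit message, assuming a block record that outlives its transactions: when the full transactions are dropped, the precomputed txids go with them, while the header data needed for reorgs stays. `struct toy_block` and its helpers are hypothetical; only the field names `full_txs` and `txids` come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical long-lived block record. */
struct toy_block {
	unsigned int height;		/* kept around for reorg handling */
	void **full_txs;		/* parsed transactions, dropped early */
	unsigned char (*txids)[32];	/* precomputed ids, one per tx */
	size_t num_txs;
};

/* Populate the per-transaction arrays for n transactions (n > 0). */
static void toy_block_fill(struct toy_block *b, size_t n)
{
	assert(n > 0);
	b->num_txs = n;
	b->full_txs = calloc(n, sizeof(*b->full_txs));
	b->txids = calloc(n, sizeof(*b->txids));
	assert(b->full_txs && b->txids);
}

/* Drop both arrays together so neither lingers for the lifetime of the
 * block record; the header fields are left untouched. */
static void toy_block_drop_txs(struct toy_block *b)
{
	free(b->full_txs);
	b->full_txs = NULL;
	free(b->txids);
	b->txids = NULL;
	b->num_txs = 0;
}
```

Keeping the two frees in one helper makes it harder to add another per-transaction cache later and forget to release it.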

rustyrussell force-pushed the guilt/notleak-more-precise branch from f9721ff to d17a3c5 Compare September 23, 2020 01:37

rustyrussell merged commit 1cb527d into ElementsProject:master Sep 23, 2020