A PR by theStack to add test coverage for the dust mempool policy. Participants discussed the approach of the test, the importance of clear and maintainable tests and the benefits of unit tests. They also discussed the concept of dust as an anti-DoS measure, the potential effect of P2TR on dust, the possibility of changing the default value of -dustrelayfee
in the future, and the existence of standard but unspendable output scripts. They also explored the idea of considering transactions that don't meet the dust threshold as invalid.
What would be a common reason to NACK a PR that adds testing coverage? 🔗
-
Tests require maintenance too. A test should be clear enough such that, in the future, if it fails, we know what's gone wrong in the code. If the test makes a very narrow requirement for the result, then it can break in the future when there's not really anything wrong.
-
Another reason for NACKing could be if the suggested test could make more sense as a unit test instead of a functional test. We'll always have the functional tests, but the more that can be tested in unit test the better:
- when something fails in a unit test, it's often much easier to narrow down where the problem is, because you're not running as much code.
- functional tests run one or more full nodes, and there are more chances for false failures due to timing windows or things like, shutting down and restarting nodes being less reliable.
- unit tests can run MUCH faster than functional tests for a given amount of actual testing as you don't have the delays in starting up the nodes, etc.
- For example, in an earlier version of this PR the test used multiple nodes, one for each config options (like it's currently done e.g. in
mempool_datacarrier.py
), and even with a small number of nodes, the test took significantly longer to be setup, so it was changed to one node that is just restarted.
- For example, in an earlier version of this PR the test used multiple nodes, one for each config options (like it's currently done e.g. in
- also related is making code changes that make it possible to test with unit tests. The refactoring effort in the P2P layer has this as one of its goals. see #25515 - Virtualise CConnman and add initial PeerManager unit tests.
-
On the topic of tests, sometimes keeping them extremely simple is is better than being efficient. This is an example, where the sequence of tests could have been written with some clever loop, but then the reader would have to understand how the loop works and confirm that it's correct. The way it's actually written make it SO obvious.
How does the test work? What is its approach? 🔗
- It tests various combinations. It creates all posible Script types including a couple future SegWit versions, and try each of this scripts on a list of
-dustrelayfee
arbitrary settings, including the default of-dustrelayfee
of 3000 sats/kvB.
Why is the concept of dust useful? What problems might occur if it didn’t exist? 🔗
- Similar to many other things in Bitcoin Core, the dust limit is an anti-DoS measure.
- Dust could be used as an attack vector by enlarging the utxo set, as dust has to be accounted for by fullnodes, and if there are 50million dust outpusts to account for, then fullnodes get...tired. So we preempt the attack by creating the idea that there are transactions that are too small to be valid.
bonus: Thoughts on the concept of dust becoming less useful as P2TR gets more widely adopted (more folks using scripts and the cost to spent the output being largely unknown)? 🔗
- P2TR is no different to P2SH or P2WSH, in the sense that script is still hashed until you come to spend it. The difference is that with P2TR you can bring along complex scripts "for free" (keyspend case) and since the dust caclulation has to sort of "guess" at how big the future spending input will be, with P2TR, that's gets harder.
- For more information on taproot, key-path and script-spends see the schorr-taproot-workshop.
The P2TR output script to test is created with pubkey[1:]. What does this expression do and why is this needed? 🔗
- This expression list the bytes object skipping the first byte, since
pubkey
is a compressed public key, the first byte indicatex02=even
,x03=odd
. This piece of data plus thex
coordinate (bytes 2 to 33) is used to calculatey
in compressed keys. - P2TR uses x-only-pubkeys, therefore the first byte is not needed.
- Theoretically the
output_key_to_p2tr_script
helper could be adapted to accept both legacy and x-only-pubkeys, by looking at size, but it's not sure if we would really need it that often.
What is the difference between valid transaction and a non-standard transaction? 🔗
- Dust transactions are technically still valid and can be mined into valid blocks. However these transactions are considered non-standard, which means that they are not relayed to other nodes.
- Nothings stops a "malicious" miner from accepting dust, but by not forwarding, it's much less likely that a miner will ever see transactions with dust outputs.
Would it be better if transactions that don’t meet the dust treshhold were considered invalid? 🔗
- One possible objection to this, is that if we make dust part of consensus, then we could never lower it later, because that would be relaxing a rule (tx that were previously illegal, now legal), which would be a hardfork. We could raise the dust limit, that would be a softfork.
Can you see a future scenario where we’d want to change the default value of -dustrelayfee
? Would it more likely be increased or decreased? What does this depend on and which other configuration options would then also very likely be adapted? 🔗
- If in the future blocks are more or less constantly full with a minimum fee-rate of hundreds of sats/vbyte, then that would a reason to increase the default dust-limit In that scenario, the
incrementalrelayfee=
option might need to change also.- Fun fact: There was a period of high fees in 2017 and due to poor UTXO management, Coinbase had a bunch of small value UTXOs that were worth less than the fee to spend them. Those were conslidated a few months later.
- Another point that complicates the dust idea is that theres a lot of non BTC value stored in BTC outputs. This is due to protocols built on Bitcoin (like counterparty) that allowed issuance of tokens, and some of those tokens could have a high $ value, but be stored in a low BTC value output.
What does the largest possible output script that adheres to standardness rules look like? Is it currently implemented in the functional test? 🔗
key_to_p2pk_script(uncompressed_pubkey)
is on the larger side compared to widely used output scripts, but still far from the largest.- The largest is bare multisig (m-of-n) and specifically m-of-3 as standardness rules only allow up to n=3 and m in m-of-n doesn't matter, as it doesn't change the output script size.
- The current test covers "bare multisig (3-of-3)" but it's not the maximum size as it uses compressed pubkeys. The maximum would be bare "multisig (m-of-3)" with
uncompressed_pubkey
(changed here after this Review Club meeting).
Can you give an example of an output script that is considered standard and is added to the UTXO set (i.e. no null-data), but is still unspendable? Bonus: is there a way to create such an output script where this unspendability can even be mathematically proven? 🔗
- There are "theoretically possible" cases where you can require a signature from a public key that doesn't have an associated private key (maybe if the public key is like 0000000000000...), but you can't mathematically prove that there is no associated private key, therefore prove that it's unspendable.
- Scripts that contain a public key that is not on the curve, are guaranteed to be unspendable. i.e. ones that don't fulfill the secp256k1 y^2 = x^3 + 7 equation. You can't spend those, but you can make them.
- Related interesting discussion on the closed #24106 - treat P2TR outputs with invalid x-only pubkey as non-standard.
- Larry Ruane asked if it would be a good idea to have an RPC to change this dustrelayfee (maybe test-only) in order to avoid node restarts between test cases.
- A similar issue has been opened setminrelayfee#26269 to make a different mempool policy dynamically configurable, for which glozow raised objections.
- There's a whole family of script generation functions such as
key_to_p2pk_script()
that are very interesting to examine as they are very helpful to both the tests and for understanding scripts and output types. This was covered on a previous Review Club meeting #22363 - Usescript_util
helpers for creating P2{PKH,SH,WPKH,WSH} scripts.