Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use wtxid for transaction relay #18044
Use wtxid for transaction relay #18044
Changes from all commits
2b4b90a
c7eb6b4
60f0acd
08b3995
85c78d5
144c385
8e68fc2
ac88e2e
2d282e0
46d78d4
97141ca
4eb5155
dd78d1d
9a5392f
8d8099e
cacd852
0e20cfe
0a4f142
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
Large diffs are not rendered by default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
d0f03e9
Is this rule a best-effort right now? At tx reception, we may have to request parent based only on the input. If so is the BIP clear enough on the 5) ? Precise also there is no current sanction of the rule by receiver as of today
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you made a good observation -- our fetching of parents of orphan transactions must be by txid, rather than wtxid. I did try to carefully word the BIP to avoid referencing this situation:
So the BIP says that newly announced transactions must be by wtxid. And if your peer announces a transaction by wtxid, you must fetch it by wtxid (and not txid). The BIP is silent on requests for transactions that are not announced -- we could make this explicit but this has never been documented behavior, and I'm not sure we should commit to a certain handling of this kind of thing anyway... But open to suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would amend the BIP to make its scope explicitly restrained. Add to 5) "Note: this rule doesn't cover cases where a peer needs to request transaction for validation purposes but don't know its wtxid yet (e.g missing parents of a wtxid-announced child)" ? At least to avoid someone raising same concern in the future and saying we don't commit on anything wrt to this issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
54d58e7
3 new pointers for
boost::multi_index::tag<index_by_wtxid>, mempoolentry_wtxid, SaltedTxidHasher
, not one for newboost::multi_index::hashed_unique
?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ariard I don't understand this comment (1c382b5#r385362270).
I think 3 pointers for an extra index is probably close to correct, but #18086 may make these kinds of heuristics unnecessary, so that's probably a better approach long term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this change is necessary (or desirable). I guess it was prompted by this comment: #18038 (comment) , but storing this as a map from txid->wtxid is unnecessary and confusing, since it obfuscates the purpose of this data, which is a set of transactions that may need to be rebroadcast.
The only place that the wtxid is read from this map is in
ReattemptInitialBroadcast()
, at which point we can fetch the transaction from the mempool (we very nearly do that withm_mempool.exists()
), and get the wtxid from there.cc @amitiuttarwar @ariard
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My mistake here, effectively you need to announce transactions from the unbroadcast set by wtxid to upgraded peers but as you point out it doesn't mean we need to cache them as a solution.
Maybe we should precise unbroadcast set semantic wrt same txid, different wtxids, we only guarantee reattempt to the best mempool candidate known for a given txid at the time we call
ReattemptInitialBroadcast
, not insertion (AddUnbroadcastTx
). #18044 (comment) isn't an issue.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really think this "same txid, different wtxid" thing is a complete red herring. The mempool can only ever be aware of one witness for a transaction, so any attempt to announce the transaction via a different wtxid would fail anyway. As Suhas pointed out earlier (#18044 (comment)), to support tracking multiple witnesses we'd need to make significant changes many of our components. Besides, the unbroadcast set is transactions that we created ourselves, so we're not expecting the witnesses to change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree with @jnewbery. There is no real way that we can reasonable start tracking multiple witnesses for the same transaction (either in the mempool or the wallet, ...), so unless this map's storing of wtxid is needed for efficiency, it seems just a set of txid should be sufficient here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😮 wow that's a great point @jnewbery. thank you!!
I thought during implementation I ran into something that prevented me from exclusively having txids on the set, but that evaluation is simply wrong. I've taken a first pass at reverting unbroadcast to a set here: amitiuttarwar@f51cccd. I'll need to revisit with fresh eyes and then will PR.
I agree set makes more sense than a map- its simpler, communicates the intent better, and there's no noticeable efficiency gain with the map. Since we check the transaction is the mempool, the difference between map vs set is the difference between calling
CTxMemPool::get
vsCTxMemPool::exists
. This boils down tomapTx.find(hash)
vsmapTx.count(hash)
, which both have log(n) complexity (according to the boost docs I found).There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amitiuttarwar mapTX is a multiindex and we use a hashed index not a ordered index so it's O(1) lookup. Not sure exactly how mapTX relates to m_unboardcast_txids though...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amitiuttarwar that patch looks correct from a quick skim. I'll review fully once you open a PR.
As @JeremyRubin points out, we're using a
boost::multi_index::hashed_unique
index for the wtxid index, which has constant time search/retrieval.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reverted to set in cb79b9d, #19879
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, I think it would be clearer if this was combined with
GetIter()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Add a wtxid-index to the mempool"
I think this would be a nice improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarification: does moving this out of the try block have any behavior change? Is it possible that a semi-corrupt unbroadcast_txids would partially deserialize and then have only some entries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the try block was introduced to specifically catch the error when upgrading around not having an unbroadcast set written to disk. so I think having the initialization & iteration separate actually captures that intent better.
but lmk if you think of any specific ways this logic could be problematic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest: clearing unbroadcast_txids here in case has garbage data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking into this- is there really a way that the CAutoFile
>>
operator could cause a silent/partial failure? as I mentioned previously the idea of this try catch is just for a smooth upgrade. To elaborate further: so we don't printFailed to deserialize mempool data on disk: %s. Continuing anyway."
when everything actually went fine. the current intention is to remove this in 0.22, so if there's a possibility forunbroadcast_txids
to have garbage data written, I'd like to think this through more carefully to find a long term solution.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I believe so? Fortunately for std::map we deserialize an element and then insert it into the map, but is a possibility that we say that there should be 10 elements and there are only 5 elements, or there are 5 elements and their are actually 10. In the event the file is corrupt, we should likely treat it as if the entire thing is corrupt, rather than continuing to process.
Perhaps we should add an exception to throw from Unserialize if there are not the correct amount of entries?
I'm not sure to the degree we need to worry about files on our own local being corrupt, and there are other forms of corruption that can occur. But because it is a behavior change, I'm noting it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to be clear the issue only comes up when you don't have enough elements or when you have a half element presently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this correct? Should this transaction get queued somewhere for rebroadcasting? Or is it a known precondition that because it failed to get into the mempool before this line it will fail again?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't think of why we'd want it in the unbroadcast if it didn't make it back in the mempool, but just double checking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the unbroadcast set should always be a subset of the mempool. it just maintains txids, so we don't have the capability to do much if its not found.
do you mean re-attempt mempool submission? bc we def wouldn't want to broadcast or rebroadcast a transaction we don't have in our mempool!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes. it's possible that we may want to resubmit it to the mempool if it was previously in our mempool. What changed to cause it to no longer be accepted?