Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Miniscript iterators #70

Merged
merged 3 commits into from
Oct 1, 2020
Merged

Miniscript iterators #70

merged 3 commits into from
Oct 1, 2020

Conversation

dr-orlovsky
Copy link
Contributor

I'll start with this sample one for concept (n)ACK and than will work on other types (like public keys)

@apoelstra
Copy link
Member

I think this is useful but dislike the approach - you should be able to do this with space proportional to the depth of the script, rather than the number of keys.

@dr-orlovsky
Copy link
Contributor Author

Hm, not sure I get why it uses space proportional to the number of keys (and I assume by "keys" you mean "nodes" here, not "public keys", iterator for which is not yet implemented)...

As far as I see, the maximum size of the internal stack for MiniscriptIter is proportional to the depth of the script tree multiplied by a branching factor, which is 1...3 (statistically more likely to be 0.5-1.5).

@apoelstra
Copy link
Member

Ah, I think you are right - but I'd still prefer that you get rid of the branching factor.

@dr-orlovsky
Copy link
Contributor Author

Ok, I'll do that. BTW maybe you have at hand some complex Miniscript samples so I can use them in test cases?

@dr-orlovsky
Copy link
Contributor Author

I have refactored it: iterator structure "remembers" (i.e. keeps in stack) only upsteam nodes and their direct children on other (than the current) branches. Don't know how it is possible to optimize its further...

@dr-orlovsky
Copy link
Contributor Author

dr-orlovsky commented Jan 24, 2020

Well, I have made it not to store the stack of the expanded branches, but I am not happy with the code and doubt that reducing memory requirement from ~depth*1.5*(pointer_size) to depth*(pointer_size+usize) worth the increase in code complexity. And it seems that storing 8-byte pointer and 8-byte index value for each step of the tree takes more memory than before...

@apoelstra
Copy link
Member

It is not "1.5", it is the maximum fan-out of any node in the proof. This is not a constant. It is indeed worth reducing the asymptotic space usage.

@dr-orlovsky
Copy link
Contributor Author

Ok, so will this version of iterator work? If yes, I will proceed to the other parts of the code

@dr-orlovsky dr-orlovsky changed the title [WIP] Miniscript iterators Miniscript iterators Jan 27, 2020
@dr-orlovsky
Copy link
Contributor Author

I have finished implementing all functionality I was thinking of for AST iterators, public key/hash traversing and replacement and the code is ready for the review. It does not compile for 1.22, but I will work on that after we will be happy about the whole concept.

@dr-orlovsky
Copy link
Contributor Author

@apoelstra can you pls ack/nack concept? I am already relying on this branch in my own code, so it would be nice to know do you plan to adopt this — or it's better to do it other way

@apoelstra
Copy link
Member

oops - concept ACK the API. will try to review today.

@apoelstra
Copy link
Member

Could you rebase this PR so that it has a sensible history?

@dr-orlovsky
Copy link
Contributor Author

Will do it tomorrow, I have to remember it myself :)

@dr-orlovsky
Copy link
Contributor Author

I've summarized it up to six commits, showcasing the process of feature & docs development so it will be easier to review the design logic step by step. Please let me know if you need more brief version — but it will probably mean just a single commit squashing everything together

@dr-orlovsky
Copy link
Contributor Author

Added unit tests and fixed issues found by tests. Also have done rebase on master

@apoelstra
Copy link
Member

Looks like the use of impl Trait is preventing this from working on 1.22

@dr-orlovsky
Copy link
Contributor Author

Fixed. Build failed on stable, some strange fuzz crash, can't figure out what's that

@dr-orlovsky
Copy link
Contributor Author

Build failed on stable, some strange fuzz crash, can't figure out what's that

Seems the same story as #89

@kiminuo
Copy link
Contributor

kiminuo commented May 26, 2020

I think I have figured out the fuzz crashes:

https://github.com/apoelstra/rust-miniscript/blob/96105bb45c6efc60d0bac41866e4786b2276ed67/src/miniscript/astelem.rs#L366

This match is easy to mess up later and that's what actually one can see in the following code listing:

https://github.com/apoelstra/rust-miniscript/blob/96105bb45c6efc60d0bac41866e4786b2276ed67/src/miniscript/astelem.rs#L431-L432

for thresh and similarly for multi. Meaning there is no check that there are actually any arguments but at least one is expected.

Edit: https://github.com/kiminuo/rust-miniscript/tree/feature/2020-05-multi-thresh-params we will see whether this will fix it.

Edit 2: Ok, so there are more cases than that. Will fix in coming days.

@dr-orlovsky
Copy link
Contributor Author

Now, with the fuzz problem being figured out, I did a rebase, so the thing can be reviewed once again

@dr-orlovsky dr-orlovsky requested a review from apoelstra June 19, 2020 12:58
Copy link
Member

@sanket1729 sanket1729 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack(modulo the small changes requested) and tested locally. Everything looks good. I left some feedback about the PR.

Looks like the last commit for 1.22 fix created a fuzzy history. With a clean history, I think this is good to go

@dr-orlovsky dr-orlovsky requested a review from sanket1729 July 21, 2020 17:30
@dr-orlovsky
Copy link
Contributor Author

Can you guys re-review the request: all the issues were fixed and it compiles wrll

Copy link
Member

@sanket1729 sanket1729 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @dr-orlovsky, thanks for looking into this. The code looks great and since this is an API change I will also wait for an ACK from @apoelstra about the PR before merging.

tACK e818822.

// Module is public since it export testcase generation which may be used in
// dependent libraries for their own tasts based on Miniscript AST
#[cfg(test)]
pub mod test {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have never seen tests exported publicly. I don't know how idiomatic this is to rust crates. My thinking is that downstream crates must implement their own tests and this should not be a public module. cc @apoelstra .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests are not public, public are fns that generate dumb data. I use them in the downstream dependencies for instance. They are quite common (generation of pubkeys)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine for a project like this, though we'll have to be mindful about changing these functions. I'm not sure if it's a "breaking change" to break an API gated behind #[cfg(test)] but we should at least make an effort not to do it.

sanket1729
sanket1729 previously approved these changes Jul 23, 2020
@sanket1729 sanket1729 self-requested a review July 23, 2020 00:26
@sanket1729 sanket1729 dismissed their stale review July 23, 2020 00:31

misclicked

@dr-orlovsky
Copy link
Contributor Author

Plain rebase onto master unfortunately is quite complex to get working: most of the commits need to be re-written from scratch for that, since master had significantly diverged (new generic type parameter was introduced for instance). So I have to either merge master into this branch again - or just re-implement the iterators from scratch on top of the current master...

@sanket1729
Copy link
Member

sanket1729 commented Jul 30, 2020

Hi @dr-orlovsky , can you try the following:

Make sure your local master is updated to upstream master. Below, ms_iter is the name of local branch that has the code.

sanket1729@mypc:~/rust-miniscript$ git merge-base master ms_iter
e98989c4c1d4140f755dfaf09f0297ba50fc0243
sanket1729@mypc:~/rust-miniscript$ git reset --mixed e98989c4c1d4140f755dfaf09f0297ba50fc0243
Unstaged changes after reset:
M	src/miniscript/mod.rs
sanket1729@mypc:~/rust-miniscript$ git status
On branch ms_iter
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
	modified:   src/miniscript/mod.rs

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	src/miniscript/iter.rs

no changes added to commit (use "git add" and/or "git commit -a")
sanket1729@mypc:~/rust-miniscript$ git diff src/miniscript/mod.rs
diff --git a/src/miniscript/mod.rs b/src/miniscript/mod.rs
index e4a027e..ca275a8 100644
--- a/src/miniscript/mod.rs
+ b/src/miniscript/mod.rs
@@ -36,6 +36,7 @@ pub use self::context::Segwitv0;
 pub mod astelem;
 pub(crate) mod context;
 pub mod decode;
+++ pub mod iter;
 pub mod lex;
 pub mod satisfy;
 pub mod types;
sanket1729@mypc:~/rust-miniscript$ git add src/miniscript/iter.rs

@dr-orlovsky
Copy link
Contributor Author

Done with a single commit: the functionality is quite holistic and there is no reason to split it across multiple commits.

/// then proceeds to the child of the child — down to a leaf of the tree in its first branch.
/// When the leaf is reached, it goes in the reverse direction on the same branch until it
/// founds a first branching node that had more than a single branch and returns it, traversing
/// it with the same algorithm again.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble understanding this comment. Do you just mean that it does an in-order traversal of the entire tree?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is what was meant

| &DupIf(ref node)
| &Verify(ref node)
| &NonZero(ref node)
| &ZeroNotEqual(ref node) => vec![node.deref()],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does .deref() do here? Why not just use *?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using deref below in line 77 to do dereferencing inside an interator (node_vec.iter().map(Arc::deref).collect()), so for consistency I've used it as a function here. Not a problem to correct.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the commend below, either deref nor * is not required anymore, will push new version

/// To obtain a list of all public keys within AST use [`iter_pk()`] function, for example
/// `miniscript.iter_pubkeys().collect()`.
pub fn get_leaf_pk(&self) -> Vec<Pk> {
match self.node.clone() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we unconditionally clone here even though for most nodes we just throw away the result?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's also unclear to me what the value is in making this method be public.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Function is public since it may be useful. It is different from iterator methods: iterators traverse the tree, while this function returns keys only from the current node.


fn next(&mut self) -> Option<Self::Item> {
if self.keys_buff.is_empty() {
self.keys_buff = VecDeque::from(loop {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like all this "compute a Vec, convert to VecDeque, repeatedly pop" logic could also be replaced with a counter and a get_nth_pkh method, eliminating every allocation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored as you've suggested

@dr-orlovsky
Copy link
Contributor Author

dr-orlovsky commented Sep 1, 2020

@apoelstra I made all requested changes and re-factored iterators without allocations. Also have re-based on master

Terminal::PkK(key) => vec![key.to_pubkeyhash()],
Terminal::Multi(_, keys) => keys.iter().map(Pk::to_pubkeyhash).collect(),
match self.node {
Terminal::PkH(ref hash) => vec![hash.clone()],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hashes are Copy, no need for ref or clone

Copy link
Contributor Author

@dr-orlovsky dr-orlovsky Sep 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hashes are, but not MiniscriptKey::Hash. This does not compile, I either have to leave it as is, or add Hash: Copy type in MiniscriptKey trait

path: Vec<(&'a Miniscript<Pk, Ctx>, usize)>,
// Here we store vec of path elements, where each element is a tuple, consisting of:
// 1. Miniscript node on the path
// 2. It's branches stored as a vec (used for avoiding multiple vec allocations
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wrong "It's"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also there is no need to do any Vec allocations here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No allocations now. Also, removed branches fn as you asked.

@dr-orlovsky
Copy link
Contributor Author

dr-orlovsky commented Sep 18, 2020

Travis fails only for 1.22, not 1.31, due to ancient syntax requirements. So I think this can be ignored.

UPD: Fixed; I missed that we are still on 1.22

| (1, Terminal::AndOr(_, ref node, _))
| (2, Terminal::AndOr(_, _, ref node)) => Some(node),

(n, Terminal::Thresh(_, ref node_vec)) => node_vec.get(n).map(Deref::deref),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A minor nit - I think it's more idiomatic (and a bit shorter) to write |x| **x rather than Deref::deref

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed this with latest force-push. But it seems to me |x| &**x looks really stranger than Deref :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the &?

Copy link
Contributor Author

@dr-orlovsky dr-orlovsky Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

B/c it does not compile (at least on 1.22) without: we return Option<&Miniscript>, not just option

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you meant |x| *x rather than |x| &**x

Copy link
Contributor Author

@dr-orlovsky dr-orlovsky Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It really takes &** to get from &Arc<Miniscript> (which we need to double-deref first with **) to &Miniscript

Copy link
Contributor Author

@dr-orlovsky dr-orlovsky Sep 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you meant |x| *x rather than |x| &**x

Nope, it does not compile that way.

*x = *&Arc = Arc

On the other way, Deref works like a magic even with 1.22: It is directly applied to Arc type even behind &Arc, resulting in &Miniscript

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Proof:

error[E0308]: match arms have incompatible types
   --> src/miniscript/iter.rs:81:9
    |
81  | /         match (n, &self.node) {
82  | |             (0, &Terminal::Alt(ref node))
83  | |             | (0, &Terminal::Swap(ref node))
84  | |             | (0, &Terminal::Check(ref node))
...   |
107 | |             _ => None,
108 | |         }
    | |_________^ expected reference, found struct `std::sync::Arc`
    |
    = note: expected type `std::option::Option<&miniscript::Miniscript<Pk, Ctx>>`
               found type `std::option::Option<std::sync::Arc<miniscript::Miniscript<Pk, Ctx>>>`
note: match arm with an incompatible type
   --> src/miniscript/iter.rs:105:56
    |
105 |             (n, &Terminal::Thresh(_, ref node_vec)) => node_vec.get(n).map(|x| *x),
    |                                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I forgot we had Arcs in there.

Copy link
Member

@apoelstra apoelstra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack b836032

Thanks for the many iterations!

@apoelstra apoelstra merged commit cda6b5e into rust-bitcoin:master Oct 1, 2020
@sanket1729 sanket1729 mentioned this pull request Oct 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants