Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve streamable #3031

Merged
merged 13 commits into from
Apr 30, 2021
Merged

Improve streamable #3031

merged 13 commits into from
Apr 30, 2021

Conversation

richardkiss
Copy link
Contributor

No description provided.

@richardkiss richardkiss marked this pull request as draft April 28, 2021 18:18
@rostislav rostislav force-pushed the improve_streamable branch 2 times, most recently from 8ea1b38 to fc159fe Compare April 29, 2021 16:15
Revert an earlier commit and error out on class creation in case a
Streamable subclass is instantiated incorrectly, e.g. containing a
non-serializable member.

Fix cases where Streamable parent class was forgotten.
@rostislav rostislav force-pushed the improve_streamable branch from fc159fe to e350aae Compare April 29, 2021 16:57
@rostislav rostislav marked this pull request as ready for review April 29, 2021 22:58
aqk
aqk previously approved these changes Apr 29, 2021
Copy link
Contributor

@aqk aqk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. One thing I don't understand in the first question.


def parse_bool(f):
bool_byte = f.read(1)
assert bool_byte is not None and len(bool_byte) == 1 # Checks for EOF
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will these asserts activate in any run mode, or is it like C, where they can be deactivated in some run modes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assert is removed in python -OO mode, which we don't do. So we shouldn't use assert for flow control. However, this problem was here before this change, so it doesn't make these semantics any worse.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These asserts are important. If we don't have them, people can Deserialize a bytes object which contains extra data.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the bool example, read(1) which returns empty would also succeed, so we need to check that the exact size was read, any time we do a read.

bytes_read = f.read(list_size)
assert bytes_read is not None and len(bytes_read) == list_size
return bytes_read
return parse_bytes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section is nice.

def coin_record_from_row(self, row: sqlite3.Row) -> WalletCoinRecord:
coin = Coin(bytes32(bytes.fromhex(row[6])), bytes32(bytes.fromhex(row[5])), uint64.from_bytes(row[7]))
return WalletCoinRecord(
coin, uint32(row[1]), uint32(row[2]), bool(row[3]), bool(row[4]), WalletType(row[8]), row[9]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the type of row[9] - I notice it is the only one not explicitly cast

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a comment with the schema of this table would be very helpful, but I suppose this is not quite in scope for this PR

G1Element.from_bytes(bytes.fromhex(row[1])),
row[3],
row[4],
WalletType(row[3]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WalletType: nice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this broken before?

fields = {}

for _, f_type in fields.items():
parse_functions.append(cls.function_to_parse_one_item(f_type))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the intent here to automatically add parse functions for only the primitive types that are contained as members?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general idea is to traverse the __annotations__ tree just once at class instantiation time, and put into a list the sub-parsing functions which do parsing and not introspection.

Copy link
Contributor

@arvidn arvidn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this approach is a lot nicer than the previous one. It especially makes it easy to test the parser for each primitive type, and really cover all the edge cases. I really think we should do that. This code has better work, everything else relies on it.

I would also have expected to see tests where we ensure that some (or maybe all) network messages encode to specific bytes. Just passing a round-trip test isn't enough to ensure we don't break the protocol.

I'll take a look at adding some tests today.

@@ -74,7 +74,7 @@ class NewEndOfSubSlotVDF(Streamable):

@dataclass(frozen=True)
@streamable
class RequestCompactProofOfTime:
class RequestCompactProofOfTime(Streamable):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect the annotation could add the base class too, no? (maybe for later)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The annotation could do the dataclass wrapping as well, but the comment in def streamable states that this is intended for linters so they could recognize the methods of the base class as well as the __init__ method provided by dataclass.

def coin_record_from_row(self, row: sqlite3.Row) -> WalletCoinRecord:
coin = Coin(bytes32(bytes.fromhex(row[6])), bytes32(bytes.fromhex(row[5])), uint64.from_bytes(row[7]))
return WalletCoinRecord(
coin, uint32(row[1]), uint32(row[2]), bool(row[3]), bool(row[4]), WalletType(row[8]), row[9]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a comment with the schema of this table would be very helpful, but I suppose this is not quite in scope for this PR

row[0],
bytes.fromhex(row[2]),
uint32(row[0]),
bytes32.fromhex(row[2]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we really store binary data in hex form in the database?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we currently do. See add_derivation_paths() in the same file.

return bytes.decode(str_read_bytes, "utf-8")
raise RuntimeError(f"Type {f_type} does not have parse")
return parse_str
raise NotImplementedError(f"Type {f_type} does not have parse")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this message doesn't look right to me. Shouldn't it say "Type {f_type} does not have from_bytes()"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, parse() is the main way of supporting the streamable protocol

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently from_bytes() is used only for sized integer types. All other types that have custom serialization need to provide their own .parse() function.

@rostislav
Copy link
Contributor

I performed the following benchmark on old and new Streamable to compare their performance: parsing the first 10000 mainnet blocks. The old result: 1.39s; the new result: 1.08s. The speed improvement is 23%.

@@ -76,6 +76,7 @@ class RejectBlock(Streamable):


@dataclass(frozen=True)
@streamable
class RequestBlocks(Streamable):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How did this work before without streamable decorator?


def parse_bool(f):
bool_byte = f.read(1)
assert bool_byte is not None and len(bool_byte) == 1 # Checks for EOF
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These asserts are important. If we don't have them, people can Deserialize a bytes object which contains extra data.


def parse_bool(f):
bool_byte = f.read(1)
assert bool_byte is not None and len(bool_byte) == 1 # Checks for EOF
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the bool example, read(1) which returns empty would also succeed, so we need to check that the exact size was read, any time we do a read.

fields = {}
for _, f_type in fields.items():
values.append(cls.parse_one_item(f_type, f)) # type: ignore
values = [parse_f(f) for parse_f in PARSE_FUNCTIONS_FOR_STREAMABLE_CLASS[cls]]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't remember why I had a try clause here, maybe it was to support Empty streamables? (for example RequestPeers) message. Have we confirmed that those still work?

G1Element.from_bytes(bytes.fromhex(row[1])),
row[3],
row[4],
WalletType(row[3]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this broken before?

@mariano54
Copy link
Contributor

Nice work. If nothing is broken with the wallet, I think we can probable merge it (and if arvids tests pass)

Copy link
Contributor

@mariano54 mariano54 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been running the spend bot with it, no issues

@wjblanke wjblanke merged commit b084813 into main Apr 30, 2021
@wjblanke wjblanke deleted the improve_streamable branch April 30, 2021 17:22
wjblanke added a commit that referenced this pull request Apr 30, 2021
* Added new issue templates (#3033)

* added new issue templates for support and requestes

* spelling is hard...

* Add lock, keep cache consistent (#3051)

* execute task decorator

* use blockchain lock

* indentation

* lint

* execute_task

* Ms.empty blocks3 (#3064)

* Fix empty blocks

* Remove error log

* Missing imports

* .header_hash instead of .prev_hash in simulator

* Fix typo (#3069)

* Use secure coin ids (#3095)

* Add announcements for standard tx (#3023)

* changed bahviour of wallet and wallet tools for standard solutions

* black formatting wallet_tools

* added a test for hypothetical stolen zero_output coin

* swap to sha256 from sha256tree for message

* fix wallet_tools version, address lint issues

* correctly int_from_bytes

* fix uninstantiated key in dict

* Comment out broken test

* Fix types (used SerializedProgram)

Co-authored-by: Mariano <sorgente711@gmail.com>

* cache VDF validation results (per process) (#3110)

* Update GUI for plot refresh issue (#3135)

Co-authored-by: Adam Kelly <aqk@aqk.im>

* remove commit and not needed query (#3148)

* Copyright 2021 Chia Network in LICENSE (#3153)

* just filter limit (#3152)

* fix plot dupe (#3154)

* increase ratio (#3155)

* wait to end of process to avoid zombies (#3160)

* fix takes 1 positional argument but 2 were given (#3202)

* Update CLI tools to use EiB when appropriate. (#3117)

* Improve streamable (#3031)

* Avoid importing `test_constants` as it takes a long time.

* Factor out `parse_*` functions.

* First crack at refactoring `Streamable.parse`.

* Don't add `_parse_functions` attribute to `Streamable`.

This no longer requires an extra `_parse_functions` attribute on a
`Streamable`, as it may be confusing serializers or other functions
that use `__annotations__`.

* Fix lint problems with `black`.

* Fix `parse_tuple`.

* Defer some parsing failures to parse time rather than class-creation time.

* Tidy up & remove some obsolete stuff.

* Decorate `RequestBlocks` as `streamable`.

* Fix wrong uses of Streamable class

Revert an earlier commit and error out on class creation in case a
Streamable subclass is instantiated incorrectly, e.g. containing a
non-serializable member.

Fix cases where Streamable parent class was forgotten.

* Fix wrong types when creating DerivationRecord and WalletCoinRecord

* additional unit tests for streamable parsers

* add type annotations (#3222)

Co-authored-by: Rostislav <rostislav@users.noreply.github.com>
Co-authored-by: arvidn <arvid@libtorrent.org>

* Tests skipping mempool (#3065)

* Avoid importing `test_constants` as it takes a long time.

* Some new tests that circumvents mempool.

* Fix lint problems.

Co-authored-by: J Eckert <sargonas@users.noreply.github.com>
Co-authored-by: Yostra <straya@chia.net>
Co-authored-by: Mariano Sorgente <3069354+mariano54@users.noreply.github.com>
Co-authored-by: Antonio Borrero Granell <me@antoniobg.com>
Co-authored-by: matt-o-how <48453825+matt-o-how@users.noreply.github.com>
Co-authored-by: Mariano <sorgente711@gmail.com>
Co-authored-by: Arvid Norberg <arvid@libtorrent.org>
Co-authored-by: Adam Kelly <338792+aqk@users.noreply.github.com>
Co-authored-by: Adam Kelly <aqk@aqk.im>
Co-authored-by: Kyle Altendorf <sda@fstab.net>
Co-authored-by: Joel <34619326+joelcho@users.noreply.github.com>
Co-authored-by: Gene Hoffman <30377676+hoffmang9@users.noreply.github.com>
Co-authored-by: Richard Kiss <him@richardkiss.com>
Co-authored-by: Rostislav <rostislav@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants