Support encoding more dataclass-like things #501

jcrist · 2023-07-31T04:38:56Z

Previously we supported encoding dataclasses (detected as any object with a `__dataclass_fields__` attribute), provided those objects were implemented similarly enough to the standard library's dataclasses. The intent was to support stdlib dataclasses; if alternative implementations (e.g. `pydantic.dataclasses`) happened to work, all the better.

However, because of how we detected whether an object was a dataclass, there was no way to override our builtin support when an alternative implementation (in this case `edgedb.Object`) didn't work.

To fix this, we now make fewer assumptions about how the backing dataclass object is implemented.
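To illustrate what "fewer assumptions" can mean, here is a minimal pure-Python sketch (not the actual implementation, which lives in the library's C extension). `Record`, `is_dataclass_like`, and `to_builtins_sketch` are hypothetical names invented for this example. `Record` advertises `__dataclass_fields__` but stores its values in a tuple, so anything that reads `__dict__` directly would fail, while reading each declared field through `getattr` works.

```python
from dataclasses import dataclass


@dataclass
class Point:
    x: int
    y: int


class Record:
    """Hypothetical dataclass-like object: values live in a tuple, not __dict__."""

    __slots__ = ("_values",)
    # Reuse the stdlib Field objects so the class advertises fields "x" and "y".
    __dataclass_fields__ = Point.__dataclass_fields__

    def __init__(self, x, y):
        self._values = (x, y)

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for the declared fields.
        names = tuple(type(self).__dataclass_fields__)
        if name in names:
            return self._values[names.index(name)]
        raise AttributeError(name)


def is_dataclass_like(obj):
    """Duck-typed detection: anything whose type exposes __dataclass_fields__."""
    return hasattr(type(obj), "__dataclass_fields__")


def to_builtins_sketch(obj):
    """Read each declared field via getattr, assuming nothing about storage."""
    return {name: getattr(obj, name) for name in type(obj).__dataclass_fields__}


point_result = to_builtins_sketch(Point(1, 2))
record_result = to_builtins_sketch(Record(3, 4))
```

Both objects pass the same detection check and produce a plain dict of their declared fields, even though only `Point` has a `__dict__`.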

Pros:

- We can now natively encode objects implemented using `dataclasses`, `pydantic.dataclasses`, and `edgedb.Object`.
- We now only encode fields as declared on the dataclass object. Previously we encoded any attribute lacking a leading underscore, which was efficient and worked well in practice (it's also what `orjson` does). However, this can lead to surprising behavior if some fields intentionally start with an `_` (like `_id` in MongoDB) or if the object makes use of `functools.cached_property`.
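The difference between the two strategies can be sketched with the standard library alone. The helpers `encode_public_attrs` and `encode_declared_fields` below are hypothetical stand-ins for the old and new behavior, not the library's API, and the `Document` class is invented for illustration.

```python
from dataclasses import dataclass, fields
from functools import cached_property


@dataclass
class Document:
    _id: str
    title: str

    @cached_property
    def slug(self):
        return self.title.lower().replace(" ", "-")


def encode_public_attrs(obj):
    """Old-style approach: every instance attribute without a leading underscore."""
    return {k: v for k, v in vars(obj).items() if not k.startswith("_")}


def encode_declared_fields(obj):
    """New-style approach: exactly the fields declared on the dataclass."""
    return {f.name: getattr(obj, f.name) for f in fields(obj)}


doc = Document(_id="abc123", title="Hello World")
doc.slug  # populates the cached_property, which stores "slug" in doc.__dict__

old_style = encode_public_attrs(doc)
new_style = encode_declared_fields(doc)
```

In this sketch the underscore-based approach drops `_id` and leaks the cached `slug` once it has been computed, whereas the field-based approach returns `_id` and `title` regardless of what else ends up in `__dict__`.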

Cons:

- This flexibility and correctness come at a performance cost. The fast path covers the common case (dataclass uses `__dict__`, doesn't override `__getattribute__`), but encoding is now ~20% slower than before. Previously we encoded `__dict__`-based dataclasses 20% faster than `orjson`; now we're faster for small classes and slower for larger numbers of fields (on my machine 12 fields is the crossover point). For `__slots__`-based classes we're still around 2x faster than `orjson`.
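As a rough illustration of the fast-path condition described above, the following pure-Python sketch checks whether an object uses `__dict__` and leaves `__getattribute__` untouched, and falls back to `getattr` otherwise. The real check is done in C and may differ; `can_use_fast_path`, `encode_fields`, and the example classes are assumptions made for this example only.

```python
from dataclasses import dataclass, fields


def can_use_fast_path(obj):
    """True if attributes are stored in __dict__ and lookup is not customized."""
    return (
        type(obj).__getattribute__ is object.__getattribute__
        and hasattr(obj, "__dict__")
    )


def encode_fields(obj):
    names = [f.name for f in fields(obj)]
    if can_use_fast_path(obj):
        # Fast path: read straight from the instance dict, falling back to
        # getattr for any field that is not stored there.
        d = obj.__dict__
        return {n: d[n] if n in d else getattr(obj, n) for n in names}
    # Slow path: go through the full attribute protocol for every field.
    return {n: getattr(obj, n) for n in names}


@dataclass
class Plain:
    x: int
    y: int


@dataclass
class Slotted:
    __slots__ = ("x", "y")
    x: int
    y: int


@dataclass
class Doubling:
    x: int

    def __getattribute__(self, name):
        # Customized lookup: reading __dict__ directly would bypass this.
        value = object.__getattribute__(self, name)
        return value * 2 if name == "x" else value


plain, slotted, doubling = Plain(1, 2), Slotted(3, 4), Doubling(5)
```

Only `Plain` qualifies for the fast path here. `Slotted` has no `__dict__`, and `Doubling` customizes lookup, so both go through `getattr`, which is why the overridden `__getattribute__` is respected in the result for `Doubling`.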

Fixes #495.

TODO:


jcrist mentioned this pull request Jul 31, 2023

Support for "non traditional" dataclasses #495

Closed

jcrist force-pushed the refactor-dataclass-encode branch from c3f3fed to a9f3656 Compare August 2, 2023 03:16

jcrist changed the title ~~WIP: Support encoding more dataclass-like things~~ Support encoding more dataclass-like things Aug 2, 2023

jcrist merged commit 5e1d16f into main Aug 2, 2023

jcrist deleted the refactor-dataclass-encode branch August 2, 2023 03:29
