Cannot deserialize Uuid #203
Comments
Thanks a lot for the detailed report. That's super helpful. I will have a look and figure out what's going on! |
From a brief look through, I think this is what is going on: to get the FieldRef, from_type works out the serde internal data model for the type and then calls deserialize on it to obtain a default representation for serde_arrow to use. The internal data model for Uuid is String, though, so it deserializes a default string, which is not valid for a Uuid. Actually, it might be deserializing as a borrowed str, so it's not even calling So I think I can work around this by getting the field types using |
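To make the failure mode described above concrete, here is a minimal std-only sketch. It does not use serde, serde_arrow, or the uuid crate: `FakeUuid`, `parse_fake_uuid`, and `trace_string_field` are hypothetical stand-ins, and the assumption that the placeholder value is an empty string is taken from the discussion in this thread rather than from the serde_arrow source.

```rust
// Hypothetical stand-in for a UUID type whose serde representation is a string.
#[derive(Debug, PartialEq)]
struct FakeUuid([u8; 16]);

// Parse a hyphenated UUID string such as "67e55044-10b1-426f-9247-bb680e5fe0c8".
// This sketch is lenient about where the hyphens sit; it only requires
// exactly 32 hex digits once the hyphens are removed.
fn parse_fake_uuid(s: &str) -> Result<FakeUuid, String> {
    let hex: String = s.chars().filter(|c| *c != '-').collect();
    if hex.len() != 32 || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return Err(format!("expected 32 hex digits, found {:?}", s));
    }
    let mut bytes = [0u8; 16];
    for (i, byte) in bytes.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).map_err(|e| e.to_string())?;
    }
    Ok(FakeUuid(bytes))
}

// Assumption: a type-tracing deserializer sees "this field is a string" and
// hands the type a placeholder string (here the empty string) to build a value.
fn trace_string_field() -> Result<FakeUuid, String> {
    parse_fake_uuid("")
}
```

Calling `trace_string_field()` returns an `Err`, because the placeholder is a valid string but not a valid UUID, while `parse_fake_uuid` succeeds on a well-formed value. That mirrors the reported behaviour: the tracing step knows the type only as a string and cannot know which strings the type accepts.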
FYI, using One drawback (this may in fact be totally separate and not specific to using
While not a major issue at all, it would be nice if somehow it deserialized into a python UUID object. |
@raj-nimble Thanks for the investigation. From what I understand, your explanation is spot on. I am afraid, though, that there is not much that can be done about it. Re. serializing the type as UUID: it seems there is a canonical extension type for UUID. It would definitely make sense to add some functionality to simplify dealing with UUIDs. It is currently not supported by pyarrow, though. Oh, and polars does not currently support extension types, so the resulting files would not be readable from polars, as far as I understand. Finally, thanks to your report, I also realized that the "human_readable" flag is set inconsistently throughout the crate. I think it should be false throughout, as I would expect this to result in smaller data files. In your case (UUIDs), this would result in the data being serialized as I think the following changes would make sense:
|
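As an illustration of why the "human_readable" flag affects file size for UUIDs, the sketch below compares the two representations in plain std Rust. It assumes that a UUID is written as a 36-character hyphenated string in human-readable mode and as its 16 raw bytes otherwise; `to_hyphenated` is a hypothetical helper, not part of serde_arrow or the uuid crate.

```rust
// Hypothetical helper: render 16 raw bytes in the hyphenated 8-4-4-4-12 form.
fn to_hyphenated(bytes: &[u8; 16]) -> String {
    let mut out = String::with_capacity(36);
    for (i, b) in bytes.iter().enumerate() {
        // Hyphens go before byte 4, 6, 8 and 10.
        if matches!(i, 4 | 6 | 8 | 10) {
            out.push('-');
        }
        out.push_str(&format!("{:02x}", b));
    }
    out
}

// Size in bytes of the two representations of one UUID value:
// (human-readable string, compact byte array).
fn representation_sizes(bytes: &[u8; 16]) -> (usize, usize) {
    (to_hyphenated(bytes).len(), bytes.len())
}
```

For any UUID, `representation_sizes` yields 36 bytes for the string form against 16 bytes for the raw form, which is the size difference the comment above alludes to. The trade-off is that the raw bytes are not directly readable as a UUID when the file is inspected.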
I added a warning on using Thanks a lot for bringing this issue to my attention! |
Thanks @chmp for the follow up. |
Small follow up to UUID issues, if you serialize/deserialize with the
and the result parquet file
|
Sadly, this fails for newtype structs |
Actually it works for newtype structs in a release version. I think the failure with newtype is related to some bug in the enum flattening changes in that MR. |
Added an explanation to the error for IpAddr in #242 |
I was able to work around this in a much better way by adding a custom deserializer that allows for empty strings.
and then
and then in the resulting parquet it is the actual uuid, not the byte array
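The code from this comment is not reproduced here. As a rough, std-only sketch of the idea of tolerating empty strings, the following maps an empty input to the all-zero (nil) UUID and parses everything else strictly. The names `parse_hex_uuid` and `parse_uuid_allow_empty` are hypothetical, and mapping the empty string to the nil UUID is an assumption; the comment does not say which value was used.

```rust
// Hypothetical strict parser: 32 hex digits, hyphens ignored, into 16 bytes.
fn parse_hex_uuid(s: &str) -> Result<[u8; 16], String> {
    let hex: String = s.chars().filter(|c| *c != '-').collect();
    if hex.len() != 32 || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return Err(format!("not a valid UUID: {:?}", s));
    }
    let mut bytes = [0u8; 16];
    for (i, byte) in bytes.iter_mut().enumerate() {
        *byte = u8::from_str_radix(&hex[2 * i..2 * i + 2], 16).map_err(|e| e.to_string())?;
    }
    Ok(bytes)
}

// Lenient wrapper: the empty placeholder string becomes the nil UUID
// (an assumption made for this sketch); any other input must parse strictly.
fn parse_uuid_allow_empty(s: &str) -> Result<[u8; 16], String> {
    if s.is_empty() {
        return Ok([0u8; 16]);
    }
    parse_hex_uuid(s)
}
```

In a real serde setup, logic of this kind would typically live in a function referenced through the `#[serde(deserialize_with = "...")]` field attribute, so that type tracing with a placeholder string succeeds while malformed, non-empty values are still rejected.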
|
I am having an issue using serde_arrow to extract Fields from structs containing Uuid. When I try to create fields from a record containing a Uuid type, the program panics with the following error:
Here is a reproducible example that is extended from the example given in the serde_arrow crates.io page.
The Cargo.toml dependencies I used for this were
My rust version info