- Remove deprecated APIs
- Use the serde serialization APIs directly, instead of using the bytecode serializer. Serialization will be about 2x faster
- Fix bug in `SchemaLike::from_value` with incorrect strategy deserialization
- `Decimal128` support: serialize / deserialize `rust_decimal` and `bigdecimal` objects
- Add `arrow=50` support
- Improved error messages when deserializing `SchemaLike`
- Relax `Sized` requirement for `SchemaLike::from_samples(..)`, `SchemaLike::from_type(..)`, `SchemaLike::from_value(..)`
- Derive `Debug`, `PartialEq` for `Item` and `Items`

Breaking changes:
- Make tracing options non-exhaustive
- Remove the `try_parse_dates` field in favor of the `guess_dates` field in `TracingOptions` (the setter name is not affected)
- Remove the experimental configuration API

Improvements:
- Simpler and streamlined API (`to_arrow` / `from_arrow` and `to_arrow2` / `from_arrow2`)
- Add `SchemaLike` trait to support direct construction of arrow / arrow2 fields
- Add type based tracing to allow schema tracing without samples (`SchemaLike::from_type()`)
- Allow building schema objects from serializable objects, e.g., `serde_json::Value` (`SchemaLike::from_value()`)
- Add support for `arrow=47`, `arrow=48`, `arrow=49`
- Improve error messages in schema tracing
- Fix bug in `arrow2=0.16` support
- Fix unused warnings without selected arrow versions

Deprecations (see the documentation of deprecated items for how to migrate):
- Rename `serde_arrow::schema::Schema` to `serde_arrow::schema::SerdeArrowSchema` to prevent name clashes with the schema types of `arrow` and `arrow2`.
- Deprecate `serialize_into_arrays`, `deserialize_from_arrays` methods in favor of `to_arrow` / `to_arrow2` and `from_arrow` / `from_arrow2`
- Deprecate `serialize_into_fields` methods in favor of `SchemaLike::from_samples`
- Deprecated single item methods in favor of using the `Items` and `Item` wrappers

Make bytecode based serialization and deserialization the default
- Remove state machine serialization, and use bytecode serialization as the default. This change results in a 2.6x speed up for the default configuration
- Implement deserialization via bytecode (remove state machine implementation)
- Add deserialization support for arrow
Update arrow version support
- Add `arrow=40`, `arrow=41`, `arrow=42`, `arrow=43`, `arrow=44`, `arrow=45`, `arrow=46` support
- Remove `arrow=35`, `arrow=36` support

Improve type support
- Implement bytecode serialization / deserialization of f16
- Add support for coercing different numeric types (use `TracingOptions::default().coerce_numbers(true)`)
- Add support for `Timestamp(Milliseconds, None)` and `Timestamp(Milliseconds, Some("UTC"))`.
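
To make the idea of number coercion concrete, here is a minimal, std-only sketch of how two numeric types observed for the same field could be merged into one wider type. The `NumericType` enum, the `coerce` function, and the widening rules themselves are assumptions chosen for illustration; they are not taken from the `serde_arrow` source, and the crate's actual rules may differ.

```rust
/// Hypothetical set of numeric types a tracer might observe for one field.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NumericType {
    U8, U16, U32, U64,
    I8, I16, I32, I64,
    F32, F64,
}

/// Coarse classification used to pick the widened type.
enum Kind {
    Unsigned,
    Signed,
    Float,
}

impl NumericType {
    fn kind(self) -> Kind {
        use NumericType::*;
        match self {
            U8 | U16 | U32 | U64 => Kind::Unsigned,
            I8 | I16 | I32 | I64 => Kind::Signed,
            F32 | F64 => Kind::Float,
        }
    }
}

/// Merge two observed types into one (assumed rules):
/// identical types are kept, anything involving a float becomes F64,
/// two unsigned integers become U64, every other integer mix becomes I64.
fn coerce(a: NumericType, b: NumericType) -> NumericType {
    if a == b {
        return a;
    }
    match (a.kind(), b.kind()) {
        (Kind::Float, _) | (_, Kind::Float) => NumericType::F64,
        (Kind::Unsigned, Kind::Unsigned) => NumericType::U64,
        _ => NumericType::I64,
    }
}
```

Under these assumed rules, a field that contains both `1` and `1.5` (as can happen with JSON input, which does not encode the numeric type) would be traced as a single 64-bit float column instead of failing with a type mismatch.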
Quality of life features
- Ignore unknown fields in serialization (Rust -> Arrow)
- Raise an error if resulting arrays are of unequal length (#78)
- Add an experimental schema struct under `serde_arrow::experimental::Schema` that can be easily serialized and deserialized.
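
The unequal-length check mentioned in the list above can be sketched as follows. This is a std-only illustration, not the crate's implementation: `UnequalLengthError` and `check_equal_lengths` are hypothetical names, and columns are represented simply as `(name, length)` pairs.

```rust
use std::fmt;

/// Hypothetical error describing the first column whose length differs.
#[derive(Debug, PartialEq)]
struct UnequalLengthError {
    field: String,
    expected: usize,
    actual: usize,
}

impl fmt::Display for UnequalLengthError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "field {} has length {}, expected {}",
            self.field, self.actual, self.expected
        )
    }
}

impl std::error::Error for UnequalLengthError {}

/// Verify that all columns have the same length and return that length.
/// The first column sets the expected length; an empty input yields 0.
fn check_equal_lengths(columns: &[(&str, usize)]) -> Result<usize, UnequalLengthError> {
    let expected = match columns.first() {
        Some(&(_, len)) => len,
        None => return Ok(0),
    };
    for &(name, len) in columns {
        if len != expected {
            return Err(UnequalLengthError {
                field: name.to_owned(),
                expected,
                actual: len,
            });
        }
    }
    Ok(expected)
}
```

Failing early with the name of the offending field is usually easier to debug than letting a mismatched set of arrays reach code that assumes a rectangular table.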
No longer export the `base` module: the implementation details as-is were not really useful. The module is removed for now, pending a better design.

Bug fixes:
- Fix bug in bytecode serialization for missing fields (#79)
- Fix bytecode serialization for nested options, e.g., `Option<Option<T>>`.
- Fix bytecode serialization of structs with missing fields, e.g., missing keys with maps serialized as structs
- Fix nullable top-level fields in bytecode serialization
- Fix bug in bytecode serialization for out of order fields (#80)
- Fix a bug for unions with unknown variants reported here. Now `serde_arrow` correctly handles unions during serialization, for which not all variants were encountered during tracing. Serializing unknown variants will result in an error. All variants that are seen during tracing are safe to use.

- Breaking change: add new `Item` event emitted before list items, tuple items, or map entries
- Add support for `arrow=38` and `arrow=39` with the `arrow-38` and `arrow-39` features
- Add support for an experimental bytecode serializer that shows speeds of up to 4x. Enable it with
  `serde_arrow::experimental::configure(|config| { config.serialize_with_bytecode = true; });`
  This setting is global and used for all calls to `serialize_to_array` and `serialize_to_arrays`. At the moment the following features are not supported by the bytecode serializer:
  - nested options (`Option<Option<T>>`)
  - creating `float16` arrays
- Add support for `arrow=37` with the `arrow-37` feature

Now both `arrow` and `arrow2` are supported. Use the features to select the relevant version of either crate. E.g., to use `serde_arrow` with `arrow=36`:

serde_arrow = { version = "0.6", features = ["arrow-36"] }

`serde_arrow` now supports deserializing Rust objects from arrays. At the moment this operation is only supported for `arrow2`. Adding support for `arrow` is planned.

`serde_arrow` now supports many more Rust and Arrow features.

- Rust: Struct, Lists, Maps, Enums, Tuples
- Arrow: Struct, List, Maps, Unions, ...

`serde_arrow` no longer relies on its own schema object. Now all schema information is retrieved from arrow fields with additional metadata.

In addition to the previous API that worked on a sequence of records, `serde_arrow` now also supports operating on a sequence of individual items (`serialize_into_array`, `deserialize_from_array`) and on single items (`ArraysBuilder`).

`serde_arrow` supports dictionary encoding for string arrays. This way string arrays are encoded via a lookup table to avoid including repeated string values.
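
The lookup-table idea behind dictionary encoding can be illustrated with a small, std-only sketch. It does not use the `serde_arrow` or Arrow APIs; `dictionary_encode` and `dictionary_decode` are hypothetical helpers that only show how repeated strings collapse into a list of distinct values plus one integer key per row.

```rust
use std::collections::HashMap;

/// Dictionary-encode a sequence of strings.
/// Returns `(keys, values)`: `values` holds each distinct string once, in
/// order of first appearance, and `keys[i]` is the index into `values`
/// for the i-th input string.
fn dictionary_encode(input: &[&str]) -> (Vec<u32>, Vec<String>) {
    let mut index: HashMap<&str, u32> = HashMap::new();
    let mut values: Vec<String> = Vec::new();
    let mut keys: Vec<u32> = Vec::with_capacity(input.len());

    for &item in input {
        // Reuse the existing key, or append a new dictionary entry.
        let key = *index.entry(item).or_insert_with(|| {
            values.push(item.to_owned());
            (values.len() - 1) as u32
        });
        keys.push(key);
    }
    (keys, values)
}

/// Reverse the encoding by looking every key up in the dictionary.
fn dictionary_decode(keys: &[u32], values: &[String]) -> Vec<String> {
    keys.iter().map(|&k| values[k as usize].clone()).collect()
}
```

For the input `["a", "b", "a", "c", "b"]` this yields the keys `[0, 1, 0, 2, 1]` and the values `["a", "b", "c"]`, so each distinct string is stored once regardless of how often it repeats.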
- Bump arrow to version 16.0.0