Change log

0.10.0

  • Remove deprecated APIs
  • Use the serde serialization APIs directly, instead of using the bytecode serializer. Serialization will be about 2x faster
  • Fix bug in SchemaLike::from_value with incorrect strategy deserialization

0.9.1

  • Decimal128 support: serialize / deserialize rust_decimal and bigdecimal objects
  • Add arrow=50 support
  • Improved error messages when deserializing SchemaLike
  • Relax Sized requirement for SchemaLike::from_samples(..), SchemaLike::from_type(..), SchemaLike::from_value(..)
  • Derive Debug, PartialEq for Item and Items

0.9.0

Breaking changes:

  • Make tracing options non-exhaustive
  • Remove the try_parse_dates field in favor of the guess_dates field in TracingOptions (the setter name is not affected)
  • Remove the experimental configuration api

Improvements:

  • Simpler and streamlined API (to_arrow / from_arrow and to_arrow2 / from_arrow2)
  • Add SchemaLike trait to support direct construction of arrow / arrow2 fields
  • Add type based tracing to allow schema tracing without samples (SchemaLike::from_type())
  • Allow building schema objects from serializable objects, e.g., serde_json::Value (SchemaLike::from_value())
  • Add support for arrow=47, arrow=48, arrow=49
  • Improve error messages in schema tracing
  • Fix bug in arrow2=0.16 support
  • Fix unused warnings without selected arrow versions

Deprecations (see the documentation of deprecated items for how to migrate):

  • Rename serde_arrow::schema::Schema to serde_arrow::schema::SerdeArrowSchema to prevent name clashes with the schema types of arrow and arrow2.
  • Deprecate serialize_into_arrays, deserialize_from_arrays methods in favor of to_arrow / to_arrow2 and from_arrow / from_arrow2
  • Deprecate serialize_into_fields methods in favor of SchemaLike::from_samples
  • Deprecate single item methods in favor of using the Items and Item wrappers

0.8.0

Make bytecode based serialization and deserialization the default

  • Remove state machine serialization, and use bytecode serialization as the default. This change results in a 2.6x speed up for the default configuration
  • Implement deserialization via bytecode (remove state machine implementation)
  • Add deserialization support for arrow

Update arrow version support

  • Add arrow=40, arrow=41, arrow=42, arrow=43, arrow=44, arrow=45, arrow=46 support
  • Remove arrow=35, arrow=36 support

Improve type support

  • Implement bytecode serialization / deserialization of f16
  • Add support for coercing different numeric types (use TracingOptions::default().coerce_numbers(true))
  • Add support for Timestamp(Milliseconds, None) and Timestamp(Milliseconds, Some("UTC")).

Quality of life features

  • Ignore unknown fields in serialization (Rust -> Arrow)
  • Raise an error if resulting arrays are of unequal length (#78)
  • Add an experimental schema struct under serde_arrow::experimental::Schema that can be easily serialized and deserialized.
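
The unequal-length check listed above can be pictured with a small, standalone sketch. This is not serde_arrow's implementation; the function name `common_length` and the `(name, length)` input shape are hypothetical and only illustrate the idea that every column built from the same records must end up with the same number of rows.

```rust
/// Return the shared length of all columns, or an error naming the first
/// column whose length differs from the first one.
/// Hypothetical helper for illustration; not part of serde_arrow.
fn common_length(columns: &[(&str, usize)]) -> Result<usize, String> {
    let mut iter = columns.iter();

    // With no columns there is nothing to compare; treat the length as 0.
    let (first_name, expected) = match iter.next() {
        Some(&(name, len)) => (name, len),
        None => return Ok(0),
    };

    for &(name, len) in iter {
        if len != expected {
            return Err(format!(
                "column {:?} has length {}, but column {:?} has length {}",
                name, len, first_name, expected
            ));
        }
    }
    Ok(expected)
}
```

Calling `common_length(&[("a", 3), ("b", 2)])` returns an `Err` that names column `b`, while equal lengths return `Ok` with the shared length.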

No longer export the base module: the implementation details as-is were not really useful. Remove for now and think about a better design.

Bug fixes:

  • Fix bug in bytecode serialization for missing fields (#79)
  • Fix bytecode serialization for nested options, e.g., Option<Option<T>>.
  • Fix bytecode serialization of structs with missing fields, e.g., missing keys with maps serialized as structs
  • Fix nullable top-level fields in bytecode serialization
  • Fix bug in bytecode serialization for out of order fields (#80)

0.7.1

  • Fix a bug for unions with unknown variants reported here. Now serde_arrow correctly handles unions for which not all variants were encountered during tracing. Serializing an unknown variant will result in an error. All variants that are seen during tracing are safe to use.

0.7

  • Breaking change: add new Item event emitted before list items, tuple items, or map entries

  • Add support for arrow=38 and arrow=39 with the arrow-38 and arrow-39 features

  • Add support for an experimental bytecode serializer that shows speedups of up to 4x. Enable it with

    serde_arrow::experimental::configure(|config| {
        config.serialize_with_bytecode = true;
    });

    This setting is global and used for all calls to serialize_to_array and serialize_to_arrays. At the moment the following features are not supported by the bytecode serializer:

    • nested options (Option<Option<T>>)
    • creating float16 arrays

0.6.1

  • Add support for arrow=37 with the arrow-37 feature

0.6.0

Add support for arrow2

Now both arrow and arrow2 are supported. Use the features to select the relevant version of either crate. E.g., to use serde_arrow with arrow=36:

serde_arrow = { version = "0.6", features = ["arrow-36"] }

Deserialization support (arrow2 only)

serde_arrow now supports deserializing Rust objects from arrays. At the moment this operation is only supported for arrow2. Adding support for arrow is planned.

More flexible support for Rust / Arrow features

serde_arrow now supports many more Rust and Arrow features.

  • Rust: Struct, Lists, Maps, Enums, Tuples
  • Arrow: Struct, List, Maps, Unions, ...

Removal of custom schema APIs

serde_arrow no longer relies on its own schema object. Now all schema information is retrieved from arrow fields with additional metadata.

More flexible APIs

In addition to the previous API that worked on a sequence of records, serde_arrow now also supports operating on a sequence of individual items (serialize_into_array, deserialize_from_array) and on single items (ArraysBuilder).

Support for dictionary encoded strings (categories)

serde_arrow supports dictionary encoding for string arrays. This way string arrays are encoded via a lookup table to avoid including repeated string values.
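
As a rough illustration of the lookup-table idea, the sketch below dictionary-encodes a column of strings using only the standard library. It is a conceptual example, not serde_arrow's or Arrow's implementation; the function names `dictionary_encode` and `dictionary_decode` and the `u32` key type are assumptions made for this example.

```rust
use std::collections::HashMap;

/// Dictionary-encode a column of strings.
/// Returns the table of distinct values (in order of first appearance)
/// and one key per input row that indexes into that table.
/// Hypothetical helper for illustration; not part of serde_arrow.
fn dictionary_encode(values: &[&str]) -> (Vec<String>, Vec<u32>) {
    let mut lookup: HashMap<&str, u32> = HashMap::new();
    let mut dictionary: Vec<String> = Vec::new();
    let mut keys: Vec<u32> = Vec::with_capacity(values.len());

    for &value in values {
        // Reuse the existing key for a known string, otherwise append the
        // string to the dictionary and assign it the next free key.
        let key = *lookup.entry(value).or_insert_with(|| {
            dictionary.push(value.to_owned());
            (dictionary.len() - 1) as u32
        });
        keys.push(key);
    }

    (dictionary, keys)
}

/// Rebuild the original column by looking up every key in the dictionary.
fn dictionary_decode(dictionary: &[String], keys: &[u32]) -> Vec<String> {
    keys.iter().map(|&key| dictionary[key as usize].clone()).collect()
}
```

Each distinct string is stored once in the dictionary, and repeated values cost only one integer key per row.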

0.5.0

  • Bump arrow to version 16.0.0