Add Firestore online format specification #1367

Merged on Mar 7, 2021 (1 commit)
Binary file added docs/specs/firebase_online_example.png
28 changes: 28 additions & 0 deletions docs/specs/online_store_format.md
@@ -59,6 +59,34 @@ Here's an example of how the entire thing looks like:

However, we'll address this issue in future versions of the protocol.

## Cloud Firestore Online Store Format

The [Cloud Firestore data model](https://firebase.google.com/docs/firestore/data-model) is a hierarchy of documents that can contain subcollections. This structure can be multiple levels deep; documents and subcollections alternate in the hierarchy.

We use the following structure to store feature data in Firestore:
* At the first level, there is a collection for each Feast project.
* At the second level, in each project collection, there is a Firestore document for each Feature Table.
* At the third level, in the document for the Feature Table, there is a subcollection called `values` that contains one document per feature row. That document contains the following fields:
  * `key` contains the entity key as a serialized `feast.types.EntityKey` proto
  * `value` contains the value as a serialized `feast.types.Value` proto
  * `event_ts` contains the event timestamp (in the native Firestore timestamp format)
  * `created_ts` contains the write timestamp (in the native Firestore timestamp format)
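
The three levels above can be sketched in Python as a path builder plus a field dictionary. This is only an illustration: the function names `feature_row_path` and `feature_row_fields` are hypothetical, the `key` and `value` arguments are assumed to be already-serialized protos, and we assume the Firestore client in use converts timezone-aware `datetime` values to its native timestamp type.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def feature_row_path(project: str, feature_table: str, document_id: str) -> str:
    """Build the slash-separated path of one feature-row document.

    Levels: collection (project) / document (feature table) /
    subcollection `values` / document (feature row).
    """
    return "/".join([project, feature_table, "values", document_id])


def feature_row_fields(key: bytes, value: bytes, event_ts: datetime) -> Dict[str, Any]:
    """Assemble the four fields stored in a feature-row document."""
    return {
        "key": key,          # serialized feast.types.EntityKey (assumed input)
        "value": value,      # serialized feast.types.Value (assumed input)
        "event_ts": event_ts,
        # Write time; assumed to be taken on the writer at call time.
        "created_ts": datetime.now(timezone.utc),
    }
```

For instance, `feature_row_path("my_project", "driver_stats", "abc123")` gives `my_project/driver_stats/values/abc123`, where all three arguments are made-up placeholder values.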

The document id for the feature document is computed by hashing the entity key with the murmurhash3_128 algorithm as follows:

1. hash the utf8-encoded entity names, sorted in alphanumeric order
2. hash the entity values in the same order as the corresponding entity names, serializing them to bytes as follows:
   - binary values are hashed as-is
   - string values are hashed after being serialized as utf8 strings
   - int64 and int32 values are hashed as their little-endian byte representation (8 and 4 bytes respectively)
   - bool values are hashed as a 0 or 1 byte

Other types of entity keys are not supported in this version of the specification when using Cloud Firestore.
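
As a rough sketch of the serialization rules, the Python below produces the byte chunks that would be fed to the hash, in the order described. It deliberately stops short of the hash itself: the spec names murmurhash3_128 but does not state the seed or how the chunks are combined, so those are left to the implementation. The names `serialize_entity_value` and `hash_input_chunks`, the string type tags, the use of signed integers, and the use of Python's default string ordering for "alphanumeric order" are all assumptions made for illustration.

```python
import struct
from typing import Any, Dict, List, Tuple


def serialize_entity_value(kind: str, value: Any) -> bytes:
    """Serialize one entity value to the bytes that get hashed."""
    if kind == "bytes":
        return bytes(value)                # binary values are used as-is
    if kind == "string":
        return value.encode("utf-8")
    if kind == "int64":
        return struct.pack("<q", value)    # 8 bytes, little-endian, signed (assumed)
    if kind == "int32":
        return struct.pack("<i", value)    # 4 bytes, little-endian, signed (assumed)
    if kind == "bool":
        return b"\x01" if value else b"\x00"
    # Any other entity value type is outside this version of the spec.
    raise ValueError(f"unsupported entity value type: {kind}")


def hash_input_chunks(entities: Dict[str, Tuple[str, Any]]) -> List[bytes]:
    """Return name chunks first, then value chunks, both in sorted-name order.

    `entities` maps an entity name to a (type tag, value) pair.
    """
    names = sorted(entities)  # code-point order, assumed to match the spec
    chunks = [name.encode("utf-8") for name in names]
    chunks += [serialize_entity_value(*entities[name]) for name in names]
    return chunks
```

With the made-up entities `{"driver_id": ("int64", 1001), "city": ("string", "Oslo")}`, the names sort to `city`, `driver_id`, so the value chunks follow in that same order: the utf8 bytes of `"Oslo"`, then the 8-byte little-endian encoding of 1001.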

**Example:**

![Firestore Online Example](firebase_online_example.png)

# Appendix

##### Appendix A. Value proto format.
30 changes: 30 additions & 0 deletions protos/feast/types/EntityKey.proto
@@ -0,0 +1,30 @@
/*
* Copyright 2018 The Feast Authors
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* https://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

syntax = "proto3";

import "feast/types/Value.proto";

package feast.types;

option java_package = "feast.proto.types";
option java_outer_classname = "EntityKeyProto";
option go_package = "github.com/feast-dev/feast/sdk/go/protos/feast/types";

message EntityKey {
  repeated string entity_names = 1;
  repeated feast.types.Value entity_values = 2;
}