Add Unix Timestamp value type #1520

Merged
8 changes: 6 additions & 2 deletions docs/specs/offline_store_format.md
@@ -36,14 +36,16 @@ Here's how Feast types map to Pandas types for Feast APIs that take in or return
| BYTES | `bytes` |
| STRING | `str` , `category`|
| INT32 | `int32`, `uint32` |
| INT64 | `int64`, `uint64`, `datetime64[ns]`, `datetime64[ns, tz]` |
| INT64 | `int64`, `uint64` |
| UNIX_TIMESTAMP | `datetime64[ns]`, `datetime64[ns, tz]` |
| DOUBLE | `float64` |
| FLOAT | `float32` |
| BOOL | `bool`|
| BYTES\_LIST | `list[bytes]` |
| STRING\_LIST | `list[str]`|
| INT32\_LIST | `list[int]`|
| INT64\_LIST | `list[int]`|
| UNIX_TIMESTAMP\_LIST | `list[unix_timestamp]`|
| DOUBLE\_LIST | `list[float]`|
| FLOAT\_LIST | `list[float]`|
| BOOL\_LIST | `list[bool]`|
@@ -52,7 +54,7 @@ Note that this mapping is non-injective, that is more than one Pandas type may c

Feast array types are mapped to a pandas column with object dtype, that contains a Python array of corresponding type.

Another thing to note is Feast doesn't support timestamp type for entity and feature columns. Values of datetime type in pandas dataframe are converted to int64 if they are found in entity and feature columns.
Another thing to note is that Feast doesn't support a timestamp type for entity and feature columns. Values of datetime type in a pandas dataframe are converted to int64 if they are found in entity and feature columns. To make it easy to tell timestamp features apart from plain int64 features, there is a UNIX_TIMESTAMP type that is an int64 under the hood.
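
The int64 representation described here can be illustrated with a small standard-library sketch. The unit (whole seconds since the epoch) and the helper names `to_unix_seconds` and `from_unix_seconds` are assumptions made for illustration only; the spec text does not say which unit Feast uses, and pandas `datetime64[ns]` values are nanosecond-based.

```python
from datetime import datetime, timezone


def to_unix_seconds(dt: datetime) -> int:
    """Return whole seconds since 1970-01-01T00:00:00Z for a datetime."""
    if dt.tzinfo is None:
        # Assumption for this sketch: naive datetimes are interpreted as UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


def from_unix_seconds(ts: int) -> datetime:
    """Inverse of to_unix_seconds; always returns a UTC-aware datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


event_time = datetime(2021, 5, 1, tzinfo=timezone.utc)
as_int = to_unix_seconds(event_time)

# Once converted, the value is a plain integer: nothing in the number itself
# says "timestamp", which is the gap a dedicated value type is meant to close.
print(as_int, type(as_int).__name__)
print(from_unix_seconds(as_int).isoformat())
```

The round trip is lossless at the chosen resolution, but the integer alone carries no hint that it encodes a point in time, so the distinction has to live in the declared value type.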

#### BigQuery types
Here's how Feast types map to BigQuery types when using BigQuery for offline storage when reading data from BigQuery to the online store:
@@ -64,13 +66,15 @@ Here's how Feast types map to BigQuery types when using BigQuery for offline sto
| STRING | `STRING` |
| INT32 | `INT64 / INTEGER` |
| INT64 | `INT64 / INTEGER` |
| UNIX_TIMESTAMP | `INT64 / INTEGER` |
| DOUBLE | `FLOAT64 / FLOAT` |
| FLOAT | `FLOAT64 / FLOAT` |
| BOOL | `BOOL`|
| BYTES\_LIST | `ARRAY<BYTES>` |
| STRING\_LIST | `ARRAY<STRING>`|
| INT32\_LIST | `ARRAY<INT64>`|
| INT64\_LIST | `ARRAY<INT64>`|
| UNIX_TIMESTAMP\_LIST | `ARRAY<INT64>`|
| DOUBLE\_LIST | `ARRAY<FLOAT64>`|
| FLOAT\_LIST | `ARRAY<FLOAT64>`|
| BOOL\_LIST | `ARRAY<BOOL>`|
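
In this table several Feast types share one BigQuery type, so a BigQuery column type alone cannot recover the Feast type. A minimal sketch that inverts the scalar rows visible in this hunk makes the ambiguity concrete. The dictionary and the `invert` helper are hypothetical names for illustration, and the `INTEGER` / `FLOAT` aliases are left out for brevity.

```python
from collections import defaultdict

# Scalar rows copied from the table above (list types omitted for brevity).
FEAST_TO_BIGQUERY = {
    "STRING": "STRING",
    "INT32": "INT64",
    "INT64": "INT64",
    "UNIX_TIMESTAMP": "INT64",
    "DOUBLE": "FLOAT64",
    "FLOAT": "FLOAT64",
    "BOOL": "BOOL",
}


def invert(mapping):
    """Group Feast type names by the BigQuery type they map to."""
    reverse = defaultdict(list)
    for feast_type, bq_type in mapping.items():
        reverse[bq_type].append(feast_type)
    return dict(reverse)


BIGQUERY_TO_FEAST = invert(FEAST_TO_BIGQUERY)

# An INT64 column could hold any of three Feast types, so the Feast type
# has to come from the feature definition, not from the column type.
print(BIGQUERY_TO_FEAST["INT64"])
```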
4 changes: 4 additions & 0 deletions docs/specs/online_store_format.md
@@ -107,13 +107,15 @@ message ValueType {
DOUBLE = 5;
FLOAT = 6;
BOOL = 7;
UNIX_TIMESTAMP = 8;
BYTES_LIST = 11;
STRING_LIST = 12;
INT32_LIST = 13;
INT64_LIST = 14;
DOUBLE_LIST = 15;
FLOAT_LIST = 16;
BOOL_LIST = 17;
UNIX_TIMESTAMP_LIST = 18;
}
}

@@ -128,13 +130,15 @@ message Value {
double double_val = 5;
float float_val = 6;
bool bool_val = 7;
int64 unix_timestamp_val = 8;
BytesList bytes_list_val = 11;
StringList string_list_val = 12;
Int32List int32_list_val = 13;
Int64List int64_list_val = 14;
DoubleList double_list_val = 15;
FloatList float_list_val = 16;
BoolList bool_list_val = 17;
Int64List unix_timestamp_list_val = 18;
}
}

4 changes: 4 additions & 0 deletions protos/feast/types/Value.proto
@@ -32,13 +32,15 @@ message ValueType {
DOUBLE = 5;
FLOAT = 6;
BOOL = 7;
UNIX_TIMESTAMP = 8;
BYTES_LIST = 11;
STRING_LIST = 12;
INT32_LIST = 13;
INT64_LIST = 14;
DOUBLE_LIST = 15;
FLOAT_LIST = 16;
BOOL_LIST = 17;
UNIX_TIMESTAMP_LIST = 18;
}
}
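
Among the enum values visible in this hunk, each list type's number is its scalar counterpart's number plus 10 (for example `BOOL = 7` and `BOOL_LIST = 17`), and the two new members follow the same pattern. A minimal Python sketch of that relationship follows. It uses a stand-in enum that copies only values shown here; `ValueTypeSketch`, `LIST_OFFSET`, and `list_type_of` are hypothetical names, not part of Feast.

```python
import enum


class ValueTypeSketch(enum.Enum):
    # Values copied from the hunk above; members outside the hunk are omitted.
    DOUBLE = 5
    FLOAT = 6
    BOOL = 7
    UNIX_TIMESTAMP = 8
    DOUBLE_LIST = 15
    FLOAT_LIST = 16
    BOOL_LIST = 17
    UNIX_TIMESTAMP_LIST = 18


LIST_OFFSET = 10  # observed gap between a scalar and its list type


def list_type_of(scalar: ValueTypeSketch) -> ValueTypeSketch:
    """Look up the list counterpart of a scalar by number, then check the name."""
    candidate = ValueTypeSketch(scalar.value + LIST_OFFSET)
    if candidate.name != f"{scalar.name}_LIST":
        raise ValueError(f"{scalar.name} has no list type at {candidate.value}")
    return candidate


print(list_type_of(ValueTypeSketch.UNIX_TIMESTAMP))
```

Nothing in the proto enforces this offset; it is a numbering convention that the new members happen to respect.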

@@ -53,13 +55,15 @@ message Value {
double double_val = 5;
float float_val = 6;
bool bool_val = 7;
int64 unix_timestamp_val = 8;
BytesList bytes_list_val = 11;
StringList string_list_val = 12;
Int32List int32_list_val = 13;
Int64List int64_list_val = 14;
DoubleList double_list_val = 15;
FloatList float_list_val = 16;
BoolList bool_list_val = 17;
Int64List unix_timestamp_list_val = 18;
}
}

20 changes: 17 additions & 3 deletions sdk/python/feast/type_map.py
@@ -107,9 +107,9 @@ def python_type_to_feast_value_type(
"uint8": ValueType.INT32,
"int8": ValueType.INT32,
"bool": ValueType.BOOL,
"timedelta": ValueType.INT64,
"datetime64[ns]": ValueType.INT64,
"datetime64[ns, tz]": ValueType.INT64,
"timedelta": ValueType.UNIX_TIMESTAMP,
"datetime64[ns]": ValueType.UNIX_TIMESTAMP,
"datetime64[ns, tz]": ValueType.UNIX_TIMESTAMP,
"category": ValueType.STRING,
}

@@ -252,6 +252,18 @@ def _python_value_to_proto_value(feast_value_type, value) -> ProtoValue:
)
)

if feast_value_type == ValueType.UNIX_TIMESTAMP_LIST:
return ProtoValue(
int64_list_val=Int64List(
val=[
item
if type(item) in [np.int64, np.int32]
else _type_err(item, np.int64)
for item in value
]
)
)

if feast_value_type == ValueType.STRING_LIST:
return ProtoValue(
string_list_val=StringList(
@@ -296,6 +308,8 @@ def _python_value_to_proto_value(feast_value_type, value) -> ProtoValue:
return ProtoValue(int32_val=int(value))
elif feast_value_type == ValueType.INT64:
return ProtoValue(int64_val=int(value))
elif feast_value_type == ValueType.UNIX_TIMESTAMP:
return ProtoValue(int64_val=int(value))
elif feast_value_type == ValueType.FLOAT:
return ProtoValue(float_val=float(value))
elif feast_value_type == ValueType.DOUBLE:
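
The `UNIX_TIMESTAMP_LIST` branch in this file validates every element inside a list comprehension and hands failures to a helper. A self-contained sketch of that pattern follows. It checks for the built-in `int` instead of the NumPy integer types used in the diff so that it runs without dependencies, and `type_err` and `validate_timestamp_list` are hypothetical stand-ins, not Feast's actual helpers. That the helper raises is also an assumption of this sketch.

```python
from typing import Any, List


def type_err(item: Any, expected: type) -> None:
    """Raise for an element of the wrong type (stand-in for the diff's _type_err)."""
    raise TypeError(
        f"Value {item!r} is of type {type(item).__name__}, "
        f"expected {expected.__name__}"
    )


def validate_timestamp_list(values: List[Any]) -> List[int]:
    """Return the values unchanged if all are ints; raise on the first that is not."""
    return [
        # Exact type match, as in the diff: subclasses such as bool are rejected.
        item if type(item) is int else type_err(item, int)
        for item in values
    ]


print(validate_timestamp_list([1619827200, 0]))
```

Because the check in the diff compares exact types against `np.int64` and `np.int32`, a plain Python `int` would take the `_type_err` branch there, so callers may need to convert values to a NumPy integer type first.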
9 changes: 8 additions & 1 deletion sdk/python/feast/value_type.py
@@ -29,13 +29,15 @@ class ValueType(enum.Enum):
DOUBLE = 5
FLOAT = 6
BOOL = 7
UNIX_TIMESTAMP = 8
BYTES_LIST = 11
STRING_LIST = 12
INT32_LIST = 13
INT64_LIST = 14
DOUBLE_LIST = 15
FLOAT_LIST = 16
BOOL_LIST = 17
UNIX_TIMESTAMP_LIST = 18

def to_tfx_schema_feature_type(self):
if self.value in [
@@ -49,9 +51,14 @@ def to_tfx_schema_feature_type(self):
ValueType.DOUBLE_LIST.value,
ValueType.FLOAT_LIST.value,
ValueType.BOOL_LIST.value,
ValueType.UNIX_TIMESTAMP_LIST.value,
]:
return schema_pb2.FeatureType.BYTES
elif self.value in [ValueType.INT32.value, ValueType.INT64.value]:
elif self.value in [
ValueType.INT32.value,
ValueType.INT64.value,
ValueType.UNIX_TIMESTAMP.value,
]:
return schema_pb2.FeatureType.INT
elif self.value in [ValueType.DOUBLE.value, ValueType.FLOAT.value]:
return schema_pb2.FeatureType.FLOAT
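
The branches in this hunk group value types into TFX schema feature types: the new scalar is treated as an integer, and the new list type is grouped with the other lists. A simplified sketch of the visible branches follows. It uses plain strings in place of `schema_pb2.FeatureType` and only the members that appear in this hunk; `tfx_feature_type` and the set names are hypothetical.

```python
# Stand-ins for schema_pb2.FeatureType values; plain strings keep the sketch
# dependency-free.
BYTES, INT, FLOAT = "BYTES", "INT", "FLOAT"

# Only the members visible in the hunk above; the first branch in the real
# method lists more types that are cut off by the diff context.
BYTES_TYPES = {"DOUBLE_LIST", "FLOAT_LIST", "BOOL_LIST", "UNIX_TIMESTAMP_LIST"}
INT_TYPES = {"INT32", "INT64", "UNIX_TIMESTAMP"}
FLOAT_TYPES = {"DOUBLE", "FLOAT"}


def tfx_feature_type(value_type_name: str) -> str:
    """Map a value type name to a feature type name, mirroring the branches above."""
    if value_type_name in BYTES_TYPES:
        return BYTES
    if value_type_name in INT_TYPES:
        return INT
    if value_type_name in FLOAT_TYPES:
        return FLOAT
    # Assumption: the sketch rejects anything it does not know about.
    raise ValueError(f"unmapped value type: {value_type_name}")


print(tfx_feature_type("UNIX_TIMESTAMP"), tfx_feature_type("UNIX_TIMESTAMP_LIST"))
```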
2 changes: 1 addition & 1 deletion sdk/python/tensorflow_metadata/proto/v0/path_pb2.py

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

50 changes: 36 additions & 14 deletions sdk/python/tensorflow_metadata/proto/v0/path_pb2.pyi
@@ -2,23 +2,45 @@
@generated by mypy-protobuf. Do not edit manually!
isort:skip_file
"""
import builtins
import google.protobuf.descriptor
import google.protobuf.internal.containers
import google.protobuf.message
import typing
import typing_extensions
from google.protobuf.descriptor import (
Descriptor as google___protobuf___descriptor___Descriptor,
FileDescriptor as google___protobuf___descriptor___FileDescriptor,
)

DESCRIPTOR: google.protobuf.descriptor.FileDescriptor = ...
from google.protobuf.internal.containers import (
RepeatedScalarFieldContainer as google___protobuf___internal___containers___RepeatedScalarFieldContainer,
)

class Path(google.protobuf.message.Message):
DESCRIPTOR: google.protobuf.descriptor.Descriptor = ...
STEP_FIELD_NUMBER: builtins.int
step: google.protobuf.internal.containers.RepeatedScalarFieldContainer[typing.Text] = ...
from google.protobuf.message import (
Message as google___protobuf___message___Message,
)

from typing import (
Iterable as typing___Iterable,
Optional as typing___Optional,
Text as typing___Text,
)

from typing_extensions import (
Literal as typing_extensions___Literal,
)


builtin___bool = bool
builtin___bytes = bytes
builtin___float = float
builtin___int = int


DESCRIPTOR: google___protobuf___descriptor___FileDescriptor = ...

class Path(google___protobuf___message___Message):
DESCRIPTOR: google___protobuf___descriptor___Descriptor = ...
step: google___protobuf___internal___containers___RepeatedScalarFieldContainer[typing___Text] = ...

def __init__(self,
*,
step : typing.Optional[typing.Iterable[typing.Text]] = ...,
step : typing___Optional[typing___Iterable[typing___Text]] = None,
) -> None: ...
def ClearField(self, field_name: typing_extensions.Literal[u"step",b"step"]) -> None: ...
global___Path = Path
def ClearField(self, field_name: typing_extensions___Literal[u"step",b"step"]) -> None: ...
type___Path = Path
2 changes: 1 addition & 1 deletion sdk/python/tensorflow_metadata/proto/v0/schema_pb2.py

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
