Handle kfp types (#108)
This PR enables the user to pass `bool`, `list` and `dict` arguments to
a component. Kubeflow typically handles those arguments by serializing
them as strings. For this reason, they need to be deserialized again
within the component in order to be handled properly.

This workaround might no longer be needed once we move to kfp v2.

Related issues:
kubeflow/pipelines#7457
kubeflow/pipelines#7719
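
To illustrate why the deserialization described above is needed, here is a minimal sketch of what a component sees when its arguments arrive as strings. The argument names and values are made up for illustration, and the exact string forms (Python literal text for booleans, JSON text for lists and dicts) are an assumption chosen to match the parsers used in this commit, not a statement about Kubeflow's wire format.

```python
import json

# Arguments as a pipeline author might define them (illustrative values).
original = {"flag": False, "names": ["a", "b"], "options": {"k": 1}}

# Assumption: booleans arrive as Python literal text, lists and dicts as JSON text.
received = {
    "flag": str(original["flag"]),               # "False"
    "names": json.dumps(original["names"]),      # '["a", "b"]'
    "options": json.dumps(original["options"]),  # '{"k": 1}'
}

# Every value is now a plain string, so naive use gives wrong results.
flag_is_truthy = bool(received["flag"])    # any non-empty string is truthy
first_items = list(received["names"])[:2]  # characters, not list elements

try:
    received["options"]["k"]               # strings cannot be indexed by key
    lookup_failed = False
except TypeError:
    lookup_failed = True
```

In this sketch the `False` flag reads as truthy, the list iterates character by character, and the dict lookup fails, which is the behavior the commit works around.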
PhilippeMoussalli authored May 10, 2023
1 parent e9ce0b0 commit 2b0b071
Showing 4 changed files with 21 additions and 14 deletions.
@@ -19,4 +19,10 @@ output_subsets:
 args:
   dataset_name:
     description: Name of dataset on the hub
-    type: str
+    type: str
+  bool_name:
+    description: Bool a
+    type: bool
+  dict_name:
+    description: List a
+    type: dict
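
As a rough illustration of how the `type` fields in this spec could be translated into Kubeflow type names, the sketch below applies the `python2kubeflow_type` mapping as it stands after this commit (copied from `fondant/component_spec.py`). The helper `to_kubeflow_inputs` and the shape of its output are hypothetical and are not Fondant's actual API; the spec is written as a plain dict to avoid a YAML dependency.

```python
# Mapping copied from the post-commit state of fondant/component_spec.py.
python2kubeflow_type = {
    "str": "String",
    "int": "Integer",
    "float": "Float",
    "bool": "Boolean",
    "dict": "JsonObject",
    "list": "JsonArray",
}

# The `args` section of the spec, expressed as a plain dict.
spec_args = {
    "dataset_name": {"description": "Name of dataset on the hub", "type": "str"},
    "bool_name": {"description": "Bool a", "type": "bool"},
    "dict_name": {"description": "List a", "type": "dict"},
}


def to_kubeflow_inputs(args):
    """Hypothetical helper: translate spec args into Kubeflow-style input entries."""
    inputs = []
    for name, info in args.items():
        python_type = info["type"]
        if python_type not in python2kubeflow_type:
            raise ValueError(f"Unsupported argument type: {python_type!r}")
        inputs.append(
            {
                "name": name,
                "description": info["description"],
                "type": python2kubeflow_type[python_type],
            }
        )
    return inputs


kubeflow_inputs = to_kubeflow_inputs(spec_args)
```

Types that the commit drops from the mapping, such as `tuple` and `set`, are rejected with an explicit error in this sketch rather than passed through.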
@@ -1,5 +1,5 @@
 huggingface_hub==0.14.1
-git+https://github.com/ml6team/fondant.git@910331f3f2dc09030ed32a6578dcc2107c020590
+git+https://github.com/ml6team/fondant.git
 pyarrow>=7.0
 Pillow==9.4.0
 gcsfs==2023.4.0
@@ -41,6 +41,7 @@ def load(self, *, dataset_name: str) -> dd.DataFrame:
         Returns:
             Dataset: HF dataset
         """
+
         # 1) Load data, read as Dask dataframe
         logger.info("Loading dataset from the hub...")
         dask_df = dd.read_parquet(f"hf://datasets/{dataset_name}")
24 changes: 12 additions & 12 deletions fondant/component_spec.py
@@ -1,4 +1,5 @@
 """This module defines classes to represent an Fondant component specification."""
+import ast
 import copy
 import json
 import pkgutil
@@ -15,25 +16,24 @@
 from fondant.exceptions import InvalidComponentSpec
 from fondant.schema import Field, KubeflowCommandArguments, Type

+# TODO: remove after upgrading to kfpv2
+kubeflow2python_type = {
+    "String": str,
+    "Integer": int,
+    "Float": float,
+    "Boolean": ast.literal_eval,
+    "JsonObject": json.loads,
+    "JsonArray": json.loads,
+}
 # TODO: Change after upgrading to kfp v2
 # :https://www.kubeflow.org/docs/components/pipelines/v2/data-types/parameters/
 python2kubeflow_type = {
     "str": "String",
     "int": "Integer",
     "float": "Float",
     "bool": "Boolean",
-    "dict": "Map",
-    "list": "List",
-    "tuple": "List",
-    "set": "Set",
-}
-
-# TODO: remove after upgrading to kfpv2
-kubeflow2python_type = {
-    "String": str,
-    "Integer": int,
-    "Float": float,
-    "Boolean": bool,
+    "dict": "JsonObject",
+    "list": "JsonArray",
 }
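
The switch from `bool` to `ast.literal_eval` for the `"Boolean"` entry can be motivated with a short sketch. The mapping below is copied from the added lines of this diff; the `deserialize` helper and the sample input strings are hypothetical and only show how such a lookup table might be applied to string-valued arguments.

```python
import ast
import json

# Mapping copied from the added lines of fondant/component_spec.py.
kubeflow2python_type = {
    "String": str,
    "Integer": int,
    "Float": float,
    "Boolean": ast.literal_eval,
    "JsonObject": json.loads,
    "JsonArray": json.loads,
}


def deserialize(kubeflow_type, raw):
    """Hypothetical helper: convert a string argument back to a Python value."""
    return kubeflow2python_type[kubeflow_type](raw)


# Calling bool() on a string only checks that the string is non-empty.
naive_flag = bool("False")

# Parsing the text as a Python literal recovers the intended value.
parsed_flag = deserialize("Boolean", "False")

parsed_dict = deserialize("JsonObject", '{"a": 1}')
parsed_list = deserialize("JsonArray", "[1, 2]")
parsed_int = deserialize("Integer", "3")
```

One caveat of this approach: `ast.literal_eval` accepts any Python literal, so a string such as `"1"` would be parsed as the integer `1` instead of being rejected as a non-boolean.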

