SDK/Client - Supporting pipeline packages with multiple files #1207

Merged
35 changes: 17 additions & 18 deletions sdk/python/kfp/_client.py
@@ -155,29 +155,28 @@ def get_experiment(self, experiment_id=None, experiment_name=None):
     raise ValueError('No experiment is found with name {}.'.format(experiment_name))

   def _extract_pipeline_yaml(self, package_file):
+    def _choose_pipeline_yaml_file(file_list) -> str:
+      yaml_files = [file for file in file_list if file.endswith('.yaml')]
+      if len(yaml_files) == 0:
+        raise ValueError('Invalid package. Missing pipeline yaml file in the package.')
+
+      if 'pipeline.yaml' in yaml_files:
+        return 'pipeline.yaml'
+      else:
+        if len(yaml_files) == 1:
+          return yaml_files[0]
+        raise ValueError('Invalid package. There is no pipeline.yaml file and there are multiple yaml files.')
@charlesa101, May 4, 2019

I'm not sure the error message raised here says what you mean to express.

Instead of 'Invalid package. There is no pipeline.yaml file and there are multiple yaml files.', shouldn't it be 'Invalid package. There are multiple yaml files.'?

@Ark-kun (contributor, PR author), May 6, 2019

No. The whole goal of this PR is to support packages containing multiple .yaml files. In that case one of them must be called pipeline.yaml.
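
The rule described in this reply can be sketched as a standalone function. This is only an illustration that mirrors the nested helper in the diff; the module-level name `choose_pipeline_yaml_file` and the sample file names are assumptions made for the example and are not part of the PR. As in the diff, only names ending in `.yaml` are considered, so `.yml` members are not matched.

```python
from typing import Iterable


def choose_pipeline_yaml_file(file_names: Iterable[str]) -> str:
    """Pick the pipeline definition from a package's file listing."""
    yaml_files = [name for name in file_names if name.endswith('.yaml')]
    if not yaml_files:
        raise ValueError('Invalid package. Missing pipeline yaml file in the package.')
    # A member named exactly 'pipeline.yaml' always wins.
    if 'pipeline.yaml' in yaml_files:
        return 'pipeline.yaml'
    # Without pipeline.yaml, a single yaml file is still unambiguous.
    if len(yaml_files) == 1:
        return yaml_files[0]
    # Several yaml files and none is called pipeline.yaml: cannot choose.
    raise ValueError('Invalid package. There is no pipeline.yaml file '
                     'and there are multiple yaml files.')


if __name__ == '__main__':
    # Hypothetical listings, for illustration only.
    print(choose_pipeline_yaml_file(['pipeline.yaml', 'component.yaml']))
    print(choose_pipeline_yaml_file(['my_pipeline.yaml', 'README.md']))
```

This is why the longer error message is kept: multiple yaml files alone are not an error, only multiple yaml files with no `pipeline.yaml` among them.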


     if package_file.endswith('.tar.gz') or package_file.endswith('.tgz'):
       with tarfile.open(package_file, "r:gz") as tar:
-        all_yaml_files = [m for m in tar if m.isfile() and
-                (os.path.splitext(m.name)[-1] == '.yaml' or os.path.splitext(m.name)[-1] == '.yml')]
-        if len(all_yaml_files) == 0:
-          raise ValueError('Invalid package. Missing pipeline yaml file in the package.')
-
-        if len(all_yaml_files) > 1:
-          raise ValueError('Invalid package. Multiple yaml files in the package.')
-
-        with tar.extractfile(all_yaml_files[0]) as f:
+        file_names = [member.name for member in tar if member.isfile()]
+        pipeline_yaml_file = _choose_pipeline_yaml_file(file_names)
+        with tar.extractfile(tar.getmember(pipeline_yaml_file)) as f:
           return yaml.safe_load(f)
     elif package_file.endswith('.zip'):
       with zipfile.ZipFile(package_file, 'r') as zip:
-        all_yaml_files = [m for m in zip.namelist() if
-                (os.path.splitext(m)[-1] == '.yaml' or os.path.splitext(m)[-1] == '.yml')]
-        if len(all_yaml_files) == 0:
-          raise ValueError('Invalid package. Missing pipeline yaml file in the package.')
-
-        if len(all_yaml_files) > 1:
-          raise ValueError('Invalid package. Multiple yaml files in the package.')
-
-        with zip.open(all_yaml_files[0]) as f:
+        pipeline_yaml_file = _choose_pipeline_yaml_file(zip.namelist())
+        with zip.open(pipeline_yaml_file) as f:
           return yaml.safe_load(f)
     elif package_file.endswith('.yaml') or package_file.endswith('.yml'):
       with open(package_file, 'r') as f:
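
To see which names the helper receives for each archive type, the listing step can be tried on in-memory packages using only the standard library. This is a rough sketch under stated assumptions: `list_package_files`, `make_zip`, and `make_tar_gz` are hypothetical helpers invented for the example, and, unlike the diff (which passes `zip.namelist()` through unchanged), the zip branch here filters out directory entries.

```python
import io
import tarfile
import zipfile


def list_package_files(data: bytes, kind: str) -> list:
    """Return regular-file names from an in-memory archive (hypothetical helper)."""
    if kind == 'tar.gz':
        with tarfile.open(fileobj=io.BytesIO(data), mode='r:gz') as tar:
            return [member.name for member in tar if member.isfile()]
    if kind == 'zip':
        with zipfile.ZipFile(io.BytesIO(data), 'r') as archive:
            # namelist() can include directory entries, which end with '/'.
            return [name for name in archive.namelist() if not name.endswith('/')]
    raise ValueError('Unsupported package type: {}'.format(kind))


def make_zip(files: dict) -> bytes:
    """Build a zip archive in memory from {name: bytes} (test fixture)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w') as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buf.getvalue()


def make_tar_gz(files: dict) -> bytes:
    """Build a gzip-compressed tar archive in memory from {name: bytes}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode='w:gz') as tar:
        for name, content in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(content)
            tar.addfile(info, io.BytesIO(content))
    return buf.getvalue()


if __name__ == '__main__':
    # Hypothetical package contents, for illustration only.
    sample = {'pipeline.yaml': b'a: 1\n', 'component.yaml': b'b: 2\n'}
    print(list_package_files(make_zip(sample), 'zip'))
    print(list_package_files(make_tar_gz(sample), 'tar.gz'))
```

Both branches yield plain member names, so the same selection helper can be applied to either archive type.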