Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement container_layer rule. #279

Merged
merged 14 commits into from
Feb 28, 2018

Conversation

erain
Copy link
Contributor

@erain erain commented Jan 4, 2018

  • each container_layer rule will create its own tar files.
  • container_image rule was refactored so that it can composed by
    multiple container_layers via such new attr.

Tested:

  • new unit tests in image_test.py

Copy link
Contributor

@mattmoor mattmoor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will also want to document this in README.md, and probably create {docker,oci}_layer aliases.

testdata/BUILD Outdated
container_image(
name = "files_in_layer_with_files_base",
base = ":files_base",
container_layers = [":files_in_layer"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this kwarg should just be layers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand this. :files_base is the base (which is a bunch of file), and layers is the files on top of that. Am I wrong?

@@ -66,6 +66,14 @@ def test_files_with_file_base(self):
self.assertEqual(2, len(img.fs_layers()))
self.assertTopLayerContains(img, ['.', './bar'])

def test_files_with_file_base(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You didn't rename this test, so it collides with the name above.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the catch. Done.

@@ -83,6 +91,17 @@ def test_tar_with_tar_base(self):
'./asdf', './usr', './usr/bin',
'./usr/bin/miraclegrow'])

def test_tar_with_tar_base(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing rename

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

_join_path = "join",
)

def magic_path(ctx, f):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we're using this in lang_image anymore. Can you double check, and if I'm right make it private again (leading _) and strip the comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. It is not used in lang_image as you said.

diff_id=diff_id,
env=env)]

_layer_attrs = dict({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this and container_image share a definition for their common attributes? I'd expect it defined here and loaded/used there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, if possible I think I'd like us to change container_image to a macro that creates a container_layer and just assembles it. That may be a more involved change, but I think it definitely results in the best separation of concerns.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't handle this comments yet, since I am still not sure about how to change container_image to a macro.

There are still some attrs which are container_image only, e.g. legacy_run_behavior, labels etc., which are not in container_layer rule. Also how to deal with the backward compatibility of the existing usage of container_image rule?

So how about 1) I re-use all the container_layer attrs and 2) re-use the container_layer impl for the related attrs, but still keep container_image as a rule?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with punting on this for now, but I'd still like the rules to head this general direction to avoid repeating logic. This is the motivation for some of my comments like _image_config.

The idea is that container_image would still be a rule, but likely _container_image, and container_image becomes a macro that wraps container_layer and _container_image.

debs=debs, tars=tars)
# Generate the zipped filesystem layer, and its sha256 (aka blob sum)
zipped_layer, blob_sum = zip_layer(ctx, unzipped_layer)
# Returns constituent parts of the Container layer as provider
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth elaborating on why.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Added doc explaining why we use provider.

@@ -342,6 +298,7 @@ _attrs = dict({
),
"label_file_strings": attr.string_list(),
"empty_files": attr.string_list(),
"container_layers": attr.label_list(providers = [LayerInfo]),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I said elsewhere, I'd prefer just layers here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Emm... this is tricky since layers is also used in lang_image rule, which extends the attrs from container_image rule.

So if we want to keep layers in container_image rule, we need to change the layers in lang_image to something else.

Do you have some suggestions for me?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a problem. lang_image wraps dep_layer and app_layer rules which extend the container_image rule. I don't believe layers is passed through.

zipped_layers = parent_parts.get("zipped_layer", []) + _getLayerInfo(layers, "zipped_layer") + [zipped_layer]
shas = parent_parts.get("blobsum", []) + _getLayerInfo(layers, "blob_sum") + [blob_sum]
unzipped_layers = parent_parts.get("unzipped_layer", []) + _getLayerInfo(layers, "unzipped_layer") + [unzipped_layer]
layer_diff_ids = _getLayerInfo(layers, "diff_id") + [diff_id]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: You are inconsistent about whether this portion is hoisted (like this one) or a nested expression. IMO pick one and keep these lines more consistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry that I didn't quite get the the meaning of "hoisted".

Do you mean I should use all nested expression e.g. list comprehension in this case?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By hoisted I mean:

foo = _getLayerInfo(...)
bar = parent_parts.get(...) + foo

vs:

bar = parent_parts.get(...) + _getLayerInfo(...)


# Get and merge environment variables
env = {}
[env.update(layer[LayerInfo].env) for layer in layers]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't work, it doesn't properly evaluate read-modify-write environment variables like PATH.

I believe that for this to work properly, we actually want N calls to _image_config(...).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like _image_config will deal with more things than envvars.

I wrote a skylark function _evalute_merge_env to handle envvar here properly. And I have unit test to cover the read-modify-write case like PATH.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like _image_config will deal with more things than envvars.

That's fine, so just pass env vars :)

I wrote a skylark function _evalute_merge_env to handle envvar here properly

I'd rather not fork this logic across two places. I'd definitely prefer we use _image_config iteratively.

@erain erain force-pushed the feature/container-layer-review branch from 3cc42c5 to 681c385 Compare January 24, 2018 22:26
erain added 8 commits February 8, 2018 16:32
* each container_layer rule will create its own tar files.
* container_image rule was refactored so that it can composed by
multiple container_layers via such new attr.

Tested:
- new unit tests in image_test.py
- fix unit tests'name
- have _evalute_merge_env to properly expand envvar
- magic_path -> _magic_path to make it private
- add comments for provider usage
- create alias docker_layer / oci_layer
- also changes attrs in lang_image to avoid name conflict
* also remove entrypoint / cmd from container_layer rule since they are
not used in the container_image
@erain erain force-pushed the feature/container-layer-review branch from 681c385 to 4597293 Compare February 8, 2018 22:21
@erain
Copy link
Contributor Author

erain commented Feb 8, 2018

Hi Matt,

I have done more refactor of this PR, including:

  • reuse container_layer attrs in container_image rule
  • reuse the implementation of container_layer in container_image rule, and treat the data from container_image as one more layer of layers attr.
  • rename: container_layers -> layers. In order to do this, I rename quite a few attrs in lang_image rules.
  • add documents for container_layer rule.

Could you take another review of this PR?

Many thanks,
Yu

"//container:layers.bzl",
"//skylib:path.bzl",
_get_runfile_path = "runfile",
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used? trying to understand the diff.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I was not familiar with the PR previously and I rebased my change instead of merging master. So this was unrelated change somehow got in.

Fixed right now.

@@ -216,9 +161,25 @@ def _repository_name(ctx):
# the v2 registry specification.
return _join_path(ctx.attr.repository, ctx.label.package)

def _evalute_merge_env(env, new_env):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: evaluAte

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed as this function was deleted.

for k, v in new_env.items():
elems = []
for e in v.split(":"):
if e.startswith("$") and e[1:] in env.keys():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't handle a:$PATH:b or ${PATH}:a:b.

I'd really rather see this go away in favor of _image_config.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PATH is also not the environment variable that supports read-modify-write, so splitting on : is also insufficient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Now use //container:create_image_config to generate config layer by layer so the envvars can be handled correctly.

@@ -286,7 +286,7 @@ def _war_app_layer_impl(ctx):
"""Appends the app layer with all remaining runfiles."""

available = depset()
for jar in ctx.attr.layers:
for jar in ctx.attr.jar_layers:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, we do have layers. This rename is better because these are an implementation detail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

@mattmoor
Copy link
Contributor

@philwo No CI is running for this repo, what's up?

@philwo
Copy link
Member

philwo commented Feb 27, 2018

@mattmoor I've setup post-/presubmit checks for this repo now. The presubmit for this PR will run when a new commit is pushed to it.

I also sent you an invite to Buildkite - please accept it. You'll need to verify PRs from external contributors to make sure that they're safe to run on CI. :)

Let me know if you have any questions or if it doesn't work.

- remove unnecessary imports in flatten.bzl
- use //container:create_image_config to generate config layer by layer
so the envvars can be handled correctly.
@erain
Copy link
Contributor Author

erain commented Feb 27, 2018

All comments addressed, please help review again.

As for the re-use container_layer rule in container_image:

I'm ok with punting on this for now, but I'd still like the rules to head this general direction to avoid repeating logic.

Yes, currently I just use container_layer in container_image to generate the last layer. I think maybe later I can send other refactor PR to make container_layer more reusable.

Many Thanks!

# Get the config for the base layer
config_file = _get_base_config(ctx)
# Generate the new config layer by layer, using the attributes specified and the diff_id
for i in range(len(layers)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for i, layer in enumerate(layers):

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Many thanks for the review and approval!!

@mattmoor mattmoor merged commit 15a6336 into bazelbuild:master Feb 28, 2018
@philwo
Copy link
Member

philwo commented Feb 28, 2018

@philwo No CI is running for this repo, what's up?

Fixed. Let me know if you encounter any issues.

efeller pushed a commit to zenreach/rules_docker that referenced this pull request Apr 17, 2018
* implement container_layer rule.

* add document for container_layer rule
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants