Obtaining output paths of the build artifacts #8739
Comments
I'm not sure why this was closed, because it's a great feature. Digging into the bazel source code, I think what needs to be returned are the "important artifacts" for each target. This loop seems to have the logic needed. |
@meisterT This could be an output option for |
It already outputs this information. Example:
More information on aquery is here: Another source for the information for a particular build is the BEP, e.g. running:
|
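As a rough illustration of the BEP route mentioned above, here is a minimal sketch that reads a newline-delimited JSON event file (such as one written with `--build_event_json_file`) and collects the files reported for each completed target. The event shapes it relies on (`namedSetOfFiles`, `targetCompleted`, `outputGroup`, `fileSets`) are assumptions about the JSON encoding and should be checked against the `build_event_stream.proto` of the Bazel version in use. The function name `outputs_from_bep` is made up for this example.

```python
import json


def outputs_from_bep(lines):
    """Map each completed target label to the file names listed in the BEP.

    Assumes one JSON-encoded build event per line (hypothetical helper).
    """
    named_sets = {}  # named set id -> {"files": [...], "children": [...]}
    completed = {}   # target label -> ids of the named sets it references
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        event_id = event.get("id", {})
        if "namedSet" in event_id:
            payload = event.get("namedSetOfFiles", {})
            named_sets[event_id["namedSet"]["id"]] = {
                "files": [f["name"] for f in payload.get("files", [])],
                "children": [s["id"] for s in payload.get("fileSets", [])],
            }
        elif "targetCompleted" in event_id:
            label = event_id["targetCompleted"]["label"]
            groups = event.get("completed", {}).get("outputGroup", [])
            completed[label] = [s["id"] for g in groups for s in g.get("fileSets", [])]

    def expand(set_id, seen):
        # Named sets may reference other named sets; visit each one only once.
        if set_id in seen or set_id not in named_sets:
            return []
        seen.add(set_id)
        entry = named_sets[set_id]
        files = list(entry["files"])
        for child in entry["children"]:
            files.extend(expand(child, seen))
        return files

    result = {}
    for label, set_ids in completed.items():
        seen = set()
        files = []
        for set_id in set_ids:
            files.extend(expand(set_id, seen))
        result[label] = files
    return result
```

Note that this sketch does not separate output groups; every file set referenced by a completed target is merged into one list.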
@meisterT, I'm not sure how to get the The output information doesn't appear to be in the One solution would be to expose the
|
Simon, note that aquery exposes all outputs, but doesn't distinguish between important and other outputs. Joe, please have a look if we can easily expose the "important outputs" from aquery. |
I also agree it would be nice to have an easier way to get the output for a target. I have had the question sometimes from developers new to bazel on how to easily figure out where a built target ends up. And so far my answer has been: Look at the end of the logs of the |
The caveat with a simple command is that the output location might change depending on the flags you pass to the build command. I have ideas to change that, but they haven't left the brainstorming stage yet (and are rather invasive and incompatible). |
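To make the caveat concrete, here is a small sketch that splits an output path into its configuration directory, output root, and remaining path. It assumes the `bazel-out/<configuration>/<root>/...` layout seen in the example path later in this thread. The helper name `split_output_path` is hypothetical, and the layout is a convention, not a guaranteed interface.

```python
from pathlib import PurePosixPath


def split_output_path(path):
    """Split 'bazel-out/<config>/<root>/<rest>' into (config, root, rest).

    Hypothetical helper; the layout is an assumption based on observed paths.
    """
    parts = PurePosixPath(path).parts
    if len(parts) < 4 or parts[0] != "bazel-out":
        raise ValueError("not a bazel-out path: " + path)
    return parts[1], parts[2], str(PurePosixPath(*parts[3:]))


config, root, rest = split_output_path(
    "bazel-out/darwin-fastbuild/bin/tools/tests/src/scala/com/twitter/dummy/dist-binary/hello.zip"
)
print(config, root, rest)
```

The configuration segment (here `darwin-fastbuild`) is the part that depends on the platform and build flags, which is why tooling that hard-codes it tends to break.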
@meisterT, it looks like @joeleba, good luck! |
For a runnable target, we found a workaround using |
Sadly, most of the rules I want the output of are either custom rules or not runnable. That is a clever hack, though. |
I ended up putting together this script, which uses bazel aquery and jq:
#!/bin/bash -eu
# Prints the path to the (single) output file of the action with the given mnemonic, for the given target.
if [[ $# != 2 ]]; then
echo >&2 "Usage: $0 target mnemonic"
echo >&2 "e.g. $0 //some:target JavaSourceJar"
exit 1
fi
target="$1"
mnemonic="$2" # e.g. GoLink or PythonZipper
out="$(mktemp)"
trap "rm -f ${out}" EXIT
bazel aquery --output=jsonproto "${target}" > "${out}"
outputs_id_query_base=".actions[] | select(.[\"mnemonic\"] == \"${mnemonic}\")| .outputIds"
output_count="$(jq "${outputs_id_query_base} | length" "${out}")"
if [[ "${output_count}" -ne 1 ]]; then
echo >&2 "Saw ${output_count} outputs for the mnemonic and target but expected 1"
exit 1
fi
output_id="$(jq "${outputs_id_query_base}[0]" "${out}")"
jq -r ".artifacts[] | select(.[\"id\"] == ${output_id}) | .execPath" "${out}"
|
This is pretty nice, just what I've been looking for, and very tempting to turn into a little Rust or Go tool (for fewer runtime deps...) |
There isn't, each language will have its own primary output Mnemonic. tbh there are probably few enough that you could hard-code them: You could potentially write an aspect to look up the |
A pity, but understandable.
That was my fall-back thought.
Thanks, that sounds worth looking into. |
It is a real pity that such an integral feature for using bazel as part of any real integration has bottomed out on technical trivialities that never should be exposed to users. We have added another instance of --run_under=echo. Please consider simplicity and needs of users integrating into other tools over higher order design points or multi-step solutions. |
I've also recently realised that the correct behaviour is probably to have a final step that "publishes" artefacts to where you want them to be, but haven't had time to get my head around any subtleties in making that work with Bazel's hermetic / side-effect-averse design. |
I've been using BUCK for around 5 years now but recently started using Bazel for personal projects. It came as a surprise to see that this feature, which is extremely convenient when integrating BUCK into other tools and pipelines, is not supported by Bazel. For instance, just now I was generating a fat jar via |
I was also looking for this functionality to integrate bazel artifacts with another build tool and found it helpful to know where various targets I built were going to land. I have created a patch adding a simple I took a look at integrating this with the If there is still interest in this I'm happy to work towards something mergeable if the bazel team can provide some guidance on a way forward. |
I think the functionality I wanted is to be able to publish artefacts to a specific non-Bazel path (or hierarchy) rather than reach inside Bazel's own path structures. |
Install steps and/or user defined organization of a build tree are pretty standard, except for in bazel. Personally, the lack of such features (and its general aversion to "integration" concerns) are big parts of the reason why I strongly caution projects against using bazel outside of a corporate monorepo that is tightly controlled. |
I'm specifically just trying to run some bazel-managed executables with some flags pointing to bazel-built directories. I assumed it would be easier to find all my bazel artifacts than it would be to write rules for packaging them or running them. Perhaps the more bazel-y way is to actually write the run rule you need in bazel (see gazelle, if you haven't, for how this works), and I was trying to avoid this as it's a handful of one-off, almost but not quite trivial, sh_binaries. I had also been using the |
Coming again from a biased position aligned with how BUCK does this, I think an option "to publish artefacts to a specific non-Bazel path" is more complicated than necessary. BUCK's approach is as simple and as powerful as it gets: it gives you the ability to query where the output generated by a specific rule is, and from there you can basically do whatever you want. Each rule has a well defined API when it comes to inputs and outputs, so the developer querying "the output of a specific rule" knows exactly what that output is and thus can decide to do whatever is needed to fulfill X or Y requirement he or she needs to fulfill. This approach is simple, powerful, and versatile. Going back to my previous example, I know that the fat JAR output of There's no need for Bazel to deploy an output to an external directory. It makes Bazel more complicated than it should be in my opinion since all we need is the location of those artifacts that Bazel has already deployed (even if to a subdirectory under bazel-out). If you then want to deploy those artifacts somewhere you can do that but this doesn't have to be Bazel's responsibility. |
illicitonion's script didn't quite work for me, since Here's what I came up with instead if anyone else finds it useful:
#!/usr/bin/env python3
import argparse
import json
import os
import subprocess
parser = argparse.ArgumentParser(
    description='Prints the paths to output files of actions with the given mnemonic for the given target.'
)
parser.add_argument("--target", "-t", required=True, help='Bazel target to look up outputs for')
parser.add_argument("--mnemonic", "-m", required=True, help='Bazel mnemonic to look up outputs for')
args = parser.parse_args()
# Run aquery and index artifacts and path fragments by id.
aquery_result = json.loads(subprocess.check_output(["bazel", "aquery", "--output=jsonproto", args.target]))
artifacts = {artifact["id"]: artifact for artifact in aquery_result["artifacts"]}
path_fragments = {fragment["id"]: fragment for fragment in aquery_result["pathFragments"]}
actions = aquery_result["actions"]
def path_from_fragment(path_fragment_id, fragments):
    # Follow the parentId chain and return the labels ordered from root to leaf.
    path_fragment = fragments[path_fragment_id]
    parent = path_fragment.get("parentId")
    path = [path_fragment["label"]]
    if parent:
        return path_from_fragment(int(parent), fragments) + path
    else:
        return path
# Print the path of every output of every action with the requested mnemonic.
for action in actions:
    if action["mnemonic"] == args.mnemonic:
        for output_id in action["outputIds"]:
            artifact = artifacts[output_id]
            path_arr = path_from_fragment(artifact["pathFragmentId"], path_fragments)
            path = os.path.join(*path_arr)
            print(path)
I agree with the other commenters saying this is a basic build tool feature that bazel should support. |
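For readers unfamiliar with the data the script above walks, the following sketch resolves a path from a small, hand-written sample in the same shape (`artifacts`, `pathFragments`, `parentId`, `label`). The sample values are invented for illustration, and `resolve_path` is a hypothetical iterative variant of the recursive helper, not part of Bazel.

```python
def resolve_path(fragment_id, fragments):
    """Join fragment labels from the root down to the given fragment."""
    labels = []
    current = fragment_id
    while current is not None:
        fragment = fragments[current]
        labels.append(fragment["label"])
        # A root fragment has no parentId, which ends the walk.
        current = fragment.get("parentId")
    return "/".join(reversed(labels))


# Hypothetical data in the shape the script above expects from aquery.
sample = {
    "artifacts": [{"id": 1, "pathFragmentId": 4}],
    "pathFragments": [
        {"id": 2, "label": "bazel-out"},
        {"id": 3, "label": "k8-fastbuild", "parentId": 2},
        {"id": 4, "label": "app.jar", "parentId": 3},
    ],
}
fragments = {f["id"]: f for f in sample["pathFragments"]}
paths = [resolve_path(a["pathFragmentId"], fragments) for a in sample["artifacts"]]
print(paths)
```

Each artifact points at its leaf fragment, and each fragment points at its parent, so the full path only emerges after walking the chain to the root.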
That's disappointingly complicated.
|
I implemented
bazel cquery <target> --output starlark --starlark:file=tools/output.bzl 2>/dev/null
tools/output.bzl:
def format(target):
    """
    This is used by `bazel output-query` command together with cquery.
    """
    outputs = target.files.to_list()
    return outputs[-1].path if len(outputs) > 0 else "(missing)"
This only lists the last item of the output, but it gets the job done for the most part, I think. It works like this:
$ bazel output-query tools/tests/src/scala/com/twitter/dummy:hello.bundle
bazel-out/darwin-fastbuild/bin/tools/tests/src/scala/com/twitter/dummy/dist-binary/hello.zip |
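A variation on the cquery approach above that prints every default output instead of only the last one could be wrapped as follows. This is a sketch, not a tested integration: the helper names (`cquery_command`, `parse_paths`, `output_paths`) are made up here, and it assumes `bazel` is on `PATH` and that the inline `--starlark:expr` form is available in the Bazel version in use.

```python
import subprocess

# Starlark expression evaluated per configured target; cquery binds `target`.
# Python's "\\n" becomes the Starlark escape '\n', so paths are newline-separated.
ALL_FILES_EXPR = "'\\n'.join([f.path for f in target.files.to_list()])"


def cquery_command(target, expr=ALL_FILES_EXPR):
    """Build the argv for a cquery call that prints one output path per line."""
    return ["bazel", "cquery", target, "--output=starlark", "--starlark:expr=" + expr]


def parse_paths(stdout):
    """Drop blank lines, which appear for targets that have no default outputs."""
    return [line for line in stdout.splitlines() if line.strip()]


def output_paths(target):
    """Run cquery and return the default output paths (requires a Bazel workspace)."""
    result = subprocess.run(
        cquery_command(target), check=True, capture_output=True, text=True
    )
    return parse_paths(result.stdout)
```

Calling `output_paths("//some:target")` from inside a workspace would return paths relative to the execution root. They only exist on disk once the target has actually been built.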
@eed3si9n's use of cquery is along the lines of what I have been suggesting. There is a fuller example here: https://github.com/bazelbuild/rules_pkg/tree/main/examples/where_is_my_output |
I submitted #15552 in the hope of settling this issue in a canonical way. Happy to hear your feedback on whether or not it fits your needs. The syntax is:
It does take |
I took a look and it worked well for me - thanks! |
We will give #15552 a proper review. Thanks for all the input above. It's super-helpful. |
Hi all. We've done some review on #15552. I support it and am happy to advocate for making it canonical. But I'd really like your input on what exact formatting is best (I believe the author, @fmeum is also open-minded on formatting): Current output:
Deeper output:
Structured output:
Super-structured output:
What do you all think? I'd highlight that we already have great alternatives for complex use cases. This issue lists several. So I'm partially asking how deep we want the basic interface to go before we defer to the more generalized but advanced Starlark API? |
My two cents: the "deeper" output means users will probably end up using awk or similar, which is annoying to unpleasant depending on the reader, and the "structured" output will result in unreadable jq rules on magic arrays of values that will potentially cause issues if/when the output changes in the future. So I would prefer either a simple list of outputs or "super-structured" output with useful keys, resulting in readable and robust extraction/transformation rules. And for my case (getting the output path of build targets to run or copy), the targets are already known, so I can (and do) just invoke bazel multiple times to get the output per build target if I need it. Perhaps the identifier information might be useful to others, but for my case it's as unneeded as the target. |
For reference, the equivalent command in Buck is |
which outputs a There's also:
and other output formats that quickly get more dense. So the most
Also, different configs of the same target can produce different outputs. So the label alone isn't sufficient to key the output. So how about that last syntax? Yes, it requires space parsing to get just the paths. But I think that's generally routine for shell UIs. |
Thank you @fmeum ! |
With the new output mode `--output=files`, cquery lists all files advertised by the matched targets in the currently requested output groups.
This new mode has the following advantages over `--output=starlark` combined with an appropriate handcrafted `--starlark:expr`:
* provides a canonical answer to the very common "Where are my build outputs?" question
* is more friendly to new users as it doesn't require knowing about providers and non-BUILD dialect Starlark
* takes the value of `--output_groups` into account
* stays as close to the logic for build summaries printed by `bazel build` as possible
Fixes bazelbuild#8739
RELNOTES: `cquery`'s new output mode [`--output=files`](https://bazel.build/docs/cquery#files-output) lists the output files of the targets matching the query. It takes the current value of `--output_groups` into account.
Closes bazelbuild#15552.
PiperOrigin-RevId: 462630629
Change-Id: Ic648f22aa160ee57b476180561b444f08799ebb6
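For tooling that wants to consume the new mode, a thin wrapper might look like the sketch below. The helper names `files_command` and `list_output_files` are hypothetical. The only Bazel options used are the ones named in the commit message, `--output=files` and `--output_groups`, and the code assumes `bazel` is on `PATH`.

```python
import subprocess


def files_command(target, output_groups=None):
    """Build the argv for `bazel cquery --output=files` (hypothetical helper)."""
    cmd = ["bazel", "cquery", target, "--output=files"]
    if output_groups:
        # Restrict the listing to the requested output groups.
        cmd.append("--output_groups=" + ",".join(output_groups))
    return cmd


def list_output_files(target, output_groups=None):
    """Run cquery and return one output path per list entry (requires a workspace)."""
    result = subprocess.run(
        files_command(target, output_groups),
        check=True,
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line]
```

Because the result depends on the build configuration, the same configuration flags should be passed to cquery as to the corresponding `bazel build` invocation. This wrapper leaves that to the caller.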
There seems to be no easy way of getting the location of the artifacts built by bazel, which results in a lot of unnecessary complexity and potentially faulty assumptions in tooling that needs to access build artifacts.
Most difficulties arise from the fact that `bazel` adds various sub-folders in the output such as "external", "$platform_$arch_[stripped|opt]", "$target_name.runfiles", etc., which requires any tooling that needs to find the location to know information about the build target, the environment, and all `bazel` layout conventions.
Buck solved this problem by introducing the `buck targets --show-output` command, which prints the location of each build target in the build output folder. For example:
The only potential work-around that I was able to find was using `bazel aquery` as described here, but it's not easy to use.
It would be nice to have an equivalent in bazel. Is there anything that prevents us from exposing this information as part of the `bazel query` command?
Can we have `bazel query mytarget --show-output` or another simple equivalent?