Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Can py_runtime use a label cc_binary target as the interpreter? #4286

Closed
qzmfranklin opened this issue Dec 13, 2017 · 22 comments
Closed

Can py_runtime use a label cc_binary target as the interpreter? #4286

qzmfranklin opened this issue Dec 13, 2017 · 22 comments
Assignees
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) stale Issues or PRs that are stale (no activity for 30 days) team-Rules-Python Native rules for Python type: feature request

Comments

@qzmfranklin
Copy link

Please provide the following information. The more we know about your system and use case, the more easily and likely we can help.

Description of the problem / feature request / question:

I want to

  • build the cpython project with Bazel, and
  • use the Bazel-built target to create my own py_runtime.

I have completely built the python binary target (a cc_binary target defined in a BUILD file) from the cpython source tree using Bazel. Check.
Now I am trying to reference the python binary target from the py_runtime rule.

I want to define py_runtime as the follows:

py_runtime(
    name = 'py-3.7',
    visibility = [
        '//visibility:public',
    ],
    files = glob([
        '//third_party/cpython:runtime',
    ]),
    interpreter = '//third_party/cpython:python',
)

The //third_party/cpython:python is the cc_binary.
The //third_party/cpython:runtime is the standard python runtime library. This is irrelevant to this issue.

If possible, provide a minimal example to reproduce the problem:

The snippet shown above is good enough for this purpose.

Environment info

  • Operating System:
    Ubuntu 16.04

  • Bazel version (output of bazel info release):
    release 0.7.0

  • If bazel info release returns "development version" or "(@non-git)", please tell us what source tree you compiled Bazel from; git commit hash is appreciated (git rev-parse HEAD):

Have you found anything relevant by searching the web?

Not really. I looked at the py_runtime doc (https://docs.bazel.build/versions/master/be/python.html#py_runtime).

Looks like the interpreter argument must point to an actual binary file. But I do not want to check in the python binary, yet. Instead, I want to build the python binary from my source tree.

Anything else, information or logs or outputs that would be helpful?

(If they are large, please upload as attachment or provide link).

@laszlocsomor
Copy link
Contributor

To clarify, are you requesting that py_runtime.interpreter be supporting labels, not just absolute paths?

@laszlocsomor laszlocsomor added category: rules > python P3 We're not considering working on this, but happy to review a PR. (No assignee) type: feature request labels Dec 13, 2017
@laszlocsomor
Copy link
Contributor

/cc @erain

@erain
Copy link
Contributor

erain commented Dec 14, 2017

Hi qzmfranklin,

Haven't worked on py_runtime for a while, and currently it only supports either absolute path or a checked-in python interpreter. I don't think label is supported now.

For a checked in py_runtime, you can refer to my example project:
https://github.com/erain/bazel-python-example

@qzmfranklin
Copy link
Author

@laszlocsomor
Sorry for the late reply.

Per 'support labels': Yes, exactly.

@qzmfranklin
Copy link
Author

Hi @erain ,

I am aware of how to use py_runtime the same way, i.e., the interpreter attribute points to an absolute path.

What I want is to extend the capability of the interpreter attribute of py_runtime to be able to point to a label.

@brandjon brandjon added team-Rules-Python Native rules for Python and removed category: rules > python labels Oct 19, 2018
@brandjon
Copy link
Member

Reviewing this, I'm surprised you're not able to do this already. py_runtime.interpreter_path is a string for the path of a platform-installed runtime, and py_runtime.interpreter is for the label of an in-build runtime, which should be able to be either checked-in or compiled with Bazel.

Can you confirm that this is still a problem, and explain the failure you see?

@qzmfranklin
Copy link
Author

@brandjon

Thanks for checking this issue. I just re-tested using bazel 0.23.1. Here is a brief of the result:

tools/python/BUILD:

py_runtime(
    name = 'py-3',
    files = [],
    interpreter = ':python3', # Error
    visibility = [
        '//visibility:public',
    ],
)

sh_binary(
    name = 'python3',
    srcs = ['python.sh'],
)

tools/python/python.sh:

#!/usr/bin/env python3

Using bazel --python_top=tools/python:py-3 to build and run py_library/py_binary targets yields the following error:

ERROR: /home/zhongming/git/LogiOcean/tools/python/BUILD:20:19: in interpreter attribute of py_runtime rule //tools/python:py-3: '//tools/python:python3' must produce a single file

The only way I can get py_runtime.interpreter to work is by making the string to point to a file checked into Bazel. For example:

py_runtime(
    name = 'py-3',
    files = [],
    interpreter = 'python.sh', # OK
    visibility = [
        '//visibility:public',
    ],
)

@qzmfranklin
Copy link
Author

The way I wanted it to work is that I can compile the python interpreter from within Bazel, and then make py_runtime.interpreter to point to the cc_binary target of the python interpreter.

@brandjon
Copy link
Member

Ok, I understand better now. The issue is that the py_runtime rule operates in a non-standard way for the case where it references an in-workspace interpreter (as opposed to the absolute path of a platform interpreter).

On consultation with @aragos, the way it should work is that it should have interpreter point to the label of an executable target, which can be a sh_binary, cc_binary, etc., or even a checked-in binary that would be a source file from Bazel's POV. It should then invoke the executable file associated with this target, supported by the target's runfiles. (For the case of a checked-in binary, we'd probably have to still use files since it wouldn't have any runfiles that Bazel knows about.)

But what it's actually doing is assuming that interpreter points to a target that produces only a single file, and taking that as the executable. This is wrong for virtually all binary rules and really only works for the case of a checked-in interpreter.

This will require redesigning the py_runtime rule, and possibly splitting it into separate rules for the three cases of a platform runtime (absolute system path), a checked-in binary, and Bazel-built binary. (The latter two would be split because a Bazel-built binary shouldn't need a files attribute.)

In the meantime, I have a horribly hacky proof of concept for a workaround: Define two Starlark adaptor rules that take in an executable target and yield as their files-to-build 1) just the executable and nothing else, and 2) all of the runfiles. Then feed the adapted targets as the interpreter and files of the py_runtime. Example:

pkg/script.sh

#!/bin/bash

###############################################################################
# This is a simulated Python runtime. All it does is load a runfile           #
# (helper.sh) and emit some dummy text. The boilerplate is to support loading #
# runfiles.                                                                   #
###############################################################################

# --- begin runfiles.bash initialization ---
# Copy-pasted from Bazel's Bash runfiles library (tools/bash/runfiles/runfiles.bash).
set -euo pipefail
if [[ ! -d "${RUNFILES_DIR:-/dev/null}" && ! -f "${RUNFILES_MANIFEST_FILE:-/dev/null}" ]]; then
  if [[ -f "$0.runfiles_manifest" ]]; then
    export RUNFILES_MANIFEST_FILE="$0.runfiles_manifest"
  elif [[ -f "$0.runfiles/MANIFEST" ]]; then
    export RUNFILES_MANIFEST_FILE="$0.runfiles/MANIFEST"
  elif [[ -f "$0.runfiles/bazel_tools/tools/bash/runfiles/runfiles.bash" ]]; then
    export RUNFILES_DIR="$0.runfiles"
  fi
fi
if [[ -f "${RUNFILES_DIR:-/dev/null}/bazel_tools/tools/bash/runfiles/runfiles.bash" ]]; then
  source "${RUNFILES_DIR}/bazel_tools/tools/bash/runfiles/runfiles.bash"
elif [[ -f "${RUNFILES_MANIFEST_FILE:-/dev/null}" ]]; then
  source "$(grep -m1 "^bazel_tools/tools/bash/runfiles/runfiles.bash " \
            "$RUNFILES_MANIFEST_FILE" | cut -d ' ' -f 2-)"
else
  echo >&2 "ERROR: cannot find @bazel_tools//tools/bash/runfiles:runfiles.bash"
  exit 1
fi
# --- end runfiles.bash initialization ---

# Runfiles are located using workspace name (default "__main__") and package
# path.
source $(rlocation __main__/pkg/helper.sh)

echo "I am script!"

pkg/helper.sh

echo "I am helper!"

pkg/BUILD

load(":adaptors.bzl", "get_executable", "get_runfiles")

# :helper and :script define a mock Bazel-built Python runtime.

sh_library(
    name = "helper",
    srcs = ["helper.sh"],
)

sh_binary(
    name = "script",
    srcs = ["script.sh"],
    deps = ["@bazel_tools//tools/bash/runfiles", ":helper"],
)

# The adaptors export just the pieces of :script that py_runtime expects
# to see in its `interpreter` and `files` attributes.

get_executable(
    name = "script_executable",
    target = ":script",
)

get_runfiles(
    name = "script_runfiles",
    target = ":script",
)

py_runtime(
    name = "runtime",
    interpreter = ":script_executable",
    files = [":script_runfiles"],
)

py_binary(
    name = "pybin",
    # pybin.py can just be an empty file, it isn't run by the mock runtime.
    srcs = [":pybin.py"],
)

pkg/adaptors.bzl

def _get_executable_impl(ctx):
    return [DefaultInfo(files=depset([ctx.executable.target]))]

get_executable = rule(
    implementation = _get_executable_impl,
    attrs = {"target": attr.label(executable=True, cfg="target")},
)

def _get_runfiles_impl(ctx):
    info = ctx.attr.target[DefaultInfo]
    all_files = depset(transitive=[
        info.default_runfiles.files,
        info.data_runfiles.files,
        info.files,
    ])
    return [DefaultInfo(files=all_files)]

get_runfiles = rule(
    implementation = _get_runfiles_impl,
    attrs = {"target": attr.label(executable=True, cfg="target")},
)

Console

bazel run //pkg:pybin --python_top="//pkg:runtime"
[...]
I am helper!
I am script!

@qzmfranklin
Copy link
Author

@brandjon

Cool! I am very happy this issue finally gets some serious attention.

I think over time, the python-related aspects of Bazel need to be almost completely revamped. This could potentially be as significant as the rework of the cc_toolchain, and maybe reusing some of the platform-related work that is about to be implemented later this year.

For now, I am happy enough without a fix to this issue so long as
a) this issue is being addressed by someone with deep domain expertise in the py_runtime business, and
b) fixing this issue becomes part of the consideration for future python-related work in the long time evolution plan of Bazel.

Close it if you wish. We can always re-open this later.

@qzmfranklin
Copy link
Author

@brandjon

Forgot to say: I completely agree with everything you mentioned in your last comment. What I wrote in the comment above is just stating my stance on it assuming a complete agreement with your statements.

@brandjon
Copy link
Member

@qzmfranklin Indeed we have been revamping Python support in Bazel, mainly the built-in support as it relates to the Python version, sandwiching with Starlark rules, and specifying the interpreter to use. See #7375 for Python-related toolchain work.

We may as well leave this bug open to track the inability to depend on binary rules via interpreter, which was news to me. When we start work on a fix (which likely will come after Bazel 1.0 work is done) we can move this to another bug.

@qzmfranklin
Copy link
Author

@brandjon sgtm

@gnossen
Copy link

gnossen commented Dec 13, 2019

@brandjon Any update on this? I can confirm that this would be a big deal for the gRPC project and Google cloud libraries. Should we view the P3 label as a "not going to get priority" label?

@qzmfranklin
Copy link
Author

I think this is what I wanted to achieve with Bazel + Python:

  • Build the python interpreter and standard library from source.
  • Make building cpython/cython libraries Bazel-idiomatic (no more pip-wheel stuff).
  • Sandwich custom python rules and Bazel-native python rules.

@qzmfranklin
Copy link
Author

Another use case for this is to enable building the fast cpp backend for grpc-python Bazel idiomatic. That is, without needing to resort various repository rules to create external repos such as @local_python_config.

@gergelyfabian
Copy link

I think @kku1993 has implemented something like this with rules_python: https://github.com/kku1993/bazel-hermetic-python, bazelbuild/rules_python#312.

From what I see it was done by compiling python in patch_cmds of http_archive, which seems like a workaround, but still if it works it is better than what we have currently.
Also in the PR for rules_python support is added for specifying a custom python_interpreter_target (that you could also use with a precompiled python interpreter).

@kku1993
Copy link

kku1993 commented May 20, 2020

Thanks @gergelyfabian for pointing me to this thread!

My approach is to build the python interpreter from source using make instead of cc_binary. Incidentally, this produces a single binary that can be used for py_runtime.interpreter (I was not aware of this issue at the time).

I think using cc_binary is definitely the more preferred method for platform independence but as @gergelyfabian pointed out, the make method could be a workaround for now.

In bazelbuild/rules_python#312, my goal is to specify a bazel target as the python interpreter for pip_import rule. As currently implemented, I believe it behaves like py_runtime.interpreter in that it only accepts labels that point to a single file (allow_single_file = True).

@gergelyfabian
Copy link

@kku1993, I tested your solution in bazelbuild/rules_python#312 for providing an interpreter for pip_import. It works great for my case.

@github-actions
Copy link

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 2.5 years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

@github-actions github-actions bot added the stale Issues or PRs that are stale (no activity for 30 days) label Mar 16, 2023
@github-actions
Copy link

This issue has been automatically closed due to inactivity. If you're still interested in pursuing this, please reach out to the triage team (@bazelbuild/triage). Thanks!

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Mar 31, 2023
@snakethatlovesstaticlibs

This was closed as a stale issue, but I'm running into it because we need to compile openssl, which python depends on (and we cannot use the system openssl because it is out of date). Since py_runtime can't accept the result of a sh_binary rule, we're having trouble setting the py_runtime to the python + openssl duo that we've compiled

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) stale Issues or PRs that are stale (no activity for 30 days) team-Rules-Python Native rules for Python type: feature request
Projects
None yet
Development

No branches or pull requests

8 participants