TensorBoard 2.0.2 #2970
Merged
Summary:
This commit adds an RPC definition by which the uploader can connect to the frontend web server at the start of an upload session. This resolves a number of outstanding issues:

  - The frontend can tell the uploader which backend server to connect to, rather than requiring a hard-coded endpoint in the uploader.
  - The frontend can tell the uploader how to generate experiment URLs, rather than requiring the backend server to provide this information (which it can’t, really, in general).
  - The frontend can check whether the uploader client is recent enough and instruct the end user to update if it’s not.
  - The frontend can warn the user about transient issues in case the service is down, degraded, under maintenance, etc.

An endpoint `https://tensorboard.dev/api/uploader` on the server will provide this information.

Test Plan:
Unit tests suffice.

wchargin-branch: uploader-serverinfo-protos
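To illustrate the "recent enough" check described above, here is a minimal sketch of how a frontend might classify a client version. It is not the actual server logic: the function names, the returned strings, and the rule that any pre-release suffix sorts below the corresponding release are all assumptions made for this example.

```python
import re

# Hypothetical helper: matches "MAJOR.MINOR.PATCH" plus an optional suffix such as "a0".
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(.*)$")


def parse_version(version):
    """Turn "2.0.1" or "2.0.1a0" into a comparable tuple.

    Assumption for this sketch: a version with any suffix (pre-release)
    sorts immediately below the same version without a suffix.
    """
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError("unrecognized version: %r" % (version,))
    major, minor, patch, suffix = match.groups()
    return (int(major), int(minor), int(patch), 0 if suffix else 1)


def check_compatibility(client_version, min_supported, min_recommended):
    """Return "reject", "warn", or "ok" for the given client version."""
    client = parse_version(client_version)
    if client < parse_version(min_supported):
        return "reject"
    if client < parse_version(min_recommended):
        return "warn"
    return "ok"
```

With illustrative thresholds of `2.0.0` (minimum supported) and `2.0.1` (minimum recommended), a `2.0.0a0` client would be rejected and a `2.0.1a0` client would get a warning.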
Summary:
The new `tensorboard dev list` command prints links to your experiments. This is implemented by repurposing the `StreamExperiments` export RPC, which only includes experiment IDs. We can expand this to additionally show useful metadata: experiment creation time and last-modified time; total number of scalars; counts of runs, tags, or time series; and selected run and tag names could all be useful to include.

Test Plan:
Ran `tensorboard dev list` on an account with 12 experiments and an account with no experiments, starting from both logged-in and logged-out states. Verified that the printed experiment links resolve correctly. Verified that the normal export flow still works.

wchargin-branch: uploader-list
Test Plan:
Running against a local frontend server, `tensorboard/2.1.0a0` shows up in the server logs where previously there was `python-requests/2.22.0`. Unit tests also included.

wchargin-branch: uploader-user-agent
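As a rough illustration of the effect described in this test plan, the sketch below attaches a `tensorboard/<version>` user agent to an outgoing request. The log line above suggests the uploader uses the `requests` library; this example uses the standard library's `urllib` only to stay self-contained, and `build_request` is a hypothetical helper, not part of TensorBoard.

```python
import urllib.request


def build_request(url, tensorboard_version):
    """Build (but do not send) a request carrying a TensorBoard user agent.

    Hypothetical helper for illustration; the header format mirrors the
    `tensorboard/2.1.0a0` string seen in the server logs.
    """
    user_agent = "tensorboard/%s" % tensorboard_version
    return urllib.request.Request(url, headers={"User-Agent": user_agent})
```

Note that `urllib` normalizes header names, so the value is read back with `request.get_header("User-agent")`.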
Summary:
This commit integrates the new `ServerInfo` RPC with the uploader. It’s not currently enabled by default: the current behavior is the same as the existing behavior, except that experiment URLs now properly have a trailing slash. We’ll soon remove the hard-coded API backend endpoint behavior to enable this by default.

Test Plan:
Running a test frontend and a test backend, we observe the following behavior with different arguments:

| `--origin` | `--api_endpoint` | → | URL origin | Backend |
|------------|------------------|---|------------|---------|
| empty      | empty            |   | prod       | prod    |
| empty      | prod             |   | prod       | prod    |
| empty      | test             |   | prod       | test    |
| test       | empty            |   | test       | test    |
| test       | test             |   | test       | test    |
| test       | prod             |   | test       | prod    |

Here, “test” in the `--origin` column is like `http://localhost:8080`, and “test” in the `--api_endpoint` column is like `localhost:10000`. Note that the no-argument case is equivalent to the explicitly-empty argument case because both arguments have empty default values.

Explicitly specifying `--origin https://tensorboard.dev`, with any value of `--api_endpoint`, fails with “Corrupt response from backend” because server-side support has not yet been rolled out. This is expected.

Specifying `--origin http://localhost:0` or any other unreachable host fails with `ECONNREFUSED` and a nice message.

My test frontend is configured to reject clients below version 2.0.0 and warn on clients below version 2.0.1. Changing the local `version.py` to `2.0.0a0` or `2.0.1a0` exercises these cases.

Finally, double-checked that building the Pip package, installing it, and running `tensorboard dev list` properly uses the production backend and prints URLs that resolve to the production frontend.

wchargin-branch: uploader-serverinfo-request
Summary:
This extends the `StreamExperiments` RPC such that the client can specify a set of additional metadata fields that the server should include, like “creation time” or “number of scalar points”. The format is both forward- and backward-compatible. Servers are expected to send responses with both `experiment_ids` and `experiments` until we drop support for clients that do not support `experiments`, at which point they need only send `experiments`.

Test Plan:
Unit test added to simulate the future behavior of servers.

wchargin-branch: streamexperiments-metadata
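The compatibility rule above can be sketched from the client's side roughly as follows. Plain dictionaries stand in for the protobuf response messages, and the per-experiment `experiment_id` key is an assumption made for this example; only the top-level `experiment_ids` and `experiments` field names come from the description.

```python
def experiment_ids_from_response(response):
    """Extract experiment IDs from one streamed response.

    Prefers the richer `experiments` field when the server sends it, and
    falls back to the legacy `experiment_ids` field otherwise. `response`
    is a plain dict standing in for the real protobuf message.
    """
    experiments = response.get("experiments") or []
    if experiments:
        # Newer servers: each entry carries an ID plus optional metadata.
        return [experiment["experiment_id"] for experiment in experiments]
    # Older servers: only the bare list of IDs is present.
    return list(response.get("experiment_ids") or [])
```

A client written this way works against old servers (IDs only), transitional servers (both fields), and future servers that send only `experiments`.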
Summary:
We’ve deployed production servers that support the handshake protocol specified in tensorflow#2878 and implemented on the client in tensorflow#2879. This commit enables that protocol by default.

Test Plan:
Running `bazel run //tensorboard -- dev list` still properly connects and prints valid URLs. Re-running with the TensorBoard version patched to `2.0.0a0` (in `version/version.py`) properly causes a handshake failure.

Setting `--origin` to point to a non-prod frontend properly connects to the appropriate backend. Setting `--api_endpoint` to point to a non-prod backend connects directly, skipping the handshake, and printing `https://tensorboard.dev` URLs. Specifying both `--origin` and `--api_endpoint` performs the handshake and overrides the backend server only, printing URLs corresponding to the listed frontend.

Running `git grep api.tensorboard.dev` no longer finds any code results.

As a double check, building the Pip package and running it in a new virtualenv still has a working `tensorboard dev upload` flow.

wchargin-branch: uploader-handshake
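The flag interactions described in this test plan can be summarized with a small sketch. This is not the uploader's actual code: `resolve_endpoints` and the `handshake` callable (which maps a frontend origin to the backend that frontend advertises) are hypothetical names introduced for illustration.

```python
PROD_ORIGIN = "https://tensorboard.dev"


def resolve_endpoints(origin, api_endpoint, handshake):
    """Return (url_origin, backend, did_handshake) for the given flags.

    `origin` and `api_endpoint` are the flag values ("" when unset).
    `handshake` is a hypothetical callable: frontend origin -> backend
    endpoint advertised by that frontend.
    """
    if api_endpoint and not origin:
        # Backend given directly and no frontend specified: skip the
        # handshake and print production URLs.
        return PROD_ORIGIN, api_endpoint, False
    url_origin = origin or PROD_ORIGIN
    advertised_backend = handshake(url_origin)
    # An explicit --api_endpoint overrides only the backend server.
    backend = api_endpoint or advertised_backend
    return url_origin, backend, True
```

For example, with a fake `handshake` backed by a dictionary, passing only `--origin http://localhost:8080` yields that frontend's URLs and whatever backend it advertises, while passing only `--api_endpoint localhost:10000` yields production URLs with the given backend and no handshake.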
Summary:
This commit teaches the uploader to display experiment metadata included in `StreamExperiments` responses by supported servers. For servers without this support, the change is a backward-compatible no-op.

The format is intentionally undocumented and not under any compatibility guarantees, but is designed to be easily parseable for ad hoc usage. For instance, this simple one-liner finds experiments with lots of points so that the user can delete them:

```
tensorboard dev list | awk '$1 == "Id" { id = $2 } $1 == "Scalars" && $2 > 1000 { print id }'
```

Test Plan:
Running against current prod, which does not yet support the new RPCs, the behavior is unchanged:

```
$ bazel run //tensorboard -- dev list
https://tensorboard.dev/experiment/IAVF94GPSWWBTvonQe4kgQ/
https://tensorboard.dev/experiment/LiQNYkOHRSGEWj42xtgtjA/
<snip>
Total: 12 experiment(s)
```

Running against a local server with support for the new RPCs, we see lots of additional data (tested on both Linux and Windows):

```
$ bazel run //tensorboard -- dev --origin http://localhost:8080 --grpc_creds_type ssl_dev list
http://localhost:8080/experiment/WtPawgPIQXi2SZ1fQszOFA/
    Id       WtPawgPIQXi2SZ1fQszOFA
    Created  2019-11-25 10:30:18 (23 seconds ago)
    Updated  2019-11-25 10:30:39 (just now)
    Scalars  18814
    Runs     21
    Tags     7
http://localhost:8080/experiment/jD7Qc7l6S8Wy5gWKYTAHOA/
    Id       jD7Qc7l6S8Wy5gWKYTAHOA
    Created  2019-11-13 18:32:06
    Updated  2019-11-13 18:32:06
    Scalars  0
    Runs     0
    Tags     0
http://localhost:8080/experiment/do8uvvEOSNWOUEANmQIprQ/
    Id       do8uvvEOSNWOUEANmQIprQ
    Created  2019-11-13 18:15:25
    Updated  2019-11-13 18:15:37
    Scalars  3208
    Runs     8
    Tags     4
<snip>
Total: 9 experiment(s)
```

Also tested that the `tensorboard dev export` service still works against both old and new servers.

wchargin-branch: uploader-list-metadata
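For readers who prefer Python to awk, the one-liner above could be approximated as follows. This is a sketch that relies on the same undocumented output format, so it carries the same lack of compatibility guarantees; `experiments_over` is a hypothetical helper name.

```python
def experiments_over(lines, threshold=1000):
    """Return IDs of experiments whose "Scalars" count exceeds `threshold`.

    `lines` is an iterable of lines as printed by `tensorboard dev list`.
    Mirrors the awk one-liner: remember the most recent "Id" value, and
    emit it when a "Scalars" line exceeds the threshold.
    """
    current_id = None
    matches = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # URL lines and blank lines carry no key/value pair.
        if fields[0] == "Id":
            current_id = fields[1]
        elif fields[0] == "Scalars" and int(fields[1]) > threshold:
            matches.append(current_id)
    return matches
```

It could be fed from a pipe, for example by passing `sys.stdin` as `lines` after `tensorboard dev list |`.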
psybuzz approved these changes on Nov 25, 2019
LGTM
Cherrypicks:
  - uploader: include metadata in `list` output (#2941)
  - uploader: request `ServerInfo` from frontend (#2879)
  - uploader: add simple `list` subcommand (#2903)
  - uploader: add `ServerInfo` protos and logic (#2878)