TensorBoard 2.2.2 #3652

Merged
merged 94 commits into tensorflow:2.2 on May 27, 2020
Conversation

@caisq (Contributor) commented May 19, 2020

Prepare for release 2.2.2, featuring Histograms support on TensorBoard.dev.

Cherry-pick command: git cherry-pick 0f36c3fe..1b877f76

wchargin and others added 30 commits May 19, 2020 15:01
…3510)

Summary:
We initially used `dataclass_compat` to perform filtering of large
graphs as a stopgap mechanism. This commit moves that filtering into the
uploader, which is the only surface in which it’s actually used. As a
result, `dataclass_compat` no longer takes extra arguments and so can be
moved into `EventFileLoader` in a future change.

Test Plan:
Unit tests added to the uploader for the small graph, large graph, and
corrupt graph cases.

wchargin-branch: uploader-graph-filtering
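As a rough, hypothetical sketch (the `filter_graph` name and size limit are illustrative, not the actual uploader code), the filtering amounts to dropping oversized attribute values and skipping graphs that cannot be parsed:

```python
# Hypothetical sketch of size-based graph filtering in the uploader.
from google.protobuf import message
from tensorboard.compat.proto import graph_pb2

_MAX_GRAPH_BYTES = 10 * 1024 * 1024  # assumed limit, for illustration only


def filter_graph(serialized_graph_def):
    """Returns a filtered serialized GraphDef, or None if it cannot be used."""
    try:
        graph_def = graph_pb2.GraphDef.FromString(serialized_graph_def)
    except message.DecodeError:
        return None  # corrupt graph: skip it rather than fail the upload
    # Drop large attribute values (e.g., embedded constant tensors), which
    # account for most oversized graphs.
    for key, attr in list(graph_def.node[0].attr.items()) if False else []:
        pass  # (placeholder removed below)
    for node in graph_def.node:
        for key, attr in list(node.attr.items()):
            if attr.ByteSize() > 1024:
                node.attr[key].Clear()
    filtered = graph_def.SerializeToString()
    return filtered if len(filtered) <= _MAX_GRAPH_BYTES else None
```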
* Update description and landing page video

Adding Dev Summit 2020 video. Feel free to edit.

* Update _index.yaml

* Update _index.yaml

* Update _index.yaml

* review: fix long line in _index.yaml

Co-authored-by: Nick Felt <nfelt@users.noreply.github.com>
…low#3498)

Note that this is the ng_tensorboard version of tensorflow#3483.

* Motivation for features / changes
Reduce the load on TensorBoard servers by disabling auto reload if the page is not considered visible.

* Technical description of changes
Do not auto reload if the document is not visible according to the Page
Visibility API. When the page becomes visible again, perform an auto reload
if one was missed.
tensorflow#3506)

* Motivation for features / changes
  * Continue developing DebuggerV2 plugin, specifically the part that focuses on intra-graph execution.
* Technical description of changes
  * debugger_types: add `graphExecutions` field to `DebuggerState`
  * tfdbg2_data_source.ts: add `fetchGraphExecutionDigests()` and `fetchGraphExecutionData()`
  * debugger_actions.ts: add an action for requesting the number of graph executions and an action that signifies the successful loading of that number
  * debugger_reducers.ts: add reducers for these actions and the associated state
  * views/graph_executions folder: add component and container for visualizing graph executions. Currently only the number of graph executions is shown. Follow-up PRs will add a virtual scroll element that shows the graph execution digests.
  * Update the CSS and layout in the top-section of DebuggerV2 Plugin to accommodate the new `GraphExecutionContainer`.
* Screenshots of UI changes
  * ![image](https://user-images.githubusercontent.com/16824702/79076659-ecb86680-7cc9-11ea-895b-2a4fa6089ad2.png)
* Detailed steps to verify changes work correctly (as executed by you)
  * Unit tests added
  * Running the DebuggerV2 plugin against a logdir with real tfdbg2 dump data
Explain that TensorBoard.dev now supports graphs dashboards.
Provide a link to an experiment with both scalars and graphs.
Summary:
The `data_compat` and `dataclass_compat` transforms are now effected by
the `EventFileLoader` class. The event accumulator and uploader
therefore no longer need to effect these transformations manually.

Test Plan:
Unit tests that use mocking have been updated to apply compat transforms
manually. Using the uploader with `--plugin scalars,graphs` still works
as intended.

wchargin-branch: efl-compat-transforms
Summary:
The `dataclass_compat` transform dispatches on plugin name, which is
stored in the summary metadata. But the semantics of summary metadata is
that it need only be set in the first event of each time series. Thus,
until now, events past the first step of a time series have not been
modified by `dataclass_compat`. This has been fine because all
transformations have been either (a) only metadata transforms (which
don’t matter after the first step anyway) or (b) tensor transforms to
hparams data (which only ever has one step anyway). But an upcoming
change will need to transform every step of each audio time series, so
we need to identify which summaries have audio data even when they don’t
have plugin name set on the same event.

To do so, we let `dataclass_compat` update a stateful map from tag name
to the first `SummaryMetadata` seen for that time series (scoped to a
particular run).

Test Plan:
Some unit tests included now; the tests for migration of audio summaries
will be more complete because they have more interesting things to test.

wchargin-branch: dataclass-compat-initial-metadata
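An illustrative sketch of the stateful map (simplified names; not the actual `dataclass_compat` code):

```python
# Illustrative sketch: remember the first SummaryMetadata seen per tag so
# that later steps of a time series can still be dispatched by plugin name.
from tensorboard.compat.proto import summary_pb2


def migrate_value(value, initial_metadata):
    """Migrates one summary value, remembering first-seen metadata per tag.

    Args:
      value: a `summary_pb2.Summary.Value` from a single event.
      initial_metadata: dict mapping tag name to the first SummaryMetadata
        seen for that tag (scoped to one run); mutated in place.
    """
    if value.HasField("metadata") and value.metadata.plugin_data.plugin_name:
        # First event of the time series: record its metadata for later steps.
        initial_metadata.setdefault(value.tag, value.metadata)
        metadata = value.metadata
    else:
        # Later steps may omit metadata; fall back to what we saw first.
        metadata = initial_metadata.get(value.tag)
    plugin_name = metadata.plugin_data.plugin_name if metadata else None
    if plugin_name == "audio":
        pass  # e.g., transform every step of the audio time series here
    return value
```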
Summary:
Most versions of the audio summary APIs only support writing audio data,
but the TB 1.x `tensorboard.summary.audio` ops also support attaching a
“label” (Markdown text) to each audio clip. This feature is not present
in any version of the TensorFlow summary APIs, including `tf.contrib`,
and is not present in the TB 2.x `tensorboard.summary.audio` API. It
hasn’t been widely adopted, and doesn’t integrate nicely in its current
form with upcoming efforts like migrating the audio dashboard to use the
generic data APIs.

This commit causes labels to no longer be read by TensorBoard. Labels
are still written by the TB 1.x summary ops if specified, but now print
a warning. Because we do not change the write paths, it is important
that `audio.metadata` *not* set `DATA_CLASS_BLOB_SEQUENCE`, because
otherwise audio summaries will not be touched by `dataclass_compat`.
Tests for `dataclass_compat` cover this.

We don’t think that this feature has wide adoption. If this change gets
significant pushback, we can look into restoring labels with a different
implementation, likely as a parallel tensor time series.

Closes tensorflow#3513.

Test Plan:
Unit tests updated. As a manual test, TensorBoard still works on both
legacy audio data and audio data in the new form (that’s not actually
written to disk yet), and simply does not display any labels.

wchargin-branch: audio-no-labels
Summary:
This patch teaches the audio plugin to support generic data providers.

This requires a few more `list_blob_sequences` queries than we’d like,
due to the current structure of the frontend. Some have been avoided by
pushing the content type of the waveform into the query string. A bit of
care is required to ensure that a malicious request cannot turn this
into an XSS hole; we do so by simply whitelisting a (singleton) set of
audio content types, which means that the safety properties are
unchanged by this patch.

Test Plan:
The audio dashboard now works even if the multiplexer is removed. The
audio dashboard doesn’t have any “extras” (data download links, etc.),
so just checking the basic functionality suffices. Unit tests updated.

wchargin-branch: audio-generic
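A minimal sketch of the allowlisting idea (hypothetical handler and names; not the actual plugin route):

```python
# Hypothetical sketch: only serve content types from a fixed allowlist so a
# crafted `content_type` query parameter cannot become an XSS vector.
from werkzeug import wrappers

_ALLOWED_MIME_TYPES = frozenset({"audio/wav"})  # currently a singleton set


def serve_audio(request, blob_bytes):
    content_type = request.args.get("content_type", "")
    if content_type not in _ALLOWED_MIME_TYPES:
        # Refuse anything not explicitly allowed, so the response can never
        # be interpreted as, say, text/html.
        return wrappers.Response("Invalid content type", status=400)
    return wrappers.Response(blob_bytes, content_type=content_type)
```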
Summary:
This patch adds a decorator and context manager for latency profiling.
Any file can just add `@timing.log_latency` to a function definition or
introduce a `with timing.log_latency("some_name_here"):` context
anywhere. The output structure reveals the nesting of the call graph.

The implementation is thread-safe, and prints the current thread’s name
and ID so that the logs can easily be disentangled. (Basically, just
find a line that you’re interested in, note the thread ID, and pass the
log through `grep THREAD_ID` to get a coherent picture.)

I’ve added this as a global build dep on `//tensorboard` so that it’s
easy to patch timing in for debugging without worrying about build deps.
Any `log_latency` uses that are actually committed should be accompanied
by a strict local dep.

This patch combines previous independent drafts by @nfelt and @wchargin.

Test Plan:
Simple API usage tests included. As a manual test, adding latency
decoration to a few places in the text plugin’s Markdown rendering
pipeline generates output like this:

```
I0417 13:31:01.586319 140068322420480 timing.py:118] Thread-2[7f64329a2700] ENTER markdown_to_safe_html
I0417 13:31:01.586349 140068322420480 timing.py:118] Thread-2[7f64329a2700]   ENTER Markdown.convert
I0417 13:31:01.586455 140068322420480 timing.py:118] Thread-2[7f64329a2700]   LEAVE Markdown.convert - 0.000105s elapsed
I0417 13:31:01.586492 140068322420480 timing.py:118] Thread-2[7f64329a2700]   ENTER bleach.clean
I0417 13:31:01.586783 140068322420480 timing.py:118] Thread-2[7f64329a2700]   LEAVE bleach.clean - 0.000288s elapsed
I0417 13:31:01.586838 140068322420480 timing.py:118] Thread-2[7f64329a2700] LEAVE markdown_to_safe_html - 0.000520s elapsed
```

This is only emitted when `--verbosity 0` is passed, not by default.

Co-authored-by: Nick Felt <nfelt@users.noreply.github.com>
wchargin-branch: timing-log-latency
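For illustration, a minimal combined decorator / context manager in this spirit might look like the sketch below (simplified: no thread IDs or nesting indentation, unlike the real `timing` module):

```python
# Simplified sketch of a `log_latency` that works both as a bare decorator
# and as `with log_latency("name"):`. Not the actual timing module.
import contextlib
import functools
import logging
import time

logger = logging.getLogger(__name__)


def log_latency(target):
    """Usable as `@log_latency` on a function or `with log_latency("name"):`."""
    if callable(target):
        name = getattr(target, "__qualname__", target.__name__)

        @functools.wraps(target)
        def wrapper(*args, **kwargs):
            with _region(name):
                return target(*args, **kwargs)

        return wrapper
    return _region(target)


@contextlib.contextmanager
def _region(name):
    start = time.time()
    logger.info("ENTER %s", name)
    try:
        yield
    finally:
        logger.info("LEAVE %s - %fs elapsed", name, time.time() - start)
```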
Summary:
By design, we only expose an API for converting Markdown to *safe* HTML.
But clients who call this method many times end up performing HTML
sanitization many times, which is expensive: about an order of magnitude
more expensive than Markdown conversion itself. This patch introduces a
new API that still only emits safe HTML but enables clients to combine
multiple input documents with only one round of sanitization.

Test Plan:
The `/data/plugin/text/text` route sees 40–60% speedup: on my machine,

  - from 0.38 ± 0.04 seconds to 0.211 ± 0.005 seconds on the “higher
    order tensors” text demo downsampled to 10 steps; and
  - from 5.3 ± 0.9 seconds to 2.1 ± 0.2 seconds on a Google-internal
    dataset with 32 Markdown cells per step downsampled to 100 steps.

wchargin-branch: text-batch-bleach
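As a hedged sketch (illustrative name and signature, not necessarily the real helper), the batched API boils down to converting each document separately and sanitizing the combined result once:

```python
# Sketch of batched Markdown -> safe HTML: bleach runs once over the
# combined document instead of once per cell.
import bleach
import markdown


def markdowns_to_safe_html(markdown_strings, combine):
    """Converts many Markdown documents and sanitizes the combined HTML once.

    Args:
      markdown_strings: list of raw Markdown source strings.
      combine: callable that joins the list of unsafe HTML fragments into a
        single unsafe HTML string (e.g., ``"".join``).
    """
    unsafe_htmls = [markdown.markdown(s) for s in markdown_strings]
    combined_unsafe = combine(unsafe_htmls)
    return bleach.clean(combined_unsafe)
```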
* Add projector plugin colab

* Adds a colab demo demonstrating how to use the projector plugin locally.

* Add embedding_projector image for colab

* Add colab to _book.yaml

* Add analysis to projector plugin

* Adding images for embedding projector colab

* Rename Projector plugin to Embedding projector
Bazel is now out of beta (no longer versioned 0.x). We want to depend on
1.x+ for better long-term support in the future.
* Motivation for features / changes
  * Continue developing DebuggerV2 plugin, specifically its GraphExecutionContainer that visualizes the details of the intra-graph execution events. 
* Technical description of changes
  * Add necessary actions, selectors, reducers and effects to support the lazy, paged loading of `GraphExecution`s
  * Add a `cdk-virtual-scroll-viewport` to `GraphExecutionComponent`
  * The lazy loading is triggered by scrolling events on the `cdk-virtual-scroll-viewport`
  * Displaying detailed debug-tensor summaries such as dtype, rank, shape, and numeric breakdown will be added in follow-up PRs. This PR only adds display of the tensor name and op type in the `cdk-virtual-scroll-viewport`.
* Screenshots of UI changes
  * Loaded state: ![image](https://user-images.githubusercontent.com/16824702/79614243-24f6e500-80ce-11ea-8b9f-412cb831a449.png)
  * Loading state (mat-spinner to be added in follow-up CLs): ![image](https://user-images.githubusercontent.com/16824702/79614131-e19c7680-80cd-11ea-8dad-2dfd4b998ada.png)
* Detailed steps to verify changes work correctly (as executed by you)
  * Unit tests added
  * Manual testing against logdirs with real tfdbg2 data of different sizes
An internal application would like to use the initialState for other
purposes. Expose it as a public API.
* Bump Bazel minimum version to 2.1.0 to match CI

* Remove obsolete .bazelrc settings
Notable changes:

- Angular 8 -> Angular 9
- zone.js upgrade
- ngrx @ 9
- TypeScript @ 3.8.3
- Karma @ 5
- Angular Ivy compiler (ngcc)
- Updated rules_nodejs to make it work with the Ivy compiler
- Use Node 11 in Travis

We used 1 as inspiration for our changes.

Other notable changes:

- window.require is a frozen object after the require.js upgrade

Identify tests in uploader_test.py that specifically target logic in _ScalarBatchedRequestSender and rewrite them to only use _ScalarBatchedRequestSender and none of the other layers in uploader.py.

This is a useful exercise for the work on _TensorBatchedRequestSender. I was having a difficult time using _ScalarBatchedRequestSender tests to establish a pattern for testing _TensorBatchedRequestSender.
We identified several expensive operations going on underneath Plottable:
- When we modified dataset() attached to a plot, Plottable recalculated
  the domain, which in turn triggered a redraw of the plot.
- Plot redraws are scheduled, but the repeated `clearTimeout` and
  `setTimeout` calls take considerable time and made up a large part of
  the dataset update.
- Changing a dataset caused us to call updateSpecialDataset N times (N =
  number of runs).

We addressed the issue by:
- adding a `commit` API that is invoked after making all the changes
- remembering which runs/datasets were changed
- updating datasets only once per commit

We did not address:
- autoDomain recomputing the bounds on every data change
  - there is no programmatic way to disable auto domain other than
    changing the domain, which causes, for instance, the zoom level to reset

Empirically, the render time:
- toggling a run on the selector went down from 1740ms to 273ms
- triggering the log scale went down from 315ms to 264ms
  - from ~670ms to ~620ms when measured from mouse up

Note that this change does not improve times for other charting
operations like zoom, smoothing changes, etc.
Remove standalone uploader binary target that shouldn't be used any longer.

No longer make //tensorboard/uploader a binary target.
Rename uploader_main to uploader_subcommand.
Clean up some BUILD target names by removing "_lib" suffix.
Summary:
TensorBoard typically reads from a logdir on disk, but the `--db_import`
and `--db` flags offered an experimental SQLite-backed read path. This
DB mode has always been clearly marked as experimental, and does not
work as advertised. For example, runs were broken in all dashboards.
A data set with *n* runs at top level would show as *n* runs in
TensorBoard, but all of them would have the name `.` and the same color,
and only one of them would actually render data. The images dashboard
was also completely broken and did not show images. And lots of plugins
just didn’t support DB mode at all. Furthermore, the existence of DB
mode is an active impediment to maintenance, and also accounts for some
of the few remaining test cases that require TensorFlow 1.x to run.

This patch removes support for DB mode from all plugins, removes the
`db_connection_provider` field from `TBContext`, and renders the `--db`
and `--db_import` flags functionally inoperative. The plumbing itself
will be removed in a separate commit; a modicum of care is required for
the `TensorBoardInfo` struct.

Test Plan:
Running with a standard demo logdir, TensorBoard still works fine with
both `--generic_data false` and `--generic_data true`.

wchargin-branch: nodb-functionality
The future _TensorBatchedRequestSender behaves similarly to the existing _ScalarBatchedRequestSender. This change refactors some of the less-trivial logic out from _ScalarBatchedRequestSender to be reused in _TensorBatchedRequestSender.

Refactor the logic for managing the request byte budget into its own class, and the logic for pruning empty runs and tags into its own function.

One test is improved to use a mocked _ByteBudgetManager.
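A hypothetical sketch of such a byte-budget manager (method names are illustrative, not the actual `_ByteBudgetManager` API):

```python
# Illustrative sketch of a request byte-budget manager.
class ByteBudgetManager:
    """Tracks how many bytes remain in the current upload request."""

    def __init__(self, max_request_bytes):
        self._max_request_bytes = max_request_bytes
        self._remaining = 0

    def reset(self, base_request_bytes):
        """Starts a new request, reserving space for its fixed overhead."""
        self._remaining = self._max_request_bytes - base_request_bytes

    def try_add(self, num_bytes):
        """Returns True (and debits the budget) if `num_bytes` still fits."""
        if num_bytes > self._remaining:
            return False
        self._remaining -= num_bytes
        return True
```

A sender built on this would flush the in-progress request and call `reset()` whenever `try_add()` returns False.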
When tooltipColumns become shorter after a change, we currently do not
remove the corresponding DOM nodes. This can lead to an awkward UX where a
column in the tooltip never updates and goes stale.

This change removes tooltip row column DOM nodes when they are no longer
shown.

FYI: the tooltip header was already properly removed before.
…ps (tensorflow#3544)

* Motivation for features / changes
  * This is a follow-up / clean-up task for adding Const ops to the set of instrumented ops by tfdbg2 (b/153722488)
* Technical description of changes
  * In debugger_v2_plugin_test.py, several assertions related to the exact set of ops executed inside tf.functions (Graphs) are updated to reflect the Const ops that weren't instrumented previously but are now.
  * Disabling tags for the task are removed from the BUILD file.
…ecution/data routes (tensorflow#3526)

* Motivation for features / changes
  * Fix a performance issue in the HTTP route /graph_execution/data
  * The issue has to do with the fact that `DebugDataMultiplexer.GraphExecutionData()` uses the inefficient approach of reading all the `GraphExecutionTrace`s first and then slicing the array. A much more efficient approach (analogous to `DebugDataMultiplexer.ExecutionData()` for top-level executions) is to use the range reading support of `DebugDataReader.graph_execution_traces()` that is now available.
  * Similarly, use the range reading support of `DebugDataReader.execution()` in the `/execution/data` route.
* Technical description of changes
  * See the description above. Use the range reading support of `DebugDataReader.execution()` and `DebugDataReader.graph_execution_traces()`.
  * Also, remove an extraneous element in the json response (`num_digests`). It is not in the design spec and was included due to oversight on my part.
* Detailed steps to verify changes work correctly (as executed by you)
  * No new unit tests are required to verify the correctness of the new implementation. The existing ones suffice. 
  * Manual testing in the web GUI to confirm much more efficient loading.
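Conceptually, the change is captured by the sketch below (parameter names are assumptions for illustration, not the exact `DebugDataReader` signature):

```python
# `reader` stands in for a DebugDataReader-like object.

def graph_execution_data_slow(reader, begin, end):
    # Old approach: materialize every trace, then slice the array.
    all_traces = list(reader.graph_execution_traces())
    return all_traces[begin:end]


def graph_execution_data_fast(reader, begin, end):
    # New approach: let the reader perform the range read directly.
    return reader.graph_execution_traces(begin=begin, end=end)
```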
…-execution components (tensorflow#3541)

* Motivation for features / changes
  * Let the relatively new `GraphExecutionComponent` display detailed information about debugger-instrumented tensors, such as dtype, rank, size, shape, breakdown of numerical values by type (inf, nan, zero, etc.), and whether inf/nan exists.
  * In the older `ExecutionDataComponent` for eager tensors, replace the old Angular code that serves a similar purpose with the same component as used by `GraphExecutionComponent`. This improves uniformity and reduces maintenance overhead going forward.
* Technical description of changes
  * Add type `DebugTensorValue` to store/debugger_types.ts. The new interface defines the possible types of data available from various `TensorDebugMode`s (an existing enum).
  * Add helper function `parseDebugTensorValue()` in a new file store/debug_tensor_value.ts to parse an array representation of TensorDebugMode-specific debugger-generated tensor data into a `DebugTensorValue`.
  * Add `DebugTensorValueComponent` and its various subcomponents in a new folder views/debug_tensor_value/
  * Use `DebugTensorValueComponent` from `GraphExecutionComponent` and `ExecutionDataComponent`.
* Screenshots of UI changes
  * CURT_HEALTH mode: ![image](https://user-images.githubusercontent.com/16824702/80056423-31c27100-84f2-11ea-91db-b441f94d11c8.png)
  * CONCISE_HEALTH mode: ![image](https://user-images.githubusercontent.com/16824702/80056486-5a4a6b00-84f2-11ea-8142-6c5e282f06c5.png)
  * SHAPE mode: ![image](https://user-images.githubusercontent.com/16824702/80056523-72ba8580-84f2-11ea-9126-2c27c9b922b1.png)
  * FULL_HEALTH mode: ![image](https://user-images.githubusercontent.com/16824702/80056591-94b40800-84f2-11ea-9592-fd61c53a2316.png)
* Detailed steps to verify changes work correctly (as executed by you)
  * Unit tests are added for the new helper method and new component and its subcomponents
  * Manual verification against logdirs with real tfdbg2 data.
…tensorflow#3550)

* Motivation for features / changes
  * Make it clearer in the StackFrameComponent which frame (if any) is being highlighted in the SourceCodeComponent.
* Technical description of changes
  * In StackFrameContainer's internal selector, add a boolean field to the return value called `focused`. It is used in the Angular template HTML to apply a CSS class (`.focused-stack-frame`) to the focused stack frame and only the focused stack frame.
  * While adding new unit test for the new feature, simplify existing ones by using `overrideSelector`.
  * To conform to style guide, replace `toEqual()` with `toBe()` where applicable in the touched unit tests.
* Screenshots of UI changes
  * ![image](https://user-images.githubusercontent.com/16824702/80288736-ed102300-8707-11ea-9638-7aa197a23ff6.png)
* Detailed steps to verify changes work correctly (as executed by you)
  * Unit test added
  * Existing unit tests for `StackFrameContainer` are simplified by replacing `setState()` with `overrideSelector()`.
RELEASE.md Outdated
available, so that it can be rendered via the Histograms plugin on
TensorBoard.dev
- May support additional plugins in the future, once those plugins are enabled
in the tensorboard.dev service.
Member

Capitalization: TensorBoard.dev

Contributor Author

Done.

@googlebot

A Googler has manually verified that the CLAs look good.

(Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.)

ℹ️ Googlers: Go here for more info.

RELEASE.md Outdated
- The `tensorboard dev upload` subcommand now sends the histograms, when
available, so that it can be rendered via the Histograms plugin on
TensorBoard.dev
- May support additional plugins in the future, once those plugins are enabled
Member

Since this section is ambiguous re uploader vs. server-side things, maybe be explicit: "This version of the uploader may support..."

Contributor Author

Done.

@googlebot

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@googlebot googlebot removed the cla: no label May 27, 2020
@caisq caisq merged commit a4bcc6b into tensorflow:2.2 May 27, 2020