
[graph] Use 0 as the minimum memory/time in the color scale. #437

Merged: 2 commits into tensorflow:master on Sep 4, 2017

Conversation

@ioeric (Contributor) commented on Aug 28, 2017

When calculating the color scale for the compute time, consider all nodes in the graph instead of just the nodes in the top-level graph.

My graph has a single sub-graph in the top-level graph that consumes all of the compute time (say X ms), so the color scale becomes "X ms -> X ms".

I'm not sure about the original intent behind using only the top-level graph; let me know if this is the right fix.

(I don't really know TypeScript. Please let me know if there is a more efficient way to get the extent of a map. :)

@ioeric (Contributor, Author) commented on Aug 28, 2017

@chihuahua Mind having a look at this?

@wchargin (Contributor) commented on Aug 28, 2017

Just passing by, but…

Please let me know if there is a more efficient way to get the extent of a map

The cleanest way is probably to take advantage of the fact that Math.min(2, 4, 1, 3, 5) === 1, so

// getNodeMap() returns an object keyed by node name, so map over its values with lodash.
const micros = _.map(this.hierarchy.getNodeMap(), node => node.micros);
const minMicros = Math.min(...micros);  // ES5: Math.min.apply(null, micros)
const maxMicros = Math.max(...micros);  // ES5: Math.max.apply(null, micros)

Note that Math.min() === Infinity and Math.max() === -Infinity, not null. (This is probably a good thing, but you can always guard against it with isFinite if you want the nulls.)
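For instance, a minimal guard along these lines (using the minMicros/maxMicros computed above):

// If the map was empty, fall back to nulls instead of ±Infinity.
const extent = isFinite(minMicros) && isFinite(maxMicros)
    ? [minMicros, maxMicros]
    : [null, null];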

Just for fun… if by “more efficient” you mean not “cleaner code” but “fewest comparisons,” then you need to compute the min and max simultaneously. This requires (3/2)n − 2 comparisons (for even n) and is provably optimal. Here is a simple algorithm (a code sketch follows the list):

  1. Group the elements x1, …, xn into pairs: (x1, x2), …, (x(n−1), xn). (If n is odd, you can pair (xn, xn), or modify the rest of the algorithm in the evident way.)
  2. Create a list X containing the smaller value from each pair and a list Y containing the larger value from each pair. This takes n/2 comparisons.
  3. Find the minimum of X and the maximum of Y; these must be the global minimum and maximum. This takes 2(n/2 − 1) = n − 2 comparisons.

But this is surely not worth implementing in practice: besides saving only about n/2 comparisons, it is probably slower due to caching anyway. :-)
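A minimal TypeScript sketch of that algorithm, just to make it concrete (minMax is a hypothetical helper, not code from this PR; it uses ~(3/2)n comparisons instead of ~2n):

function minMax(xs: number[]): [number, number] {
  let lo = Infinity;   // match Math.min() on no arguments
  let hi = -Infinity;  // match Math.max() on no arguments
  let i = 0;
  if (xs.length % 2 === 1) {
    // Odd count: seed with the first element so the rest pair up evenly.
    lo = hi = xs[0];
    i = 1;
  }
  for (; i + 1 < xs.length; i += 2) {
    let small = xs[i];
    let big = xs[i + 1];
    if (small > big) {  // 1 comparison orders the pair...
      [small, big] = [big, small];
    }
    if (small < lo) { lo = small; }  // ...and 2 more update the extremes.
    if (big > hi) { hi = big; }
  }
  return [lo, hi];
}

// Usage: const [minMicros, maxMicros] = minMax(micros);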

@chihuahua (Member) commented
@ioeric, could you actually break this up into 2 PRs? Thanks! :) It's a bit cleaner that way. For instance, if we later find the compute-time calculations to be broken, we won't also have to roll back the logic for naming nodes.

@ioeric (Contributor, Author) commented on Aug 29, 2017

@chihuahua Done. Pulled the strict name change into #440.

@ioeric (Contributor, Author) commented on Aug 29, 2017

@wchargin Thanks for the suggestion and analysis!

It seems that using the Math functions would require filtering (both nodeStats and micros can be null) plus two scanning passes. Maybe a hand-written loop isn't too bad after all?
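For example, a rough sketch of such a single-pass loop with lodash's _.each, which the review below also references (nodes and getMicros are illustrative stand-ins for the actual node map and accessor):

let minMicros = Infinity;
let maxMicros = -Infinity;
_.each(nodes, (node) => {
  const micros = getMicros(node);  // may be null when the node has no recorded stats
  if (micros == null) {
    return;  // skip nodes without compute-time information
  }
  minMicros = Math.min(minMicros, micros);
  maxMicros = Math.max(maxMicros, micros);
});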

@jart requested a review from chihuahua on Aug 29, 2017
@chihuahua (Member) left a comment

LGTM

let computeTimeExtent = d3.extent(topLevelGraph.nodes(),
    (nodeName, index) => {
      let node = topLevelGraph.node(nodeName);
      // Find the maximum and minimum compute time in the whole graph.
@chihuahua (Member):

Could you add a comment here noting that, furthermore, the total compute time for a metanode is the sum of the compute time of all nodes within it?

@ioeric (Contributor, Author):

I thought it was the max compute time among all nodes within it? At least that's what I saw in the graph I tested with.

@chihuahua (Member):

The logic within the _.each block below finds the max getTotalMicros throughout the graph. In turn, getTotalMicros for a group node is the sum of the compute times of all nodes underneath it:

* Sum of all children if it is a Group node. Null if it is unknown.

Come to think of it, I believe that means we don't need a complete graph traversal at all, because a top-level metanode must have a compute time at least as large as any of its children.

Maybe we can ignore metanodes? I.e.,

if (node.type === NodeType.META) {
  return;
}

@chihuahua (Member):

Actually, let's not ignore metanodes. They get colored too.

@chihuahua (Member):

You can simply iterate through the children of the root (the top-level nodes), because group nodes already include the compute times of their children (as before). I think this change could actually be reverted. Sorry, I didn't notice that earlier.
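A rough sketch of that suggestion (the root/metagraph accessors here are illustrative, not necessarily the exact hierarchy API):

// Only visit the root's immediate children; each group node's
// getTotalMicros() already aggregates everything beneath it.
const root = this.hierarchy.root;
const maxMicros = Math.max(...root.metagraph.nodes().map((name) => {
  const node = root.metagraph.node(name);
  // Treat missing or unknown stats as 0 so they never win the max.
  return (node.stats && node.stats.getTotalMicros()) || 0;
}));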

@ioeric (Contributor, Author):

But we also need the minimum to calculate the extent, right? Does the top-level graph have this information too?

@chihuahua (Member):

Ah, sorry, I was out for a while. I think the scale should just range from 0 to the max for the sake of graphical integrity. Otherwise, the colors might vary a lot even when the actual range in compute time is very small.

It's similar to the reason that I prefer bar charts to start at 0:
https://flowingdata.com/2015/08/31/bar-chart-baselines-start-at-zero/
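Concretely, that amounts to pinning the color scale's domain at zero, roughly like this (a minimal sketch using d3 v4's scaleLinear; maxMicros and the colors are illustrative):

// maxMicros: the largest compute time found in the graph (assumed computed earlier).
const computeTimeScale = d3.scaleLinear<string>()
    .domain([0, maxMicros])  // start at 0 rather than the observed minimum
    .range(['#fff5f0', '#67000d']);  // light for cheap nodes, dark for expensive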

@ioeric (Contributor, Author):

I don't really have a strong opinion as long as it works. I'll change the code to scale from 0.

@ioeric (Contributor, Author) commented on Sep 1, 2017

Ping. As mentioned in the inline comment, I still think the minimum compute time is not calculated correctly.

@ioeric changed the title from "[graph] Trying to fix node stats rendering issue." to "[graph] Use 0 as the minimum memory/time in the color scale." on Sep 4, 2017
@ioeric (Contributor, Author) commented on Sep 4, 2017

PTAL

@chihuahua (Member) left a comment

I tested, and it LGTM. Thank you for the fix.

@chihuahua merged commit 9df1cd4 into tensorflow:master on Sep 4, 2017
jart pushed a commit to jart/tensorboard that referenced this pull request on Sep 23, 2017 (…w#437):
Using 0 as the lower bound for the range seems more in line with graphical integrity.

jart pushed a commit that referenced this pull request on Sep 26, 2017:
Using 0 as the lower bound for the range seems more in line with graphical integrity.