Spike: Slow endpoints in Arc #926

Closed
chandadharap opened this issue Jan 22, 2015 · 16 comments

@chandadharap

No description provided.

@seanbrookes
Contributor

Had an initial hangout with Sam today to help frame up the spike.
Some notes:

  • the implementation is more sophisticated than the existing StrongOps one
  • ideally Chrome devtools will be a suitable 'view engine' for the feature.
  • We have a native JSON data prototype
  • Open to transforming it to a more consumable format for devtools
  • Need to deep dive into devtools timeline code to figure out the best way to map the api data to it.

@seanbrookes
Contributor

From the iojs tracing discussion: nodejs/node#671 (comment)

@seanbrookes
Contributor

Chrome tracing example GitHub repo: https://github.com/thlorenz/traceviewify

@seanbrookes
Contributor

kick-off email notes:
On Sun, Feb 15, 2015 at 10:49 PM, Chanda Dharap chanda@strongloop.com wrote:

Sean is concerned that Timeline view may not work. I have a spike for him so
he can evaluate implementation strategies and whether we use Timeline View
or not.

Ok. As a heads up, I asked Anthony, who I thought integrated the last
two dev-tools based displays (cpu and heap profiling), and he said it
was Miroslav who had imported the entire chrome dev tools UI into arc...
it's just that arc was disabling the display of those features that we
didn't need at the moment, like Timeline view.

Can you pass him the json that represents Slow end points?

An example of the agent-internal data format is here:
https://github.com/strongloop/strongops/pull/245#issue-53570095, along
with some fairly detailed docs on the meaning of the data.

Note that since our target is the Timeline, and having realized that
the timeline data format is actually documented
(https://docs.google.com/a/strongloop.com/document/d/1CvAClvFfyA5R-PhYUmn5OOQtYMH4h6I0nSsKchNAySU/edit#heading=h.xqopa5m0e28f),
I think it would make more sense for agent or supervisor to transmit
and store the data in .traceview format.

I think we should do a spike on transforming the data into the trace
format described above. I might take a peek at it today.

And yes... I know I said before we should just transmit the existing
data format... but that was before I knew there was a standard format,
and not just a bunch of internal-to-chrome data structures!

I also think we should implement trace-start/trace-stop (similar to
cpu-start/cpu-stop), possibly even with watchdog mode.

Sean, have you looked at Anthony's cpu-profile display tab, and how
it's using the Chrome dev tools?

I'm not sure what your concern is (other than the data format agent
tosses you); Timeline looks pretty similar to the other dev tools we
rehosted into arc.
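
A rough sketch of what the trace-start/trace-stop idea from the email above could look like, under stated assumptions: `TraceRecorder` and its method names are hypothetical and are not part of the agent, and watchdog mode is not modelled. The only thing taken from the Trace Event format is the shape of a complete (`ph: 'X'`) event, with `ts` and `dur` in microseconds.

```javascript
// Hypothetical recorder; class and method names are assumptions for illustration only.
class TraceRecorder {
  constructor() {
    this.active = false;
    this.events = [];
  }

  // Analogous to cpu-start: begin collecting, discarding anything older.
  start() {
    this.active = true;
    this.events = [];
  }

  // Record one timed span; ignored (returns false) unless tracing is active.
  record(name, startMs, durationMs) {
    if (!this.active) return false;
    this.events.push({
      name,
      ph: 'X',               // "complete" event: start time plus duration
      ts: startMs * 1000,    // Trace Event timestamps are in microseconds
      dur: durationMs * 1000,
      pid: 1,
      tid: 1,
    });
    return true;
  }

  // Analogous to cpu-stop: stop collecting and hand back the captured trace.
  stop() {
    this.active = false;
    const traceEvents = this.events;
    this.events = [];
    return { traceEvents };
  }
}

const recorder = new TraceRecorder();
recorder.record('ignored', 0, 5); // dropped: tracing has not started
recorder.start();
recorder.record('GET /api/users', 1000, 250);
const captured = recorder.stop();
```

Returning the whole trace from `stop()` mirrors how a profile is handed back in one piece when profiling ends; streaming events as they occur would be an equally plausible design.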

@sam-github
Contributor

chrome's approach to "tracing":

thorsten's tools, around converting various formats into the "trace event" format:

@chandadharap As I understand @seanbrookes's concerns now, it's that the Chrome Dev Tools in general are incredibly complex, giving the appearance of power, but possibly just offering complexity: both complexity in terms of integrating them, but worse, complexity in terms of user experience.

My concern is that it appears Google is playing hard at creating tools that can incorporate a wide range of time-based data, from stack traces, to what we call "metrics", to what we call "slow-endpoints" (they would call them traces). We have the opportunity to make a play here that will get us a tool that will allow agent to drive _many_ kinds of data into a single unified UI (rather than creating a UI for each thing we measure :-( ... ouch). It appears io.js/v8 may also be making a play towards exposing more runtime info as trace view format, so we shouldn't choose an incompatible direction.

@seanbrookes
Contributor

the more I read the more I'm coming around

@sam-github
Contributor

OK. A major open question is still what the similarities and differences are between the chrome://tracing view and the "Timeline" view in Dev Tools... do they share data input formats? Is chrome://tracing an internal tool, destined for Dev Tools once cleaned up? We appear to have a number of choices:

  1. do our own view, tailor made to our data
  2. use chrome://tracing
  3. use Timeline in devtools
  4. use concurix's (maybe an option)

I don't know if it's 2 vs 3, or if they are based on the same code, or... what.

@seanbrookes do you think you can figure that out? @bajtos, I wonder if you know?

@sam-github
Contributor

@chandadharap @seanbrookes one of the issues with evaluating the Google displays is that they only show internal chrome FE data (by default). I'd like to do a spike (or have @seanbrookes do this) where the slow endpoint data is converted to trace view format (there are a number of options) and loaded into the view, so we can see what our data would look like (rather than looking at what Chrome's data looks like).

Thoughts? @seanbrookes is this something you can take on?
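
A minimal sketch of the conversion proposed above, under stated assumptions. The input record shape (`endpoint`, `startMs`, `durationMs`, `calls`) is hypothetical and is not the agent-internal format from the strongops pull request; `toTraceEvents` and `completeEvent` are illustrative names. The output uses the JSON object form of the Trace Event format: a `traceEvents` array of complete (`ph: 'X'`) events with `ts` and `dur` in microseconds.

```javascript
// Hypothetical slow-endpoint records; the real agent format differs.
const sampleRecords = [
  {
    endpoint: 'GET /api/users',
    startMs: 1000,
    durationMs: 250,
    calls: [
      { name: 'mongodb.find', startMs: 1010, durationMs: 180 },
      { name: 'http.request', startMs: 1195, durationMs: 40 },
    ],
  },
];

// Build one complete ("X") event from a span with a start and a duration.
function completeEvent(name, cat, span, pid, tid) {
  return {
    name,
    cat,
    ph: 'X',
    ts: span.startMs * 1000,     // milliseconds -> microseconds
    dur: span.durationMs * 1000, // milliseconds -> microseconds
    pid,
    tid,
  };
}

// Each endpoint invocation gets its own tid so its nested calls share a row.
function toTraceEvents(records, pid = 1) {
  const traceEvents = [];
  records.forEach((record, index) => {
    const tid = index + 1;
    traceEvents.push(completeEvent(record.endpoint, 'endpoint', record, pid, tid));
    for (const call of record.calls || []) {
      traceEvents.push(completeEvent(call.name, 'call', call, pid, tid));
    }
  });
  return { traceEvents, displayTimeUnit: 'ms' };
}

const trace = toTraceEvents(sampleRecords);
console.log(JSON.stringify(trace, null, 2));
```

Saving the printed JSON to a file and loading it in chrome://tracing would be one way to see our own data in the view rather than Chrome's.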

@bajtos
Member

bajtos commented Feb 17, 2015

@sam-github I am not familiar with tracing/timeline views, can't help here :(

@sam-github
Contributor

Looks like timeline is more related to .cpuprofile, and tracing to the trace data structures... but that one can be converted to the other... hm. So they are separate choices. :-(

@altsang
Contributor

altsang commented Feb 17, 2015

@sam-github @seanbrookes
The point of first seeing whether Chrome Dev Tools (CDT) can be used is to save time and go to market quicker. Quicker, but not at the expense of UX and utility.
Also, if JS developers are already familiar with using CDT for their front-end work, this would feel very much the same.
The above options for converting to a trace view look fine to me. What's nice about that is that we can potentially leverage them for what we need to do around tracing. Note: I specifically brought up with Issac that we won't have a "trace"-like graphical experience (i.e. a flamegraph) if we move to Timeline within CDT, and that at best two different "paths" would be indented to show the parent path, but it may not be clear that they are completely unrelated to each other. He said this was fine.
If we think we can leverage one of the existing npm modules shown above, I'm all for it, especially if we can get to a flamegraph experience, but it sounds like there's more work to be done to transpose the data to a trace view?

@seanbrookes
Contributor

my first priority is to understand the data formats required by cdt and chrome://tracing

I poked around the concurix repos last night and was encouraged to see d3 (svg) code for their flame graphs

@sam-github
Contributor

nodejs/node#671 (comment) <---- @seanbrookes very useful context info on chrome://tracing vs. Timeline view

@chandadharap
Author

Extremely useful spike. Basically, if the Concurix integration goes through, @ijroth agrees that Traces will override Slow end-points.

It is still a valuable enough direction that we should backlog a Spike to understand the Traceview format and whether it would work for us. It appears io.js/v8 may also be making a play towards exposing more runtime info as trace view format. A Spike on the backlog would be helpful for the medium/long-term.

Created under scrum #191. Points back to detail on traceview here.

5 participants