Skip to content
This repository has been archived by the owner on Jan 28, 2021. It is now read-only.

Save performance metrics #137

Closed
jfontan opened this issue Mar 25, 2018 · 6 comments
Closed

Save performance metrics #137

jfontan opened this issue Mar 25, 2018 · 6 comments
Assignees
Labels
enhancement New feature or request performance Performance improvements proposal proposal for new additions or changes

Comments

@jfontan
Copy link
Contributor

jfontan commented Mar 25, 2018

It would be good to have saved performance metrics that we can use to understand the changes. As mentioned in #36 these metrics can be saved in Session or in Context.

Some of the data we can save is:

  • Time spent executing nodes
  • Memory consumption before and after node execution
  • Cache hit/miss
  • Child nodes to pinpoint slower paths

The proposal is adding a Performance interface to base session or context that can be used to save this data. It's done as an interface so we can have several implementations, some can get more info than others or disable collecting data in case it's too expensive.

type Performance interface {
	// Called once per query to reset the counters
	Reset()

	// Marks the start of one node execution, returns an id as there can be several
	// nodes running at the same time
	Start(nodeName, arguments string) (id int)

	// Marks the start of one node execution
	End(id int)

	// This can be used to save other values like cache hit and miss rates
	ExtraData(id int, name string, value int)

	// Gets the collected data in some structured format like json
	GetData() string
}

Then all the nodes have to be instrumented to save this data:

func (ib *IsBinary) Eval(
	ctx *sql.Context,
	row sql.Row,
) (interface{}, error) {
	// Start collecting data
	perID := ctx.Performance.Start("IsBinary", ib.String())
	defer ctx.Performance.End(perID)

	v, err := ib.Child.Eval(ctx, row)

	[...]

	blob, err := sql.Blob.Convert(v)
	if err != nil {
		return nil, err
	}

	// Add extra information that may be useful
	ctx.Performance.ExtraData(perID, "blob size", blob.Size())

	return isBinary(blob.([]byte)), nil
}

Extra tooling should be created to read and analyze this data. It would also be interesting to automate running tests with old and new code (for example master and PR) and comparing the data to find regressions.

Thoughts? @ajnavarro, @erizocosmico, @mcarmonaa

@jfontan jfontan added the proposal proposal for new additions or changes label Mar 25, 2018
@ajnavarro
Copy link
Contributor

ajnavarro commented Mar 26, 2018

I like the idea and it fits on the gitbase backlog. Just a couple of changes:

  • The name of the interface should be something like Metrics
  • The first implementation of that interface should be using open metrics format, on that case using https://github.com/jaegertracing/jaeger
  • Check deeply the interface methods comparing them with the jaeger API and be sure that we`ll able to use all the functionality that we want to use from jaeger.

@erizocosmico
Copy link
Contributor

erizocosmico commented Mar 26, 2018

Instead of jaeger, we could use opencensus (https://github.com/census-instrumentation/opencensus-go), which has exporters for jaeger and many more, if we want to avoid vendor lock-in.

OpenCensus already provides most of the functionality that's described in that interface, so we would not need to reinvent the wheel.

@ajnavarro
Copy link
Contributor

@smola , what do you think about use opencensus instead of jaeger? for me is ok.

@smola
Copy link
Collaborator

smola commented Mar 26, 2018

Note that jaeger provides OpenTracing API, and if you limit yourself to that, it's not vendor-locking. On the other hand, OpenCensus has its own standard and implementations only for Golang and Java (so far). Which might be good for us. It also provides wider APIs (e.g. metrics).

Whether we go with OpenTracing or OpenCensus, we can choose Jaeger/Zipkin/Other as server.

@erizocosmico
Copy link
Contributor

Which one did we ended up choosing?

@erizocosmico erizocosmico self-assigned this Mar 27, 2018
@ajnavarro
Copy link
Contributor

Jaeger because is compatible with OpenTracin API

erizocosmico added a commit to erizocosmico/go-mysql-server that referenced this issue Mar 28, 2018
Fixes src-d#137

This commit adds tracing of spans of execution in all nodes and expressions.

Since expressions and functions evaluation is immediate, spans are finished inside the `Eval` function with a defer.
In the case of plan nodes it's a bit more complex, as they just return an iterator that will be lazily evaluated. Measuring the span as just the time it takes to return the iterator would be wrong, so what it does is wrap all iterators with a `spanIter`, which iterates over an internal iterator and finishes the span when it's needed (there is an error, io.EOF is returned or `Close` is called) pushing some metrics like the number of processed rows or the error without needing the user to provide them.

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>
erizocosmico added a commit to erizocosmico/go-mysql-server that referenced this issue Mar 28, 2018
Fixes src-d#137

This commit adds tracing of spans of execution in all nodes and expressions.

Since expressions and functions evaluation is immediate, spans are finished inside the `Eval` function with a defer.
In the case of plan nodes it's a bit more complex, as they just return an iterator that will be lazily evaluated. Measuring the span as just the time it takes to return the iterator would be wrong, so what it does is wrap all iterators with a `spanIter`, which iterates over an internal iterator and finishes the span when it's needed (there is an error, io.EOF is returned or `Close` is called) pushing some metrics like the number of processed rows or the error without needing the user to provide them.

Signed-off-by: Miguel Molina <miguel@erizocosmi.co>
@erizocosmico erizocosmico added enhancement New feature or request performance Performance improvements labels Mar 28, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request performance Performance improvements proposal proposal for new additions or changes
Projects
None yet
Development

No branches or pull requests

4 participants