Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Redesign of Asynchronous Instrument Registration API via the Meter #3519

Closed
MrAlias opened this issue Dec 6, 2022 · 8 comments · Fixed by #3584
Closed

Redesign of Asynchronous Instrument Registration API via the Meter #3519

MrAlias opened this issue Dec 6, 2022 · 8 comments · Fixed by #3584
Assignees
Labels
area:metrics Part of OpenTelemetry Metrics bug Something isn't working pkg:API Related to an API package
Milestone

Comments

@MrAlias
Copy link
Contributor

MrAlias commented Dec 6, 2022

Currently the metric API provides the following registration API for asynchronous instruments via a meter:

RegisterCallback(insts []instrument.Asynchronous, function func(context.Context)) error

Also, the OpenTelemetry specification states ...

Multiple-instrument Callbacks MUST be associated at the time of registration with a declared set of asynchronous instruments from the same Meter instance. This requirement that Instruments be declaratively associated with Callbacks allows an SDK to execute only those Callbacks that are necessary to evaluate instruments that are in use by a configured View.

This API allows for callbacks to observe values for instruments they are not registered for, and allows for the callback to be associated with instruments from different Meter instances.

The SDK tries to prevent this by embedding a unique token in the passed context.

ctx = context.WithValue(ctx, produceKey, struct{}{})

That producer key is not unique to a Meter meaning it does not prevent instruments from different meters from being updated. Furthermore, it means a users has to pass that context to the Observe methods they invoke. This latter point is not explicitly stated and is a potential frustration point for users that use a different context here.

Proposal

Similar to the callback design proposed in #3507, having the callback return a set of observations for the instruments it is registered with will ensure only those instruments are updated and that they are valid.

type Meter interface {
	// [...]
	RegisterCallback(f Callback, insts ...instrument.Asynchronous) error
}

type Callback func(context.Context) ([]Int64Observation, []Float64Observation, error)

type Int64Observation struct {
	Instrument instrument.Asynchronous
	Measurement Int64Measurement
}

type Int64Measurement struct {
	Attributes []attribute.KeyValue
	Value int64
}

type Float64Observation struct {
	Instrument instrument.Asynchronous
	Measurement Float64Measurement
}

type Float64Measurement struct {
	Attributes []attribute.KeyValue
	Value float64
}
@MrAlias MrAlias added bug Something isn't working pkg:API Related to an API package area:metrics Part of OpenTelemetry Metrics labels Dec 6, 2022
@MrAlias MrAlias moved this to Triage Needed in Go: Metric API (GA) Dec 6, 2022
@MrAlias MrAlias moved this from Triage Needed to Todo in Go: Metric API (GA) Dec 7, 2022
@MrAlias MrAlias moved this to Todo in Go: Metric SDK (Beta) Dec 7, 2022
@MadVikingGod
Copy link
Contributor

So in this approach, I see two things that are potential problems.

  1. The user is not supposed to use the Observe() within the callbacks. This will cause confusion, and I think is counter to the intent of the API.
  2. This will require ALL callbacks to allocate additional slices.

This additionally doesn't prevent users from using an instrument from a different meter.

@MrAlias
Copy link
Contributor Author

MrAlias commented Dec 9, 2022

The user is not supposed to use the Observe() within the callbacks. This will cause confusion, and I think is counter to the intent of the API.

The API specifically states this as a valid callback design:

Return a list (or tuple, generator, enumerator, etc.) of individual Measurement values.

It is also how other implementations, e.g. python, are designed.

I don't follow how this isn't the intent of the API. Can you elaborate?

This additionally doesn't prevent users from using an instrument from a different meter.

If the SDK is the one doing the observation, why can it not verify the instrument belongs to the appropriate meter in the process?

@MrAlias
Copy link
Contributor Author

MrAlias commented Dec 10, 2022

I'm looking into an alternate proposal that uses the runtime.Caller function to ensure the callback is in the call-stack when Observe is called.

@MrAlias
Copy link
Contributor Author

MrAlias commented Dec 10, 2022

Alternate proposal to use the runtime.Caller function to ensure the Observe method is called from a Meter:

https://go.dev/play/p/uF-Utm0S5h6

type Meter struct {
	name          string
	registrations []registration
}

func NewMeter(name string) *Meter { return &Meter{name: name} }

func (m *Meter) Instrument() *Instrument {
	return &Instrument{
		meter:   m.name,
		callers: make([]uintptr, 8),
	}
}

func (m *Meter) Register(f Callback, instruments ...*Instrument) error {
	for _, i := range instruments {
		if i.meter != m.name {
			return fmt.Errorf("instrument from another meter: %s", i.meter)
		}
	}
	m.registrations = append(m.registrations, registration{
		Instrument: instruments,
		Callback:   f,
	})
	return nil
}

func (m *Meter) Collect() error {
	for _, reg := range m.registrations {
		if err := reg.collect(); err != nil {
			return err
		}
	}
	return nil
}

type Callback func() error

type registration struct {
	Instrument []*Instrument
	Callback   Callback
}

func (r registration) collect() error {
	pc, _, _, ok := runtime.Caller(1)
	if !ok {
		return errors.New("failed to get program counter")
	}
	for _, i := range r.Instrument {
		i.caller = pc
	}
	err := r.Callback()
	for _, i := range r.Instrument {
		i.caller = 0
	}
	return err
}

func contains(pc uintptr, frames *runtime.Frames) bool {
	for {
		frame, more := frames.Next()
		if pc == frame.PC {
			return true
		}
		if !more {
			break
		}
	}
	return false
}

type Instrument struct {
	meter string

	caller  uintptr // TODO: locking and multiple callers.
	callers []uintptr
}

func (i *Instrument) Observe() error {
	if i.caller == 0 {
		return errors.New("Observe must be called from a registered callback")
	}
	var stackIdx int
	for {
		n := runtime.Callers(1+stackIdx, i.callers)
		if contains(i.caller, runtime.CallersFrames(i.callers)) {
			// Called from a reservation.
			fmt.Println("Instrument observed")
			return nil
		}
		if n != len(i.callers) {
			break
		}
		stackIdx += n
	}
	return errors.New("Observe must be called from the callback it is registered with")
}

@MadVikingGod
Copy link
Contributor

There seems to be a very high overhead cost to this approach.

I think we can expect the users to use the context in the callback in the Observe calls. Could we do something simpler like use a random number when the meter is created and pass that in the context?

@MrAlias
Copy link
Contributor Author

MrAlias commented Dec 15, 2022

There seems to be a very high overhead cost to this approach.

Can you elaborate on this? What were your findings?

@MrAlias
Copy link
Contributor Author

MrAlias commented Jan 5, 2023

From #3373

Idiomatic APIs for multiple-instrument Callbacks MUST distinguish the instrument associated with each observed Measurement value.

Our API currently does not do this. It assumes the association is correctly done within a Callback by a user.

This seems to motivate us resolving the redesign here to ensure we do comply with this.

@MrAlias
Copy link
Contributor Author

MrAlias commented Jan 7, 2023

From #3380

It is RECOMMENDED that the API authors use one of the following forms for the callback function:

  • The list (or tuple, etc.) returned by the callback function contains (Instrument, Measurement) pairs.
  • the Observable Result parameter receives an additional (Instrument, Measurement) pairs

We do not follow this recommendation from the specification.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:metrics Part of OpenTelemetry Metrics bug Something isn't working pkg:API Related to an API package
Projects
No open projects
Status: Done
Status: Done
Development

Successfully merging a pull request may close this issue.

3 participants