Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hard coding semconv.SchemaURL references, inconsistencies and brittleness #3657

Open
jufemaiz opened this issue Mar 28, 2023 · 6 comments
Open

Comments

@jufemaiz
Copy link

Hi y'all, hopefully this isn't the wrong place to raise this discussion.

I'm making use of the AWS detectors and it's the root cause of issues I'm facing, in part because the SchemaURL of the detector is hardcoded, and the OTEL specification explicitly negates cross-schema connection of resources.

Adopting a different semconv schema url in my application's instrumentation (in this instance v1.18.0) causes breaking to occur due to:

failed to merge resources: cannot merge resource due to conflicting Schema URL
Code details
	// OTEL resource.
	resource, err := sdkresource.Detect(ctx)
	if err != nil {
		log.Printf("failed to detect lambda resources: %s\n", err)
	}

	// AWS resource.
	awsResource, err := lambdadetector.NewResourceDetector().Detect(ctx)
	if err != nil {
		log.Printf("failed to detect lambda resources: %s\n", err)
	}

	resource, err = sdkresource.Merge(resource, awsResource)
	if err != nil {
		return nil, fmt.Errorf("failed to merge resources: %w", err)
	}

	// Add the deployment environment to the resource.
	deployEnv := os.Getenv("DEPLOYMENT_ENV")
	if deployEnv == "" {
		return nil, ErrDeploymentEnvNil
	}

	attrs := []attribute.KeyValue{
		semconv.DeploymentEnvironment(deployEnv),
		semconv.ServiceInstanceID(attrService.InstanceID),
		semconv.ServiceName(attrService.Name),
		semconv.ServiceNamespace(attrService.Namespace),
		semconv.ServiceVersion(attrService.Version),
	}
	deploymentResource := sdkresource.NewWithAttributes(semconv.SchemaURL, attrs...)

	resource, err = sdkresource.Merge(deploymentResource, resource)
	if err != nil {
		return nil, fmt.Errorf("failed to merge resources: %w", err)
	}

Now, while I can appreciate there is a level of alignment required to schema changes, there should be the opportunity to either "migrate" a resource to a different schema, or at the very least, the documentation around available detectors (or anything else with explicit semconv.SchemaURL references) should be documented.

open-telemetry/opentelemetry-go#2341

As it stands, the current most up to date release of the Schema URL is v1.18.0, and there's no possible way to ask a detector (as it stands) to return a resource with a defined schema (for example, as an attribute). Not only that, the schemas used in this repo also vary (it's not all v1.17.0), and dependency updates don't seem to really support patched releases (for example, security related) for older semconv schema urls.

I have a draft PR that I'll link shortly that removes the explicit semconv.SchemaURL references, however I'm wondering if that's considered "bad form" from an Open Telemetry for Golang "best practice".

Seeking guidance and advice (other than "use v1.17.0" – until something else breaks because the brittleness of the implementation of OTEL prevents mixed schema operation.

@MrAlias
Copy link
Contributor

MrAlias commented Mar 28, 2023

I have a draft PR that I'll link shortly that removes the explicit semconv.SchemaURL references, however I'm wondering if that's considered "bad form" from an Open Telemetry for Golang "best practice".

This does not sound like changes we can accept. The schema compatibility of a resource is something defined by the specification. Not checking that compatibility would be invalid behavior.

It sounds like schema translation is what is needed here.

@jufemaiz
Copy link
Author

jufemaiz commented Mar 29, 2023

This does not sound like changes we can accept. The schema compatibility of a resource is something defined by the specification. Not checking that compatibility would be invalid behavior.

Thanks for the feedback @MrAlias. I suspected that to be the case (regarding non-acceptance).

Equally so, should there not be any "opinion" on specific versions of the schema URL adopted by packages held within here – or at the very minimum, function arguments to select a responding resource with a specific schema URL?

Examples I would highlight include:

Both these explicitly make use of the schema URL v1.17.0, thereby rendering any attempt to make use of other detectors returned resources at another version.

As noted elsewhere, an approach that hard codes schema URLs in place within packages inhibits safely upgrading applications that make use of those packages as all parts of an application will need to be amended in a single hit. Further, without having branches + tags with management for different schema URLs, security management of dependencies of this package then is tied to schema URL management.

Thinking through this the last couple of weeks it screams at me that any function returning a resource should be required, by convention within the Go implementation of OTel, to provide the requested schema URL that is desired to be used in the returned resource. Is this something that the otel-go (contrib or core) management team been investigating?

I note that even the core library explicitly sets the expected resource schema its own detector uses.

https://github.com/open-telemetry/opentelemetry-go/blob/main/sdk/resource/container.go#L50

This would tend to indicate that any other schema is invalid according to the core lib for that particular release. It would also imply, wrt semantic versioning (at the least) that a shift to v1.18.0 schema URL would be accompanied by a major release bump. Yet as we see, those bumps do occur at the minor release level (the last schema URL before change to v1.17.0, which was v1.12.0).

@MrAlias
Copy link
Contributor

MrAlias commented May 13, 2023

Here's a new repo I created to handle schema translations: https://github.com/MrAlias/otel-schema-utils

Only thing supported currently is resource translations.

@ash2k
Copy link
Contributor

ash2k commented May 22, 2023

Similar issue open-telemetry/opentelemetry-go#3769

JamieDanielson pushed a commit to honeycombio/otel-config-go that referenced this issue Aug 14, 2023
## Which problem is this PR solving?

When using detectors with mis-matched Schema URLs, resource.New will
return an error. Previously, this was dropped and consequently it looked
like configuration worked but many resource attributes would be missing
in the resulting telemetry.

With the recent addition of
#48, this error is
much easier to hit. For example, the detectors built into
opentelemetry-go
[import](https://github.com/open-telemetry/opentelemetry-go/blob/main/sdk/resource/builtin.go#L25)
semconv `v1.17.0` but this project [pulls
in](https://github.com/honeycombio/otel-config-go/blob/main/otelconfig/otelconfig.go#L21)
`v1.18.0`. This means that adding something like
`otelconfig.WithResourceOption(resource.WithHost())` results in a
mis-configured OTel and no error to warn you what happened.

This is all … very bad. This PR is really the bare minimum that needs to
be done here. That is, it at least causes a runtime error so that you
can tell what's going on — but it doesn't solve the problem that you
fundamentally cannot use this library with _any_ other library that
doesn't happen to use semconv `v1.18.0`. There are [a
number](open-telemetry/opentelemetry-go#2341)
of
[open](open-telemetry/opentelemetry-go#3769)
[issues](open-telemetry/opentelemetry-go-contrib#3657)
in OTel and adjacent repositories regarding this problem, which probably
warrant further investigation into a long-term sustainable solution.

## Short description of the changes

Return the `err` from `resource.New` up through the stack. Update tests
to expect this.

⚠️ I guess technically this could be considered a backwards incompatible
change, in that folks may have configurations that are "working" today
but will throw an error after this change. I put that in scare quotes
though because it's not _really_ working — it's just not throwing an
error. I feel like actually returning an appropriate error here and
letting folks know their configuration is likely not working as expected
is the right choice.

## How to verify that this has the expected result

There's a new test that specifically uses a detector with a different
schema URL.

Co-authored-by: Vera Reynolds <vreynolds@users.noreply.github.com>
@jufemaiz
Copy link
Author

And another round of #footgun emerges :(

#4299

@jufemaiz
Copy link
Author

jufemaiz commented Oct 17, 2023

until something else breaks because the brittleness of the implementation of OTEL prevents mixed schema operation

Here is the full rundown of the differing semconvs hard coded in the current v1.20.0 release, which shows even a release here is incompatible with itself – even within a single "detector" grouping (gcp):

v1.4.0

  • instrgen/rtlib/rtlib.go

v1.17.0

  • detectors/gcp/cloud-function.go
  • detectors/gcp/cloud-run.go
  • detectors/gcp/gce.go
  • detectors/gcp/cloud-function_test.go
  • detectors/gcp/gke.go
  • instrumentation/gopkg.in/macaron.v1/otelmacaron/internal/semconvutil/httpconv.go
  • instrumentation/gopkg.in/macaron.v1/otelmacaron/internal/semconvutil/netconv.go
  • instrumentation/gopkg.in/macaron.v1/otelmacaron/macaron.go
  • instrumentation/net/http/otelhttp/handler.go
  • instrumentation/net/http/otelhttp/test/handler_test.go
  • instrumentation/net/http/otelhttp/example/server/server.go
  • instrumentation/net/http/otelhttp/example/client/client.go
  • instrumentation/net/http/otelhttp/internal/semconvutil/httpconv.go
  • instrumentation/net/http/otelhttp/internal/semconvutil/netconv.go
  • instrumentation/net/http/httptrace/otelhttptrace/example/server/server.go
  • instrumentation/net/http/httptrace/otelhttptrace/example/client/client.go
  • instrumentation/net/http/httptrace/otelhttptrace/clienttrace.go
  • instrumentation/net/http/httptrace/otelhttptrace/internal/semconvutil/httpconv.go
  • instrumentation/net/http/httptrace/otelhttptrace/internal/semconvutil/netconv.go
  • instrumentation/net/http/httptrace/otelhttptrace/httptrace.go
  • instrumentation/net/http/httptrace/otelhttptrace/httptrace_test.go
  • instrumentation/host/example/main.go
  • instrumentation/runtime/example/main.go
  • instrumentation/google.golang.org/grpc/otelgrpc/config.go
  • instrumentation/google.golang.org/grpc/otelgrpc/test/grpc_test.go
  • instrumentation/google.golang.org/grpc/otelgrpc/test/interceptor_test.go
  • instrumentation/google.golang.org/grpc/otelgrpc/test/grpc_stats_handler_test.go
  • instrumentation/google.golang.org/grpc/otelgrpc/internal/parse_test.go
  • instrumentation/google.golang.org/grpc/otelgrpc/internal/parse.go
  • instrumentation/google.golang.org/grpc/otelgrpc/interceptor.go
  • instrumentation/google.golang.org/grpc/otelgrpc/stats_handler.go
  • instrumentation/google.golang.org/grpc/otelgrpc/semconv.go
  • instrumentation/go.mongodb.org/mongo-driver/mongo/otelmongo/mongo.go
  • instrumentation/github.com/gorilla/mux/otelmux/mux.go
  • instrumentation/github.com/gorilla/mux/otelmux/internal/semconvutil/httpconv.go
  • instrumentation/github.com/gorilla/mux/otelmux/internal/semconvutil/netconv.go
  • instrumentation/github.com/gin-gonic/gin/otelgin/internal/semconvutil/httpconv.go
  • instrumentation/github.com/gin-gonic/gin/otelgin/internal/semconvutil/netconv.go
  • instrumentation/github.com/gin-gonic/gin/otelgin/gintrace.go
  • instrumentation/github.com/labstack/echo/otelecho/internal/semconvutil/httpconv.go
  • instrumentation/github.com/labstack/echo/otelecho/internal/semconvutil/netconv.go
  • instrumentation/github.com/labstack/echo/otelecho/echo.go
  • instrumentation/github.com/emicklei/go-restful/otelrestful/internal/semconvutil/httpconv.go
  • instrumentation/github.com/emicklei/go-restful/otelrestful/internal/semconvutil/netconv.go
  • instrumentation/github.com/emicklei/go-restful/otelrestful/restful.go
  • internal/shared/semconvutil/httpconv.go.tmpl
  • internal/shared/semconvutil/netconv.go.tmpl

v1.21.0

  • detectors/gcp/detector.go
  • detectors/gcp/detector_test.go
  • detectors/aws/ecs/test/ecs_test.go
  • detectors/aws/ecs/ecs_test.go
  • detectors/aws/ecs/ecs.go
  • detectors/aws/eks/detector.go
  • detectors/aws/eks/detector_test.go
  • detectors/aws/lambda/detector.go
  • detectors/aws/lambda/detector_test.go
  • detectors/aws/ec2/ec2_test.go
  • detectors/aws/ec2/ec2.go
  • instrumentation/github.com/aws/aws-sdk-go-v2/otelaws/attributes.go
  • instrumentation/github.com/aws/aws-sdk-go-v2/otelaws/aws.go
  • instrumentation/github.com/aws/aws-sdk-go-v2/otelaws/attributes_test.go
  • instrumentation/github.com/aws/aws-sdk-go-v2/otelaws/sqsattributes_test.go
  • instrumentation/github.com/aws/aws-sdk-go-v2/otelaws/sqsattributes.go
  • instrumentation/github.com/aws/aws-sdk-go-v2/otelaws/dynamodbattributes.go
  • instrumentation/github.com/aws/aws-lambda-go/otellambda/test/lambda_test.go
  • instrumentation/github.com/aws/aws-lambda-go/otellambda/lambda.go

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants