This repository has been archived by the owner on Oct 8, 2024. It is now read-only.

remove stale opentelemetry_sdk packages (#151)
* remove stale opentelemetry_sdk packages

This hack mitigates an issue where the otel library's mechanism for discovering resource detectors raised an exception after a charm refresh, as described in canonical/grafana-agent-operator#146. That was caused by a juju/charmcraft issue (https://bugs.launchpad.net/juju/+bug/2058335) where, when upgrading packages, Juju leaves behind package metadata from the old versions. This caused otel's discovery mechanism to incorrectly use the old resource detector, leading to an exception.

The fix here is to detect, before the otel discovery mechanism fires, whether multiple "opentelemetry_sdk" distributions (packages) are present and, if so, to delete any that appear to be stale. A distribution is treated as stale when it presents no `entry_points`. This assumption held in testing, but it is not clear that it is universally valid.

* feat: add guard so otel package removal only happens on upgrade-charm hooks

* debug: ignore type errors

* fix: env variable used to guard removing stale otel package

* docs: add comment describing when patch can be removed

* pr comments

---------

Co-authored-by: Pietro Pasotti <starfire.daemon@gmail.com>
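
As an illustration of the detection rule described in the commit message, here is a minimal dry-run sketch that only reports the distributions the patch would consider stale and deletes nothing. The helper name `find_stale_distributions` and its `search_path` parameter are hypothetical and not part of the library. Like the commit, it relies on the private `_path` attribute of `importlib.metadata` distributions, which is an assumption about CPython's current implementation rather than a documented API.

```python
from importlib.metadata import distributions
from pathlib import Path
from typing import List, Optional


def find_stale_distributions(
    name: str, search_path: Optional[List[str]] = None
) -> List[Path]:
    """Report metadata directories for `name` that look stale. Deletes nothing.

    Hypothetical helper: mirrors the commit's heuristic that, when several
    distributions share a name, the ones exposing no entry points are stale.
    """
    kwargs = {"name": name}
    if search_path is not None:
        # Restrict discovery to these directories instead of sys.path.
        kwargs["path"] = search_path
    found = list(distributions(**kwargs))

    # A single distribution (or none) is the healthy case: nothing to report.
    if len(found) <= 1:
        return []

    stale = []
    for dist in found:
        if len(dist.entry_points) == 0:
            # `_path` is private; the commit uses it too (with a type: ignore).
            path = getattr(dist, "_path", None)
            if path is not None:
                stale.append(Path(path))
    return stale
```

Calling `find_stale_distributions("opentelemetry_sdk")` inside a charm's venv would list the candidate directories, which may be a safer first step than removing them when verifying that the entry-point heuristic holds for a given environment.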
ca-scribner and PietroPasotti authored Jul 30, 2024
1 parent d20e08b commit 44e715f
Showing 1 changed file with 31 additions and 1 deletion.
32 changes: 31 additions & 1 deletion lib/charms/tempo_k8s/v1/charm_tracing.py
@@ -176,8 +176,10 @@ def my_tracing_endpoint(self) -> Optional[str]:
import inspect
import logging
import os
import shutil
from contextlib import contextmanager
from contextvars import Context, ContextVar, copy_context
from importlib.metadata import distributions
from pathlib import Path
from typing import (
Any,
@@ -217,7 +219,7 @@ def my_tracing_endpoint(self) -> Optional[str]:
# Increment this PATCH version before using `charmcraft publish-lib` or reset
# to 0 if you are raising the major API version

-LIBPATCH = 11
+LIBPATCH = 12

PYDEPS = ["opentelemetry-exporter-otlp-proto-http==1.21.0"]

@@ -359,6 +361,30 @@ def _get_server_cert(
return server_cert


def _remove_stale_otel_sdk_packages():
    """Hack to remove stale opentelemetry sdk packages from the charm's python venv.

    See https://github.com/canonical/grafana-agent-operator/issues/146 and
    https://bugs.launchpad.net/juju/+bug/2058335 for more context. This patch can be removed after
    this juju issue is resolved and sufficient time has passed to expect most users of this library
    have migrated to the patched version of juju.

    This only does something if executed on an upgrade-charm event.
    """
    if os.getenv("JUJU_DISPATCH_PATH") == "hooks/upgrade-charm":
        logger.debug("Executing _remove_stale_otel_sdk_packages patch on charm upgrade")
        # Find any opentelemetry_sdk distributions
        otel_sdk_distributions = list(distributions(name="opentelemetry_sdk"))
        # If there is more than 1, inspect each and if it has 0 entrypoints, infer that it is stale
        if len(otel_sdk_distributions) > 1:
            for distribution in otel_sdk_distributions:
                if len(distribution.entry_points) == 0:
                    # Distribution appears to be empty. Remove it
                    path = distribution._path  # type: ignore
                    logger.debug(f"Removing empty opentelemetry_sdk distribution at: {path}")
                    shutil.rmtree(path)


def _setup_root_span_initializer(
charm_type: _CharmType,
tracing_endpoint_attr: str,
@@ -391,6 +417,10 @@ def wrap_init(self: CharmBase, framework: Framework, *args, **kwargs):
_service_name = service_name or f"{self.app.name}-charm"

unit_name = self.unit.name
# apply hacky patch to remove stale opentelemetry sdk packages on upgrade-charm.
# it could be trouble if someone ever decides to implement their own tracer parallel to
# ours and before the charm has inited. We assume they won't.
_remove_stale_otel_sdk_packages()
resource = Resource.create(
attributes={
"service.name": _service_name,

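The guard in `_remove_stale_otel_sdk_packages` compares the `JUJU_DISPATCH_PATH` environment variable to `hooks/upgrade-charm`. One way to make that check testable in isolation is sketched below. The function `is_upgrade_charm_hook` and the constant `UPGRADE_CHARM_DISPATCH_PATH` are hypothetical names introduced for illustration. Only the variable name and the compared value come from the diff above.

```python
import os
from typing import Mapping, Optional

# Value the commit compares JUJU_DISPATCH_PATH against.
UPGRADE_CHARM_DISPATCH_PATH = "hooks/upgrade-charm"


def is_upgrade_charm_hook(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the current dispatch looks like an upgrade-charm hook.

    Hypothetical helper: accepts an explicit mapping so tests do not need to
    mutate the real process environment.
    """
    env = os.environ if environ is None else environ
    return env.get("JUJU_DISPATCH_PATH") == UPGRADE_CHARM_DISPATCH_PATH
```

Passing the environment in explicitly keeps the check free of side effects. The cleanup would then be skipped on any other hook, or when the variable is unset.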