[EXPORTER] Prometheus: Add unit to names, convert to word #2213

Merged: 17 commits, Nov 22, 2023
Changes from 10 commits
@@ -5,8 +5,10 @@

#include <prometheus/metric_family.h>
#include <string>
#include <unordered_map>
#include <vector>
#include "opentelemetry/metrics/provider.h"
#include "opentelemetry/nostd/string_view.h"
#include "opentelemetry/sdk/metrics/meter.h"
#include "opentelemetry/version.h"

@@ -28,7 +30,7 @@ class PrometheusExporterUtils
* @param records a collection of metrics in OpenTelemetry
* @return a collection of translated metrics that is acceptable by Prometheus
*/
- static std::vector<::prometheus::MetricFamily> TranslateToPrometheus(
+ static std::unordered_map<std::string, ::prometheus::MetricFamily> TranslateToPrometheus(
const sdk::metrics::ResourceMetrics &data);

private:
@@ -41,6 +43,79 @@ class PrometheusExporterUtils
*/
static std::string SanitizeNames(std::string name);

static std::string MapToPrometheusName(const std::string &name,
const std::string &unit,
::prometheus::MetricType prometheus_type);

/**
* A utility function that returns the equivalent Prometheus name for the provided OTLP metric
* unit.
*
* @param raw_metric_unitName The raw metric unit for which Prometheus metric unit needs to be
* computed.
* @return the computed Prometheus metric unit equivalent of the OTLP metric unit.
*/
static std::string GetEquivalentPrometheusUnit(const std::string &raw_metric_unitName);

/**
* This method retrieves the expanded Prometheus unit name for known abbreviations. OTLP metrics
* use the c/s notation as specified at <a href="https://ucum.org/ucum.html">UCUM</a>. The list of
* mappings is adopted from <a
* href="https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/9a9d4778bbbf242dba233db28e2fbcfda3416959/pkg/translator/prometheus/normalize_name.go#L30">OpenTelemetry
* Collector Contrib</a>.
*
* @param unit_abbreviation The unit name that needs to be expanded/converted to Prometheus
* units.
* @return The expanded/converted unit name if known, otherwise returns the input unit name as-is.
*/
static std::string GetPrometheusUnit(const std::string &unit_abbreviation);

/**
* This method retrieves the expanded Prometheus unit name to be used with "per" units for known
* units. For example: s => per second (singular)
*
* @param per_unit_abbreviation The unit abbreviation used in a 'per' unit.
* @return The expanded unit equivalent to be used in 'per' unit if the input is a known unit,
* otherwise returns the input as-is.
*/
static std::string GetPrometheusPerUnit(const std::string &per_unit_abbreviation);

/**
* Replaces all characters that are not a letter or a digit with '_' to make the resulting string
* Prometheus compliant. This method also removes leading and trailing underscores - this is done
* to keep the resulting unit similar to what is produced from the collector's implementation.
*
* @param str The string input that needs to be made Prometheus compliant.
* @return the cleaned-up Prometheus compliant string.
*/
static std::string CleanUpString(const std::string &str);

/**
* This method is used to convert the units expressed as a rate via '/' symbol in their name to
* their expanded text equivalent. For instance, km/h => km_per_hour. The method operates on the
* input by splitting it in 2 parts - before and after '/' symbol and will attempt to expand any
* known unit abbreviation in both parts. Unknown abbreviations & unsupported characters will
* remain unchanged in the final output of this function.
*
* @param rate_expressed_unit The rate unit input that needs to be converted to its text
* equivalent.
* @return The text equivalent of unit expressed as rate. If the input does not contain '/', the
* function returns it as-is.
*/
static std::string ConvertRateExpressedToPrometheusUnit(const std::string &rate_expressed_unit);

/**
* This method drops all characters enclosed within '{}' (including the curly braces) by replacing
* them with an empty string. Note that this method will not produce the intended effect if there
* are nested curly braces within the outer enclosure of '{}'.
*
* <p>For instance, {packet{s}s} => s}.
*
* @param unit The input unit from which text within curly braces needs to be removed.
* @return The resulting unit after removing the text within '{}'.
*/
static std::string RemoveUnitPortionInBraces(const std::string &unit);

static opentelemetry::sdk::metrics::AggregationType getAggregationType(
const opentelemetry::sdk::metrics::PointType &point_type);

2 changes: 1 addition & 1 deletion exporters/prometheus/src/collector.cc
@@ -39,7 +39,7 @@ std::vector<prometheus_client::MetricFamily> PrometheusCollector::Collect() cons
reader_->Collect([&result](sdk::metrics::ResourceMetrics &metric_data) {
auto prometheus_metric_data = PrometheusExporterUtils::TranslateToPrometheus(metric_data);
for (auto &data : prometheus_metric_data)
- result.emplace_back(data);
+ result.emplace_back(data.second);
Review comment (Member):

Would it be better to use std::move(data.second) here? ::prometheus::MetricFamily may contain a lot of data, and copying it could hurt performance.
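
A minimal sketch of what this suggestion could look like. It assumes a stand-in `Family` struct in place of ::prometheus::MetricFamily and a hypothetical helper named `AppendFamilies`; neither is part of this PR. Because the loop variable is a non-const reference, `std::move` on the mapped value performs a real move rather than a silent copy.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for ::prometheus::MetricFamily so the sketch compiles on its own.
struct Family
{
  std::string name;
  std::vector<double> samples;
};

// Hypothetical helper: moves every mapped value out of `translated` and appends it to
// `result`. The map is taken by rvalue reference because its values are left in a
// moved-from state, so the caller should not reuse them.
void AppendFamilies(std::unordered_map<std::string, Family> &&translated,
                    std::vector<Family> &result)
{
  result.reserve(result.size() + translated.size());
  for (auto &entry : translated)
  {
    // Only the key of a map entry is const; entry.second can be moved from.
    result.emplace_back(std::move(entry.second));
  }
  translated.clear();
}
```

The move is only safe here because the translated map is a temporary that is discarded right after the loop.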

return true;
});
collection_lock_.unlock();
209 changes: 185 additions & 24 deletions exporters/prometheus/src/exporter_utils.cc
@@ -1,7 +1,9 @@
// Copyright The OpenTelemetry Authors
// SPDX-License-Identifier: Apache-2.0

#include <regex>
#include <sstream>
#include <string>
#include <utility>
#include <vector>
#include "prometheus/metric_family.h"
@@ -27,35 +29,42 @@ namespace metrics
* @param records a collection of metrics in OpenTelemetry
* @return a collection of translated metrics that is acceptable by Prometheus
*/
- std::vector<prometheus_client::MetricFamily> PrometheusExporterUtils::TranslateToPrometheus(
- const sdk::metrics::ResourceMetrics &data)
+ std::unordered_map<std::string, prometheus_client::MetricFamily>
+ PrometheusExporterUtils::TranslateToPrometheus(const sdk::metrics::ResourceMetrics &data)
{

// initialize output vector
- std::vector<prometheus_client::MetricFamily> output;
+ std::unordered_map<std::string, prometheus_client::MetricFamily> output;
Review comment (Member):

Can we call output.reserve() first to reduce rehashing?
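
One way this could be done, sketched with simplified, hypothetical stand-ins (`MetricData`, `ScopeMetrics`, `Family`, `Translate`) for the SDK and Prometheus types. The total metric count is only an upper bound on the number of families, since metrics that map to the same Prometheus name share one entry.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-ins for the SDK and Prometheus types.
struct MetricData
{
  std::string name;
};
struct ScopeMetrics
{
  std::vector<MetricData> metric_data_;
};
struct Family
{
  std::string name;
};

std::unordered_map<std::string, Family> Translate(const std::vector<ScopeMetrics> &scopes)
{
  // Count all metrics up front; this is an upper bound on the number of map entries.
  std::size_t upper_bound = 0;
  for (const auto &scope : scopes)
  {
    upper_bound += scope.metric_data_.size();
  }

  std::unordered_map<std::string, Family> output;
  // Size the bucket array once so the insertions below do not trigger a rehash.
  output.reserve(upper_bound);

  for (const auto &scope : scopes)
  {
    for (const auto &metric : scope.metric_data_)
    {
      // emplace leaves an existing entry untouched, so duplicate names are merged.
      output.emplace(metric.name, Family{metric.name});
    }
  }
  return output;
}
```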


for (const auto &instrumentation_info : data.scope_metric_data_)
{
for (const auto &metric_data : instrumentation_info.metric_data_)
{
- auto origin_name = metric_data.instrument_descriptor.name_;
- auto unit = metric_data.instrument_descriptor.unit_;
- auto sanitized = SanitizeNames(origin_name);
- prometheus_client::MetricFamily metric_family;
- metric_family.name = sanitized + "_" + unit;
- metric_family.help = metric_data.instrument_descriptor.description_;
- auto time = metric_data.end_ts.time_since_epoch();
+ if (metric_data.point_data_attr_.empty())
+ {
+ continue;
+ }
+ auto time = metric_data.end_ts.time_since_epoch();
+ auto front = metric_data.point_data_attr_.front();
+ auto kind = getAggregationType(front.point_data);
+ bool is_monotonic = true;
+ if (kind == sdk::metrics::AggregationType::kSum)
+ {
+ is_monotonic = nostd::get<sdk::metrics::SumPointData>(front.point_data).is_monotonic_;
+ }
+ const prometheus_client::MetricType type = TranslateType(kind, is_monotonic);
+ auto mf_name = MapToPrometheusName(metric_data.instrument_descriptor.name_,
+ metric_data.instrument_descriptor.unit_, type);
+ auto emp_res = output.emplace(std::make_pair(mf_name, prometheus_client::MetricFamily{}));
+ auto *metric_family = &emp_res.first->second;
+ if (emp_res.second)
+ {
+ metric_family->name = mf_name;
+ metric_family->type = type;
+ metric_family->help = metric_data.instrument_descriptor.description_;
+ }
for (const auto &point_data_attr : metric_data.point_data_attr_)
{
- auto kind = getAggregationType(point_data_attr.point_data);
- bool is_monotonic = true;
- if (kind == sdk::metrics::AggregationType::kSum)
- {
- is_monotonic =
- nostd::get<sdk::metrics::SumPointData>(point_data_attr.point_data).is_monotonic_;
- }
- const prometheus_client::MetricType type = TranslateType(kind, is_monotonic);
- metric_family.type = type;
if (type == prometheus_client::MetricType::Histogram) // Histogram
{
auto histogram_point_data =
@@ -72,7 +81,7 @@ std::vector<prometheus_client::MetricFamily> PrometheusExporterUtils::TranslateT
sum = nostd::get<int64_t>(histogram_point_data.sum_);
}
SetData(std::vector<double>{sum, (double)histogram_point_data.count_}, boundaries, counts,
- point_data_attr.attributes, time, &metric_family);
+ point_data_attr.attributes, time, metric_family);
}
else if (type == prometheus_client::MetricType::Gauge)
{
@@ -82,14 +91,14 @@ std::vector<prometheus_client::MetricFamily> PrometheusExporterUtils::TranslateT
auto last_value_point_data =
nostd::get<sdk::metrics::LastValuePointData>(point_data_attr.point_data);
std::vector<metric_sdk::ValueType> values{last_value_point_data.value_};
- SetData(values, point_data_attr.attributes, type, time, &metric_family);
+ SetData(values, point_data_attr.attributes, type, time, metric_family);
}
else if (nostd::holds_alternative<sdk::metrics::SumPointData>(point_data_attr.point_data))
{
auto sum_point_data =
nostd::get<sdk::metrics::SumPointData>(point_data_attr.point_data);
std::vector<metric_sdk::ValueType> values{sum_point_data.value_};
- SetData(values, point_data_attr.attributes, type, time, &metric_family);
+ SetData(values, point_data_attr.attributes, type, time, metric_family);
}
else
{
@@ -105,7 +114,7 @@ std::vector<prometheus_client::MetricFamily> PrometheusExporterUtils::TranslateT
auto sum_point_data =
nostd::get<sdk::metrics::SumPointData>(point_data_attr.point_data);
std::vector<metric_sdk::ValueType> values{sum_point_data.value_};
- SetData(values, point_data_attr.attributes, type, time, &metric_family);
+ SetData(values, point_data_attr.attributes, type, time, metric_family);
}
else
{
@@ -115,7 +124,6 @@ std::vector<prometheus_client::MetricFamily> PrometheusExporterUtils::TranslateT
}
}
}
- output.emplace_back(metric_family);
}
}
return output;
@@ -167,6 +175,159 @@ std::string PrometheusExporterUtils::SanitizeNames(std::string name)
return name;
}

std::regex INVALID_CHARACTERS_PATTERN("[^a-zA-Z0-9]");
std::regex CHARACTERS_BETWEEN_BRACES_PATTERN("\\{(.*?)\\}");
std::regex SANITIZE_LEADING_UNDERSCORES("^_+");
std::regex SANITIZE_TRAILING_UNDERSCORES("_+$");
std::regex SANITIZE_CONSECUTIVE_UNDERSCORES("[_]{2,}");

std::string PrometheusExporterUtils::GetEquivalentPrometheusUnit(
const std::string &raw_metric_unit_name)
{
if (raw_metric_unit_name.empty())
{
return raw_metric_unit_name;
}

std::string converted_metric_unit_name = RemoveUnitPortionInBraces(raw_metric_unit_name);
converted_metric_unit_name = ConvertRateExpressedToPrometheusUnit(converted_metric_unit_name);

return CleanUpString(GetPrometheusUnit(converted_metric_unit_name));
}

std::string PrometheusExporterUtils::GetPrometheusUnit(const std::string &unit_abbreviation)
{
static std::unordered_map<std::string, std::string> units{// Time
{"d", "days"},
{"h", "hours"},
{"min", "minutes"},
{"s", "seconds"},
{"ms", "milliseconds"},
{"us", "microseconds"},
{"ns", "nanoseconds"},
// Bytes
{"By", "bytes"},
{"KiBy", "kibibytes"},
{"MiBy", "mebibytes"},
{"GiBy", "gibibytes"},
{"TiBy", "tibibytes"},
{"KBy", "kilobytes"},
{"MBy", "megabytes"},
{"GBy", "gigabytes"},
{"TBy", "terabytes"},
{"By", "bytes"},
{"KBy", "kilobytes"},
{"MBy", "megabytes"},
{"GBy", "gigabytes"},
{"TBy", "terabytes"},
// SI
{"m", "meters"},
{"V", "volts"},
{"A", "amperes"},
{"J", "joules"},
{"W", "watts"},
{"g", "grams"},
// Misc
{"Cel", "celsius"},
{"Hz", "hertz"},
{"1", ""},
{"%", "percent"}};
auto res_it = units.find(unit_abbreviation);
if (res_it == units.end())
{
return unit_abbreviation;
}
return res_it->second;
}

std::string PrometheusExporterUtils::GetPrometheusPerUnit(const std::string &per_unit_abbreviation)
{
static std::unordered_map<std::string, std::string> per_units{
{"s", "second"}, {"m", "minute"}, {"h", "hour"}, {"d", "day"},
{"w", "week"}, {"mo", "month"}, {"y", "year"}};
auto res_it = per_units.find(per_unit_abbreviation);
if (res_it == per_units.end())
{
return per_unit_abbreviation;
}
return res_it->second;
}

std::string PrometheusExporterUtils::RemoveUnitPortionInBraces(const std::string &unit)
{
return std::regex_replace(unit, CHARACTERS_BETWEEN_BRACES_PATTERN, "");
}
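
The pattern used here is non-greedy, so with nested braces it stops at the first '}' and leaves a remainder such as "s}" for "{packet{s}s}". A depth-counting loop is one possible alternative. This is only a sketch and not what the PR implements; the function name `RemoveBracedPortions` is hypothetical.

```cpp
#include <string>

// Hypothetical alternative to the regex-based RemoveUnitPortionInBraces: drops
// everything inside '{...}', including nested braces, by tracking brace depth.
std::string RemoveBracedPortions(const std::string &unit)
{
  std::string result;
  result.reserve(unit.size());
  int depth = 0;
  for (char c : unit)
  {
    if (c == '{')
    {
      ++depth;  // entering an annotation (possibly nested)
    }
    else if (c == '}' && depth > 0)
    {
      --depth;  // leaving the innermost open annotation
    }
    else if (depth == 0)
    {
      result.push_back(c);  // outside any braces: keep the character
    }
  }
  return result;
}
```

Note one behavioral difference from the regex: an unmatched '{' causes the rest of the input to be dropped, whereas the regex leaves such input unchanged.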

std::string PrometheusExporterUtils::ConvertRateExpressedToPrometheusUnit(
const std::string &rate_expressed_unit)
{
size_t pos = rate_expressed_unit.find("/");
if (pos == std::string::npos)
{
return rate_expressed_unit;
}

std::vector<std::string> rate_entities;
rate_entities.push_back(rate_expressed_unit.substr(0, pos));
rate_entities.push_back(rate_expressed_unit.substr(pos + 1));

if (rate_entities[1].empty())
{
return rate_expressed_unit;
}

std::string prometheus_unit = GetPrometheusUnit(rate_entities[0]);
std::string prometheus_per_unit = GetPrometheusPerUnit(rate_entities[1]);

return prometheus_unit + "_per_" + prometheus_per_unit;
}

std::string PrometheusExporterUtils::CleanUpString(const std::string &str)
{
std::string cleaned_string = std::regex_replace(str, INVALID_CHARACTERS_PATTERN, "_");
cleaned_string = std::regex_replace(cleaned_string, SANITIZE_CONSECUTIVE_UNDERSCORES, "_");
cleaned_string = std::regex_replace(cleaned_string, SANITIZE_TRAILING_UNDERSCORES, "");
cleaned_string = std::regex_replace(cleaned_string, SANITIZE_LEADING_UNDERSCORES, "");

return cleaned_string;
}

std::string PrometheusExporterUtils::MapToPrometheusName(
const std::string &name,
const std::string &unit,
prometheus_client::MetricType prometheus_type)
{
auto sanitized_name = SanitizeNames(name);
std::string prometheus_equivalent_unit = GetEquivalentPrometheusUnit(unit);

// Append prometheus unit if not null or empty.
if (!prometheus_equivalent_unit.empty() &&
sanitized_name.find(prometheus_equivalent_unit) == std::string::npos)
{
sanitized_name += "_" + prometheus_equivalent_unit;
}

// Special case - counter
if (prometheus_type == prometheus_client::MetricType::Counter)
{
auto t_pos = sanitized_name.rfind("_total");
bool ends_with_total = t_pos == sanitized_name.size() - 6;
Review comment (Contributor):

Will this work if the metric has a unit? For example, for a metric foo.bar.total with unit s, will I get foo_bar_total_seconds_total or foo_bar_seconds_total?

Reply (Member, Author):

@dashpole the output will be foo_bar_total_seconds_total.

Reply (Contributor):

Ideally, you would trim _total from counters before appending the unit, and then always add _total after the unit for counters.
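
A minimal sketch of the ordering suggested above, assuming the name has already been sanitized and the unit already converted to its Prometheus form. `EndsWith` and `BuildCounterName` are hypothetical helpers, not functions from this PR. The explicit length check in `EndsWith` also sidesteps an edge case in the `rfind`-based test: for a five-character name without the suffix, `rfind` returns `npos` and `size() - 6` wraps around to the same value, so the comparison would wrongly succeed.

```cpp
#include <string>

// Hypothetical helper: true if `value` ends with `suffix`.
bool EndsWith(const std::string &value, const std::string &suffix)
{
  return value.size() >= suffix.size() &&
         value.compare(value.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical helper: builds a counter name from an already-sanitized name and an
// already-converted Prometheus unit (which may be empty).
std::string BuildCounterName(std::string name, const std::string &unit)
{
  static const std::string kTotal = "_total";

  // 1. Trim an existing "_total" suffix so the unit lands in front of it.
  if (EndsWith(name, kTotal))
  {
    name.erase(name.size() - kTotal.size());
  }

  // 2. Append the unit unless it is empty or already part of the name.
  if (!unit.empty() && name.find(unit) == std::string::npos)
  {
    name += "_" + unit;
  }

  // 3. Counters always end with "_total".
  return name + kTotal;
}
```

With this ordering, a sanitized name of foo_bar_total with unit seconds yields foo_bar_seconds_total.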

if (!ends_with_total)
{
sanitized_name += "_total";
}
}

// Special case - gauge
if (unit == "1" && prometheus_type == prometheus_client::MetricType::Gauge &&
sanitized_name.find("ratio") == std::string::npos)
{
sanitized_name += "_ratio";
}

return CleanUpString(SanitizeNames(sanitized_name));
}

metric_sdk::AggregationType PrometheusExporterUtils::getAggregationType(
const metric_sdk::PointType &point_type)
{