Missing histogram measurements in Prometheus Remote Write Spec #14890

Closed
Neles786 opened this issue Feb 26, 2024 · 10 comments · Fixed by #14952
Labels
bug (unexpected problem or unintended behavior) · help wanted (request for community participation, code, contribution) · size/s (1 day effort, great beginner issue)

Comments

@Neles786

Relevant telegraf.conf

[agent]
  interval      = "2s"
  omit_hostname = true
  flush_interval = "4s"

[[inputs.http_listener_v2]]
  service_address = ":1234"
  paths = ["/receive"]
  data_format = "prometheusremotewrite"

[[processors.starlark]]
source = '''
def apply(metric):
  if metric.name == "prometheus_remote_write":
      for k, v in metric.fields.items():
          metric.name = k
          metric.fields["value"] = v
          metric.fields.pop(k)
  return metric
'''

[[outputs.file]]
  files = ["stdout"]
  data_format = "influx"

Logs from Telegraf

2024-02-26T12:35:26Z I! Tags enabled: 
2024-02-26T12:35:26Z I! [agent] Config: Interval:2s, Quiet:false, Hostname:"", Flush Interval:4s
2024-02-26T12:35:26Z D! [agent] Initializing plugins
2024-02-26T12:35:26Z D! [agent] Connecting outputs
2024-02-26T12:35:26Z D! [agent] Attempting connection to [outputs.file]
2024-02-26T12:35:26Z D! [agent] Successfully connected to outputs.file
2024-02-26T12:35:26Z D! [agent] Starting service inputs
2024-02-26T12:35:26Z I! [inputs.http_listener_v2] Listening on [::]:1234
k6_http_reqs_total,expected_response=true,method=GET,name=test,proto=HTTP/1.1,scenario=default,status=200,testrun=testrun-test,url=test,workflow=CustomRun value=100 1708950930462000000
k6_checks_rate,check=GET\ Request,scenario=default,testrun=testrun-test,workflow=CustomRun value=1 1708950930462000000
k6_iterations_total,scenario=default,testrun=testrun-test,workflow=CustomRun value=100 1708950930467000000
k6_data_sent_total,scenario=default,testrun=testrun-test,workflow=CustomRun value=8000 1708950930467000000
k6_data_sent_total,group=::setup,testrun=testrun-test,workflow=CustomRun value=0 1708950929764000000
k6_data_received_total,group=::setup,testrun=testrun-test,workflow=CustomRun value=0 1708950929764000000
k6_http_req_failed_rate,expected_response=true,method=GET,name=test,proto=HTTP/1.1,scenario=default,status=200,testrun=testrun-test,url=test,workflow=CustomRun value=0 1708950930462000000
k6_data_received_total,scenario=default,testrun=testrun-test,workflow=CustomRun value=85300 1708950930467000000
2024-02-26T12:35:30Z D! [outputs.file] Wrote batch of 8 metrics in 2.166625ms

System info

Telegraf 1.28.3, macOS sonoma v14.3.1

Docker

No response

Steps to reproduce

  1. Write any k6 script, or pick one from the docs.
  2. Run the script with native histograms enabled while Telegraf is running in parallel with the config above (this example uses 100 iterations):
     K6_PROMETHEUS_RW_SERVER_URL=http://localhost:1234/receive \
     K6_PROMETHEUS_RW_TREND_AS_NATIVE_HISTOGRAM=true \
     k6 run -o experimental-prometheus-rw script.js -i 100

Expected behavior

The output should have included all the native histogram metrics as described here

Actual behavior

The measurements in the Telegraf stdout are only of type counter and rate: six distinct measurements, none of them histograms. With multiple VUs, two more measurements, k6_vus and k6_vus_max, are received.

Additional info

Ref. Link for telegraf config

@Neles786 Neles786 added the bug unexpected problem or unintended behavior label Feb 26, 2024
@powersj
Contributor

powersj commented Feb 26, 2024

It looks like the request read by the parser is of type prompb.WriteRequest. We loop through the Timeseries elements and look at the labels and samples, but not the histograms, which is why they are missing.

Next steps: extend the prometheusremotewrite parser to loop through ts.Histograms and add one or more metrics containing the count, sum, and various intervals/spans. We need to discuss how this looks so we are consistent with existing metrics. Currently, in the metric building process, all metrics are called prometheus_remote_write and have a single field based on the metric name.

That would look something like the following:

prometheus_remote_write,... metric_name_sum=1
prometheus_remote_write,... metric_name_count=1
prometheus_remote_write,...quantile=1 metric_name=1
...

But we need to consider how the quantiles fit into this.
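
As a rough illustration of the layout proposed above, here is a minimal Python sketch (not the parser's actual Go code). The `Histogram` class, the `histogram_to_lines` helper, and the `le` tag key for buckets are all hypothetical; `le` is borrowed from classic Prometheus histograms as one possible answer to the open question of how buckets should be tagged. Tag and field escaping is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Histogram:
    """Hypothetical, simplified stand-in for one decoded histogram."""
    count: float
    total: float  # the histogram's "sum"
    # upper bound -> number of observations in that bucket
    buckets: dict[float, float] = field(default_factory=dict)


def histogram_to_lines(name, tags, hist, bucket_tag="le"):
    """Render one histogram as line-protocol-like strings (no timestamps)."""
    measurement = "prometheus_remote_write"
    tag_str = "".join(f",{k}={v}" for k, v in sorted(tags.items()))
    lines = [
        f"{measurement}{tag_str} {name}_sum={hist.total}",
        f"{measurement}{tag_str} {name}_count={hist.count}",
    ]
    # One line per bucket; the bucket boundary goes into a tag (assumption).
    for upper, value in sorted(hist.buckets.items()):
        lines.append(f"{measurement}{tag_str},{bucket_tag}={upper} {name}={value}")
    return lines


if __name__ == "__main__":
    h = Histogram(count=3, total=4.5, buckets={1.0: 1, 2.0: 2})
    for line in histogram_to_lines("metric_name", {"scenario": "default"}, h):
        print(line)
```

Keeping the bucket boundary in a tag leaves the field name equal to the metric name, which matches how the existing single-field metrics are built.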

@powersj powersj added help wanted Request for community participation, code, contribution size/s 1 day effort, great beginniner issue labels Feb 26, 2024
@powersj
Contributor

powersj commented Feb 28, 2024

@Neles786,

It looks like there are a couple of different methods to draw the info from, and I was hoping to come up with a test case I could use to ensure this is working. I have put up #14907, which will print out any histograms to stdout. Would you be willing to grab that and collect some data for me?

Artifacts that you can use will be attached to the PR in 20-30mins.

Thanks!

@Neles786
Author

Neles786 commented Mar 5, 2024

Hi @powersj, thanks for the PR artifacts. I tested them and got the debug logs as follows:
[Screenshot 2024-03-05 at 5 14 01 PM]
The only thing is that the metrics are needed in InfluxDB line protocol for pushing to InfluxDB.

@powersj
Contributor

powersj commented Mar 5, 2024

Thanks for this! That helps a lot. I wasn't sure which fields are used, given the variety of spans and deltas. I can look at creating a test case and start working on a PR.
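
For readers unfamiliar with the spans and deltas mentioned here, the following is a minimal Python sketch of how they expand into per-bucket counts. It assumes the integer native histogram encoding, where each span is an (offset, length) pair (the first offset is the starting bucket index, later offsets are gaps after the previous span) and each delta is the change in count relative to the previous bucket. The function name `decode_buckets` is hypothetical and this is not the code from the PR.

```python
def decode_buckets(spans, deltas):
    """Expand (offset, length) spans and delta-encoded counts.

    Returns {bucket_index: count} with non-cumulative counts.
    """
    buckets = {}
    index = 0   # current bucket index
    count = 0   # running absolute count, rebuilt from the deltas
    pos = 0     # position in the deltas list
    for offset, length in spans:
        index += offset          # skip the gap before this span
        for _ in range(length):
            count += deltas[pos]
            buckets[index] = count
            pos += 1
            index += 1
    return buckets


if __name__ == "__main__":
    # Two buckets starting at index 0, then a gap of 3, then one bucket.
    print(decode_buckets([(0, 2), (3, 1)], [2, -1, 4]))  # {0: 2, 1: 1, 5: 5}
```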

@powersj
Contributor

powersj commented Mar 7, 2024

@Neles786 there are some new artifacts on #14952. I think this is a start, but I'm not 100% certain if I have the ranges for the buckets correct. Let me know what you think!

If you do think something is not quite right, I've printed out the ranges to stdout, that might help me or you point out what should be taking place.
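
One way to sanity-check the printed ranges is to compute them independently. The sketch below assumes the standard exponential bucketing of Prometheus native histograms, where a schema n gives base = 2^(2^-n) and positive bucket i covers the range (base^(i-1), base^i]. The function name `bucket_bounds` is hypothetical, and the zero bucket and negative buckets are not handled.

```python
def bucket_bounds(index, schema):
    """Return (lower_exclusive, upper_inclusive) for a positive bucket."""
    base = 2.0 ** (2.0 ** -schema)
    return base ** (index - 1), base ** index


if __name__ == "__main__":
    # With schema 0 the base is 2, so bucket 1 covers (1, 2].
    print(bucket_bounds(1, 0))  # (1.0, 2.0)
```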

Thanks!

@Neles786
Author

Neles786 commented Mar 8, 2024

Hi @powersj, the metric names and tags are missing:
Screenshot 2024-03-08 at 2 38 15 PM

@powersj
Contributor

powersj commented Mar 8, 2024

the metric names, tags are missing,

Are you basing that on the printed output above? That was only printing the buckets and values.

If you use the outputs.file output do you see metrics?

Thanks!

@Neles786
Author

Neles786 commented Mar 10, 2024

It was a configuration mistake on my side; all the metrics are there. Just a suggestion: the debug output for the ranges should also include the metric name, otherwise following them will be difficult.

@Neles786
Author

One more thing: technically, all histogram metrics should be cumulative frequencies. Right now the metric values are non-cumulative. This is the only change required.
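
To make the difference concrete, here is a minimal Python sketch that turns per-bucket counts into cumulative counts. It assumes the buckets are already ordered by ascending upper bound; the function name `to_cumulative` is hypothetical.

```python
from itertools import accumulate


def to_cumulative(bucket_counts):
    """Turn per-bucket counts (ascending upper bound) into running totals."""
    return list(accumulate(bucket_counts))


if __name__ == "__main__":
    per_bucket = [2, 0, 5, 3]
    print(to_cumulative(per_bucket))  # [2, 2, 7, 10]
```

With cumulative values, the last bucket equals the histogram's total count, which makes an easy consistency check.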

@powersj
Contributor

powersj commented Mar 12, 2024

Good catch! I've pushed an update and new artifacts should be available in 20-30mins. Thanks again!


2 participants