feat: New Intel Baseband Accelerator Input Plugin #13397

Merged
merged 5 commits into from
Jun 9, 2023
5 changes: 5 additions & 0 deletions plugins/inputs/all/intel_baseband.go
@@ -0,0 +1,5 @@
//go:build !custom || inputs || inputs.intel_baseband

package all

import _ "github.com/influxdata/telegraf/plugins/inputs/intel_baseband" // register plugin
127 changes: 127 additions & 0 deletions plugins/inputs/intel_baseband/README.md
@@ -0,0 +1,127 @@
# Intel Baseband Accelerator Input Plugin

The Intel Baseband Accelerator Input Plugin collects metrics from both dedicated and
integrated Intel devices that provide Wireless Baseband hardware acceleration.
These devices play a key role in accelerating 5G and 4G Virtualized Radio Access
Network (vRAN) workloads, increasing the overall compute capacity of
commercial off-the-shelf platforms.

Intel Baseband devices integrate various features critical for 5G and
LTE (Long Term Evolution) networks, including:

- Forward Error Correction (FEC) processing,
- 4G Turbo FEC processing,
- 5G Low Density Parity Check (LDPC),
- a Fast Fourier Transform (FFT) block providing DFT/iDFT processing offload
  for the 5G Sounding Reference Signal (SRS).

Supported hardware:

- Intel® vRAN Boost integrated accelerators:
  - 4th Gen Intel® Xeon® Scalable processor with Intel® vRAN Boost (also known as Sapphire Rapids Edge Enhanced / SPR-EE)
- External expansion cards connected to the PCI bus:
  - Intel® vRAN Dedicated Accelerator ACC100 SoC (code named Mount Bryce)
  - Intel® vRAN Dedicated Accelerator ACC101 SoC (code named Mount Cirrus)

## Prerequisites

- Intel Baseband device installed and configured.
- Minimum Linux kernel version required is 5.7.
- [pf-bb-config](https://github.com/intel/pf-bb-config) (version >= v23.03) installed and running.

For more information regarding system configuration, see the DPDK
installation guides:

- [Intel® vRAN Boost Poll Mode Driver (PMD)][VRB1]
- [Intel® ACC100 and ACC101 5G/4G FEC Poll Mode Drivers][ACC100/ACC101]

[VRB1]: https://doc.dpdk.org/guides/bbdevs/vrb1.html#installation
[ACC100/ACC101]: https://doc.dpdk.org/guides/bbdevs/acc100.html#installation
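
The kernel prerequisite above can be checked programmatically. The following is a minimal sketch, not part of the plugin: `kernelAtLeast` is a hypothetical helper, and the release string is assumed to have the usual `major.minor[.patch][-suffix]` shape reported by `uname -r`.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// kernelAtLeast is a hypothetical helper (not part of the plugin). It reports
// whether a kernel release string such as "5.15.0-76-generic" is at least
// major.minor.
func kernelAtLeast(release string, major, minor int) (bool, error) {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("unexpected kernel release %q", release)
	}
	gotMajor, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, fmt.Errorf("parsing major version of %q: %w", release, err)
	}
	// The minor component may carry a suffix (e.g. "7-rc1"); keep leading digits only.
	minorDigits := parts[1]
	notDigit := func(r rune) bool { return r < '0' || r > '9' }
	if i := strings.IndexFunc(minorDigits, notDigit); i >= 0 {
		minorDigits = minorDigits[:i]
	}
	gotMinor, err := strconv.Atoi(minorDigits)
	if err != nil {
		return false, fmt.Errorf("parsing minor version of %q: %w", release, err)
	}
	if gotMajor != major {
		return gotMajor > major, nil
	}
	return gotMinor >= minor, nil
}

func main() {
	// On Linux the release string can be obtained with `uname -r`;
	// a literal is used here to keep the sketch portable.
	ok, err := kernelAtLeast("5.15.0-76-generic", 5, 7)
	fmt.Println(ok, err)
}
```

The check compares only the major and minor components, which is all the stated minimum of 5.7 requires.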

## Global configuration options <!-- @/docs/includes/plugin_config.md -->

In addition to the plugin-specific configuration settings, plugins support
additional global and plugin configuration settings. These settings are used to
modify metrics, tags, and field or create aliases and configure ordering, etc.
See the [CONFIGURATION.md][CONFIGURATION.md] for more details.

[CONFIGURATION.md]: ../../../docs/CONFIGURATION.md#plugins

## Configuration

```toml @sample.conf
# Intel Baseband Accelerator Input Plugin collects metrics from both dedicated and integrated
# Intel devices that provide Wireless Baseband hardware acceleration.
# This plugin ONLY supports Linux.
[[inputs.intel_baseband]]
## Path to socket exposed by pf-bb-config for CLI interaction (mandatory).
## In version v23.03 of pf-bb-config the path is created according to the schema:
## "/tmp/pf_bb_config.0000\:<b>\:<d>.<f>.sock" where 0000\:<b>\:<d>.<f> is the PCI device ID.
socket_path = ""

## Path to log file exposed by pf-bb-config with telemetry to read (mandatory).
## In version v23.03 of pf-bb-config the path is created according to the schema:
## "/var/log/pf_bb_cfg_0000\:<b>\:<d>.<f>.log" where 0000\:<b>\:<d>.<f> is the PCI device ID.
log_file_path = ""

## Specifies plugin behavior regarding unreachable socket (which might not have been initialized yet).
## Available choices:
## - error: Telegraf will return an error on startup if socket is unreachable
## - ignore: Telegraf will ignore error regarding unreachable socket on both startup and gather
# unreachable_socket_behavior = "error"

## Duration that defines how long the connected socket client will wait for
## a response before terminating connection.
## Since it's local socket access to a fast packet processing application, the timeout should
## be sufficient for most users.
## Setting the value to 0 disables the timeout (not recommended).
# socket_access_timeout = "1s"

## Duration that defines maximum time plugin will wait for pf-bb-config to write telemetry to the log file.
## Timeout may differ depending on the environment.
## Must be equal or larger than 50ms.
# wait_for_telemetry_timeout = "1s"
```
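
As an illustration of the two path schemas described in the sample configuration, the sketch below derives `socket_path` and `log_file_path` from a PCI address. It rests on assumptions: `pfBBConfigPaths` is a hypothetical helper, the address `0000:f7:00.0` is made up, and the backslashes in the schema are taken to be escapes for `:` rather than part of the file name. Verify the actual file names on your system before relying on it.

```go
package main

import "fmt"

// pfBBConfigPaths is a hypothetical helper (not part of the plugin). It builds
// the socket and log file paths following the schema documented for
// pf-bb-config v23.03.
// Assumption: the "\:" sequences in the documented schema are escapes for ':'
// and do not appear in the real file names.
func pfBBConfigPaths(pciAddr string) (socketPath, logFilePath string) {
	socketPath = fmt.Sprintf("/tmp/pf_bb_config.%s.sock", pciAddr)
	logFilePath = fmt.Sprintf("/var/log/pf_bb_cfg_%s.log", pciAddr)
	return socketPath, logFilePath
}

func main() {
	// "0000:f7:00.0" is an illustrative address in <domain>:<b>:<d>.<f> form.
	sock, logFile := pfBBConfigPaths("0000:f7:00.0")
	fmt.Printf("socket_path = %q\n", sock)
	fmt.Printf("log_file_path = %q\n", logFile)
}
```

The printed values can be pasted into the two mandatory options of `[[inputs.intel_baseband]]`.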

## Metrics

Depending on the version of the Intel Baseband device and of pf-bb-config,
only a subset of the following measurements may be exposed.

**The following tags and fields are supported by the Intel Baseband plugin:**

| Tag | Description |
|-------------|-------------------------------------------------------------|
| `metric` | Type of metric: "code_blocks", "data_bytes", "per_engine". |
| `operation` | Type of operation: "5GUL", "5GDL", "4GUL", "4GDL", "FFT". |
| `vf` | Virtual Function number. |
| `engine` | Engine number. |

| Metric name (field) | Description |
|----------------------|-------------------------------------------------------------------|
| `value` | Metric value for a given operation (non-negative integer, gauge). |
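
The values of the `metric` tag correspond to the section names the plugin looks for in the pf-bb-config log ("Code Blocks", "Data (Bytes)", "Per Engine"). The plugin performs this conversion in `metricNameToTagName`, whose body is not shown here; the sketch below is one plausible way to do it, and `toTagName` is a hypothetical stand-in rather than the actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// toTagName is a hypothetical stand-in for the plugin's metricNameToTagName.
// It drops parentheses, lowercases the name and joins the words with "_".
func toTagName(name string) string {
	cleaned := strings.NewReplacer("(", "", ")", "").Replace(name)
	return strings.Join(strings.Fields(strings.ToLower(cleaned)), "_")
}

func main() {
	for _, name := range []string{"Code Blocks", "Data (Bytes)", "Per Engine"} {
		fmt.Printf("%q -> %q\n", name, toTagName(name))
	}
}
```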

## Example Output

```text
intel_baseband,host=ubuntu,metric=code_blocks,operation=5GUL,vf=0 value=54i 1685695885000000000
intel_baseband,host=ubuntu,metric=code_blocks,operation=5GDL,vf=0 value=0i 1685695885000000000
intel_baseband,host=ubuntu,metric=code_blocks,operation=FFT,vf=0 value=0i 1685695885000000000
intel_baseband,host=ubuntu,metric=code_blocks,operation=5GUL,vf=1 value=0i 1685695885000000000
intel_baseband,host=ubuntu,metric=code_blocks,operation=5GDL,vf=1 value=32i 1685695885000000000
intel_baseband,host=ubuntu,metric=code_blocks,operation=FFT,vf=1 value=0i 1685695885000000000
intel_baseband,host=ubuntu,metric=data_bytes,operation=5GUL,vf=0 value=18560i 1685695885000000000
intel_baseband,host=ubuntu,metric=data_bytes,operation=5GDL,vf=0 value=0i 1685695885000000000
intel_baseband,host=ubuntu,metric=data_bytes,operation=FFT,vf=0 value=0i 1685695885000000000
intel_baseband,host=ubuntu,metric=data_bytes,operation=5GUL,vf=1 value=0i 1685695885000000000
intel_baseband,host=ubuntu,metric=data_bytes,operation=5GDL,vf=1 value=86368i 1685695885000000000
intel_baseband,host=ubuntu,metric=data_bytes,operation=FFT,vf=1 value=0i 1685695885000000000
intel_baseband,engine=0,host=ubuntu,metric=per_engine,operation=5GUL value=72i 1685695885000000000
intel_baseband,engine=1,host=ubuntu,metric=per_engine,operation=5GUL value=72i 1685695885000000000
intel_baseband,engine=2,host=ubuntu,metric=per_engine,operation=5GUL value=72i 1685695885000000000
intel_baseband,engine=3,host=ubuntu,metric=per_engine,operation=5GUL value=72i 1685695885000000000
intel_baseband,engine=4,host=ubuntu,metric=per_engine,operation=5GUL value=72i 1685695885000000000
intel_baseband,engine=0,host=ubuntu,metric=per_engine,operation=5GDL value=132i 1685695885000000000
intel_baseband,engine=1,host=ubuntu,metric=per_engine,operation=5GDL value=130i 1685695885000000000
intel_baseband,engine=0,host=ubuntu,metric=per_engine,operation=FFT value=0i 1685695885000000000
```
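
For readers who want to post-process output like the above, here is a minimal sketch that splits one such line into measurement, tags, value and timestamp. It is deliberately simplified and is not a general line protocol parser: it assumes no escaped characters and exactly one integer field named `value`, which matches the lines shown here. The `sample` type and `parseSample` function are illustrative names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// sample holds the parts of one output line (illustrative type).
type sample struct {
	measurement string
	tags        map[string]string
	value       int64
	timestamp   int64
}

// parseSample handles only the shape shown in the example output:
// "<measurement>,<tag>=<v>,... value=<n>i <timestamp>", with no escaping.
func parseSample(line string) (sample, error) {
	parts := strings.Fields(line)
	if len(parts) != 3 {
		return sample{}, fmt.Errorf("expected 3 sections, got %d", len(parts))
	}
	head := strings.Split(parts[0], ",")
	s := sample{measurement: head[0], tags: make(map[string]string, len(head)-1)}
	for _, kv := range head[1:] {
		k, v, ok := strings.Cut(kv, "=")
		if !ok {
			return sample{}, fmt.Errorf("malformed tag %q", kv)
		}
		s.tags[k] = v
	}
	field := parts[1]
	if !strings.HasPrefix(field, "value=") || !strings.HasSuffix(field, "i") {
		return sample{}, fmt.Errorf("unexpected field section %q", field)
	}
	var err error
	// Strip the "value=" prefix and the trailing "i" integer marker.
	s.value, err = strconv.ParseInt(field[len("value="):len(field)-1], 10, 64)
	if err != nil {
		return sample{}, fmt.Errorf("parsing value: %w", err)
	}
	s.timestamp, err = strconv.ParseInt(parts[2], 10, 64)
	if err != nil {
		return sample{}, fmt.Errorf("parsing timestamp: %w", err)
	}
	return s, nil
}

func main() {
	line := "intel_baseband,engine=0,host=ubuntu,metric=per_engine,operation=5GUL value=72i 1685695885000000000"
	s, err := parseSample(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s.measurement, s.tags["metric"], s.tags["operation"], s.tags["engine"], s.value)
}
```

Note that per-VF lines carry a `vf` tag while per-engine lines carry an `engine` tag, so consumers should not expect both on the same line.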
227 changes: 227 additions & 0 deletions plugins/inputs/intel_baseband/intel_baseband.go
@@ -0,0 +1,227 @@
//go:generate ../../../tools/readme_config_includer/generator
//go:build linux && amd64

package intel_baseband

import (
	_ "embed"
	"errors"
	"fmt"
	"time"

	"github.com/influxdata/telegraf"
	"github.com/influxdata/telegraf/config"
	"github.com/influxdata/telegraf/plugins/inputs"
)

const (
	// Plugin name, used as the measurement name for all metrics
	pluginName = "intel_baseband"

	// VF Metrics
	vfCodeBlocks = "Code Blocks"
	vfDataBlock  = "Data (Bytes)"

	// Engine Metrics
	engineBlock = "Per Engine"

	// File extensions of the socket and the log file
	socketExtension  = ".sock"
	logFileExtension = ".log"

	// UnreachableSocketBehavior Values
	unreachableSocketBehaviorError  = "error"
	unreachableSocketBehaviorIgnore = "ignore"

	defaultAccessSocketTimeout     = config.Duration(time.Second)
	defaultWaitForTelemetryTimeout = config.Duration(time.Second)
)

//go:embed sample.conf
var sampleConfig string

type Baseband struct {
	// required params
	SocketPath  string `toml:"socket_path"`
	FileLogPath string `toml:"log_file_path"`

	// optional params
	UnreachableSocketBehavior string          `toml:"unreachable_socket_behavior"`
	SocketAccessTimeout       config.Duration `toml:"socket_access_timeout"`
	WaitForTelemetryTimeout   config.Duration `toml:"wait_for_telemetry_timeout"`

	Log      telegraf.Logger `toml:"-"`
	logConn  *logConnector
	sockConn *socketConnector
}

func (b *Baseband) SampleConfig() string {
	return sampleConfig
}

// Init performs one-time setup of the plugin
func (b *Baseband) Init() error {
	if b.SocketAccessTimeout < 0 {
		return errors.New("socket_access_timeout should be positive number or equal to 0 (to disable timeouts)")
	}

	waitForTelemetryDuration := time.Duration(b.WaitForTelemetryTimeout)
	if waitForTelemetryDuration < 50*time.Millisecond {
		return errors.New("wait_for_telemetry_timeout should be equal or larger than 50ms")
	}

	// Fill in the default and validate unreachable_socket_behavior
	switch b.UnreachableSocketBehavior {
	case "":
		b.UnreachableSocketBehavior = unreachableSocketBehaviorError
	case unreachableSocketBehaviorError, unreachableSocketBehaviorIgnore:
		// Valid options, do nothing
	default:
		return fmt.Errorf("unknown choice for unreachable_socket_behavior: %q", b.UnreachableSocketBehavior)
	}

	var err error
	// Validate socket path
	if b.SocketPath, err = b.checkFilePath(b.SocketPath, socket); err != nil {
		return fmt.Errorf("socket_path: %w", err)
	}

	// Validate log file path
	if b.FileLogPath, err = b.checkFilePath(b.FileLogPath, log); err != nil {
		return fmt.Errorf("log_file_path: %w", err)
	}

	// Create Log Connector
	b.logConn = newLogConnector(b.FileLogPath, waitForTelemetryDuration)

	// Create Socket Connector
	b.sockConn = newSocketConnector(b.SocketPath, time.Duration(b.SocketAccessTimeout))
	return nil
}

func (b *Baseband) Gather(acc telegraf.Accumulator) error {
	// Ask pf-bb-config to dump telemetry to the log file
	err := b.sockConn.dumpTelemetryToLog()
	if err != nil {
		return err
	}

	// Read the log
	err = b.logConn.readLogFile()
	if err != nil {
		return err
	}

	err = b.logConn.readNumVFs()
	if err != nil {
		return fmt.Errorf("couldn't get the number of VFs: %w", err)
	}
	// A non-positive number of VFs means that the file is being read for the
	// first time (or that file availability was interrupted)
	if b.logConn.getNumVFs() <= 0 {
		return errors.New("error in accessing information about the amount of VF")
	}

	// rawData e.g.: 12 0
	if err = b.gatherVFMetric(acc, vfCodeBlocks); err != nil {
		return fmt.Errorf("couldn't get %q metric: %w", vfCodeBlocks, err)
	}

	// rawData e.g.: 12 0
	if err = b.gatherVFMetric(acc, vfDataBlock); err != nil {
		return fmt.Errorf("couldn't get %q metric: %w", vfDataBlock, err)
	}

	// rawData e.g.: 12 0 0 0 0 0
	if err = b.gatherEngineMetric(acc, engineBlock); err != nil {
		return fmt.Errorf("couldn't get %q metric: %w", engineBlock, err)
	}
	return nil
}

// gatherVFMetric emits one gauge per VF for every operation of the given metric
func (b *Baseband) gatherVFMetric(acc telegraf.Accumulator, metricName string) error {
	metrics, err := b.logConn.getMetrics(metricName)
	if err != nil {
		return fmt.Errorf("error accessing information about the metric %q: %w", metricName, err)
	}

	for _, metric := range metrics {
		// Each operation must report exactly one value per VF
		if len(metric.data) != b.logConn.getNumVFs() {
			return fmt.Errorf("data is inconsistent, number of metrics in the file for %d VFs, the number of VFs read is %d",
				len(metric.data), b.logConn.getNumVFs())
		}

		for i := range metric.data {
			value, err := logMetricDataToValue(metric.data[i])
			if err != nil {
				return err
			}

			fields := map[string]interface{}{
				"value": value,
			}
			tags := map[string]string{
				"operation": metric.operationName,
				"metric":    metricNameToTagName(metricName),
				"vf":        fmt.Sprintf("%v", i),
			}
			acc.AddGauge(pluginName, fields, tags)
		}
	}
	return nil
}

func (b *Baseband) gatherEngineMetric(acc telegraf.Accumulator, metricName string) error {
	metrics, err := b.logConn.getMetrics(metricName)
	if err != nil {
		return fmt.Errorf("error in accessing information about the metric %q: %w", metricName, err)
	}

	for _, metric := range metrics {
		for i := range metric.data {
			value, err := logMetricDataToValue(metric.data[i])
			if err != nil {
				return err
			}

			fields := map[string]interface{}{
				"value": value,
			}
			tags := map[string]string{
				"operation": metric.operationName,
				"metric":    metricNameToTagName(metricName),
				"engine":    fmt.Sprintf("%v", i),
			}
			acc.AddGauge(pluginName, fields, tags)
		}
	}
	return nil
}

// checkFilePath validates the provided path and returns its cleaned version.
// If the file cannot be accessed, the error is returned when
// unreachable_socket_behavior is "error"; otherwise it is only logged as a warning.
func (b *Baseband) checkFilePath(path string, fileType fileType) (resultPath string, err error) {
	if resultPath, err = validatePath(path, fileType); err != nil {
		return "", err
	}

	if err = checkFile(path, fileType); err != nil {
		if b.UnreachableSocketBehavior == unreachableSocketBehaviorError {
			return "", err
		}
		b.Log.Warn(err)
	}
	return resultPath, nil
}

func newBaseband() *Baseband {
	return &Baseband{
		SocketAccessTimeout:     defaultAccessSocketTimeout,
		WaitForTelemetryTimeout: defaultWaitForTelemetryTimeout,
	}
}

func init() {
	inputs.Add("intel_baseband", func() telegraf.Input {
		return newBaseband()
	})
}
31 changes: 31 additions & 0 deletions plugins/inputs/intel_baseband/intel_baseband_notamd64linux.go
@@ -0,0 +1,31 @@
//go:generate ../../../tools/readme_config_includer/generator
//go:build !linux || !amd64

package intel_baseband

import (
	_ "embed"

	"github.com/influxdata/telegraf"
	"github.com/influxdata/telegraf/plugins/inputs"
)

//go:embed sample.conf
var sampleConfig string

type Baseband struct {
	Log telegraf.Logger `toml:"-"`
}

func (b *Baseband) Init() error {
	b.Log.Warn("current platform is not supported")
	return nil
}
func (*Baseband) SampleConfig() string                { return sampleConfig }
func (*Baseband) Gather(_ telegraf.Accumulator) error { return nil }

func init() {
	inputs.Add("intel_baseband", func() telegraf.Input {
		return &Baseband{}
	})
}