[Bug][Go]: Metrics incremented in Setup methods are not recalled. #27038
Comments
Hi @lostluck, I am interested in this issue and have been exploring Apache Beam metrics before working on the unrecalled metric increments it describes. I observed that no custom metrics are reported in the Flink runner; only internal Beam metrics appear. Any insights on this? For reference, here is the simple program I used:
package main
import (
	"context"
	"flag"
	"fmt"
	"log"
	"reflect"
	"strings"

	"github.com/apache/beam/sdks/v2/go/pkg/beam"
	"github.com/apache/beam/sdks/v2/go/pkg/beam/core/metrics"
	"github.com/apache/beam/sdks/v2/go/pkg/beam/runners/flink"
)

// Define metrics.
var (
	processedCount   = metrics.NewCounter("example", "processed_count")
	invalidCount     = metrics.NewCounter("example", "invalid_count")
	sizeDistribution = metrics.NewDistribution("example", "record_size")
)
// UppercaseFn is a DoFn that processes words and updates metrics.
type UppercaseFn struct{}

// ProcessElement takes the framework-supplied ctx so that metric updates
// are attributed to the executing bundle and transform.
func (fn *UppercaseFn) ProcessElement(ctx context.Context, word string, emit func(string)) {
	processedCount.Inc(ctx, 1)
	sizeDistribution.Update(ctx, int64(len(word)))
	if word == "" {
		invalidCount.Inc(ctx, 1)
		return
	}
	emit(strings.ToUpper(word))
}
// PrintFn is a DoFn that prints each element.
type PrintFn struct{}

func (fn *PrintFn) ProcessElement(word string) {
	fmt.Println(word)
}

func init() {
	beam.RegisterType(reflect.TypeOf((*UppercaseFn)(nil)).Elem())
	beam.RegisterType(reflect.TypeOf((*PrintFn)(nil)).Elem())
}
func main() {
	flag.Parse()
	beam.Init()
	pipeline := beam.NewPipeline()
	scope := pipeline.Root()
	inputData := []string{"apple", "banana", "", "cherry"}
	words := beam.CreateList(scope, inputData)
	uppercaseWords := beam.ParDo(scope, &UppercaseFn{}, words)
	beam.ParDo0(scope, &PrintFn{}, uppercaseWords)
	result, err := flink.Execute(context.Background(), pipeline)
	if err != nil {
		log.Fatalf("Failed to execute pipeline: %v", err)
	}
	// Named results rather than metrics to avoid shadowing the metrics package.
	results := result.Metrics().AllMetrics()
	for _, counter := range results.Counters() {
		fmt.Printf("Counter %s: %d\n", counter.Name(), counter.Committed)
	}
	for _, distribution := range results.Distributions() {
		committed := distribution.Committed
		// Guard against division by zero when no values were recorded.
		var mean int64
		if committed.Count > 0 {
			mean = committed.Sum / committed.Count
		}
		fmt.Printf("Distribution %s: min=%d, max=%d, sum=%d, mean=%d, count=%d\n",
			distribution.Name(), committed.Min, committed.Max, committed.Sum, mean, committed.Count)
	}
}
Looking forward to your thoughts! 🙏
Sorry for missing this earlier, @mohamedawnallah. If Flink doesn't support exporting user metrics, then it's not something the Go SDK can fix by itself. IIRC this issue is strictly for when a DoFn has metrics and increments them in the once-per-DoFn-instance Setup() lifecycle method. The provided code would not verify this issue by itself, since it has no Setup methods on its DoFns. The default Go SDK runner, Prism, should be able to validate this issue though, if the base code reveals it. The changes would be in the
What happened?
Because the ParDo Setup context is uncached, metrics initialized in Setup are lost, which is unexpected. Work done in Setup, while logically outside of a bundle, will be under the context of the first bundle to execute that transform.
So there needs to be a way to transfer/extract the metrics from the Setup context so they are recorded back to the runner.
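To make the failure mode concrete, here is a minimal, standard-library-only sketch of the idea. It is a toy model, not the Beam Go SDK's internals: `Store`, `WithStore`, `Inc`, `Merge`, `runLossy`, and `runTransferred` are all hypothetical names, and "uncached Setup context" is simplified to "a context with no metric store attached". Under those assumptions it shows an increment made during Setup being dropped, and one possible shape of a fix in which Setup records into its own store that is then merged into the first bundle's store.

```go
package main

import (
	"context"
	"fmt"
)

// storeKey is the context key under which a metric store is attached.
type storeKey struct{}

// Store is a toy metric store (hypothetical, not a Beam SDK type).
type Store struct {
	counters map[string]int64
}

// NewStore returns an empty store.
func NewStore() *Store { return &Store{counters: map[string]int64{}} }

// WithStore returns a context that carries s.
func WithStore(ctx context.Context, s *Store) context.Context {
	return context.WithValue(ctx, storeKey{}, s)
}

// Inc adds v to the named counter in the store attached to ctx.
// If no store is attached, the increment is silently dropped, which is
// how this sketch models metrics lost in Setup.
func Inc(ctx context.Context, name string, v int64) {
	s, ok := ctx.Value(storeKey{}).(*Store)
	if !ok {
		return
	}
	s.counters[name] += v
}

// Merge folds the counters of src into s.
func (s *Store) Merge(src *Store) {
	for name, v := range src.counters {
		s.counters[name] += v
	}
}

// setup and process stand in for a DoFn's Setup and ProcessElement.
func setup(ctx context.Context)   { Inc(ctx, "setup_count", 1) }
func process(ctx context.Context) { Inc(ctx, "processed_count", 1) }

// runLossy calls setup with a context that has no store, then runs one bundle.
func runLossy() map[string]int64 {
	setup(context.Background())
	bundle := NewStore()
	process(WithStore(context.Background(), bundle))
	return bundle.counters
}

// runTransferred gives setup its own store and merges it into the first bundle.
func runTransferred() map[string]int64 {
	setupStore := NewStore()
	setup(WithStore(context.Background(), setupStore))
	bundle := NewStore()
	bundle.Merge(setupStore) // transfer Setup metrics to the first bundle
	process(WithStore(context.Background(), bundle))
	return bundle.counters
}

func main() {
	fmt.Println("lossy:      ", runLossy())
	fmt.Println("transferred:", runTransferred())
}
```

In the lossy run only `processed_count` reaches the bundle's store; in the transferred run both counters do. Whether the real fix should merge a separate Setup store or attach the first bundle's store before calling Setup is a design question this sketch does not settle.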
Issue Priority
Priority: 3 (minor)
Issue Components