Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve thread safety of fingerprint processor #18738

Merged
merged 3 commits into from
Jun 3, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.next.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -113,6 +113,7 @@ https://github.com/elastic/beats/compare/v7.0.0-alpha2...master[Check the HEAD d
- Fix `keystore add` hanging under Windows. {issue}18649[18649] {pull}18654[18654]
- Fix an issue where error messages are not accurate in mapstriface. {issue}18662[18662] {pull}18663[18663]
- Fix regression in `add_kubernetes_metadata`, so configured `indexers` and `matchers` are used if defaults are not disabled. {issue}18481[18481] {pull}18818[18818]
- Fix potential race condition in fingerprint processor. {pull}18738[18738]

*Auditbeat*

Expand Down
15 changes: 8 additions & 7 deletions libbeat/processors/fingerprint/fingerprint.go
Original file line number Diff line number Diff line change
Expand Up @@ -19,7 +19,6 @@ package fingerprint

import (
"fmt"
"hash"
"io"
"time"

Expand All @@ -39,7 +38,7 @@ const processorName = "fingerprint"
type fingerprint struct {
config Config
fields []string
hash hash.Hash
hash hashMethod
}

// New constructs a new fingerprint processor.
Expand All @@ -49,21 +48,23 @@ func New(cfg *common.Config) (processors.Processor, error) {
return nil, makeErrConfigUnpack(err)
}

fields := common.MakeStringSet(config.Fields...)
// The fields array must be sorted, to guarantee that we always
// get the same hash for a similar set of configured keys.
// The call `ToSlice` always returns a sorted slice.
fields := common.MakeStringSet(config.Fields...).ToSlice()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, is there a difference in calling .ToSlice() here vs. how it was before, a few lines below? Either way the value of p.fields would be sorted, no? I'm just wondering if you are making this change to try and do all the operations on config.Fields in one place (this line) or if there's some other more fundamental reason that could prevent a bug or something else.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just moved it here in order to 'cleanly' add the code comment. Operationally it makes no difference. It's not obvious that ToSlice already sorts and not immediately obvious that we always need the keys to be sorted, so we should document this (at first I thought we have a bug thanks to StringSet storing keys in randomized order).

Often I prefer to have fully 'initialized' variables that I don't need to modify or execute other operations on when constructing another object. This gives each code block a self-contained purpose (almost like a function).


p := &fingerprint{
config: config,
hash: config.Method(),
fields: fields.ToSlice(),
hash: config.Method,
fields: fields,
}

return p, nil
}

// Run enriches the given event with fingerprint information
func (p *fingerprint) Run(event *beat.Event) (*beat.Event, error) {
hashFn := p.hash
hashFn.Reset()
hashFn := p.hash()

err := p.writeFields(hashFn, event.Fields)
if err != nil {
Expand Down
50 changes: 49 additions & 1 deletion libbeat/processors/fingerprint/fingerprint_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -24,11 +24,56 @@ import (
"time"

"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"

"github.com/elastic/beats/v7/libbeat/beat"
"github.com/elastic/beats/v7/libbeat/common"
)

func TestWithConfig(t *testing.T) {
cases := map[string]struct {
config common.MapStr
input common.MapStr
want string
}{
"hello world": {
config: common.MapStr{
"fields": []string{"message"},
},
input: common.MapStr{
"message": "hello world",
},
want: "50110bbfc1757f21caacc966b33f5ea2235c4176739447e0b3285dec4e1dd2a4",
},
"with string escaping": {
config: common.MapStr{
"fields": []string{"message"},
},
input: common.MapStr{
"message": `test message "hello world"`,
},
want: "14a0364b79acbe4c78dd5e77db2c93ae8c750518b32581927d50b3eef407184e",
},
}

for name, test := range cases {
t.Run(name, func(t *testing.T) {
config := common.MustNewConfigFrom(test.config)
p, err := New(config)
require.NoError(t, err)

testEvent := &beat.Event{
Timestamp: time.Now(),
Fields: test.input.Clone(),
}
newEvent, err := p.Run(testEvent)
v, err := newEvent.GetValue("fingerprint")
assert.NoError(t, err)
assert.Equal(t, test.want, v)
})
}
}

func TestHashMethods(t *testing.T) {
testFields := common.MapStr{
"field1": "foo",
Expand Down Expand Up @@ -393,7 +438,10 @@ func BenchmarkHashMethods(b *testing.B) {
b.Run(method, func(b *testing.B) {
b.ResetTimer()
for _, e := range events {
p.Run(&e)
_, err := p.Run(&e)
if err != nil {
b.Fatal(err)
}
}
})
}
Expand Down