Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Map Leaking entries... #109

Closed
klauspost opened this issue Oct 27, 2023 · 4 comments · Fixed by #111
Closed

Map Leaking entries... #109

klauspost opened this issue Oct 27, 2023 · 4 comments · Fixed by #111
Labels
question Further information is requested

Comments

@klauspost
Copy link

Sorry about the non-specific, but I just spent 2 days chasing down a bug, which turned out to be a problem where entries are "leaked" from the maps.

I have tried making a standalone reproducer, but I have been unable to trace it down. Finally I just replaced it with a "dumb map[uint64]*value" with a mutex for operations. This doesn't leak entries.

It only leaks very rarely, and I've only seen it under high loads.

Init: xsync.NewIntegerMapOfPresized[uint64, *muxClient](1000) - value is never copied or anything similar.

Functions used inside the test:

Load(id uint64) (*muxClient, bool)
LoadAndDelete(id uint64) (*muxClient, bool)
LoadOrStore(id uint64, v *muxClient) (*muxClient, bool)
Delete(id uint64)

Naturally all operations are used on the map concurrently.

Functions not use while reproducing:

Range(fn func(key uint64, value *muxClient) bool)
Clear()
Size() int 

Observations:

  • All values are stored via LoadOrStore.
  • Loading the value with Load after the LoadOrStore completed shows the value exists.
  • Some time elapses while RPC processes.
  • When the response returns the entry is gone.
  • All delete operations are logged. None match the entry.

Sorry for the vagueness. I tried setting up a reproducer, but I was not able to do a standalone reproducer.

Using github.com/puzpuzpuz/xsync/v2 v2.5.0 - go version go1.21.0 windows/amd64.

@puzpuzpuz puzpuzpuz added the question Further information is requested label Oct 28, 2023
@puzpuzpuz
Copy link
Owner

Hi,

Am I right in understanding that a "leak" means spuriously disappearing values, e.g. as if the value was garbage-collected? Are you able to tell which uint64 ids were problematic? I'd like to check one thing.

@klauspost
Copy link
Author

klauspost commented Oct 28, 2023

Am I right in understanding that a "leak" means spuriously disappearing values, e.g. as if the value was garbage-collected?

Correct. The key+value just disappears from the map.

Are you able to tell which uint64 ids were problematic? I'd like to check one thing.

It is not consistent - but the benchmark has some random elements. The IDs are starting from 1 and increasing.

I have the logs from a run where ID 12844 disappears:
impossible.host.txt.gz

These are the entries for the entry that goes missing:

12844 Connection.newMuxClient: RELOADED MUX. loaded: false found: true
12844 SEND (ignore)
12844 INCOMING (serverside - ignore)
12844 RESPONDING (serverside - ignore)
12844 Got response for unknown mux (response back, client can no longer find 12844)
12844 Connection.Request: DELETING MUX. Exists: false
12844 ERR: context deadline exceeded
    benchmark_test.go:99: 12844 ERR: context deadline exceeded

The entry is stored like this - I added an extra load to see if it was actually stored:

		// Handle the extremely unlikely scenario that we wrapped.
		if _, loaded := c.outgoing.LoadOrStore(client.MuxID, client); client.MuxID != 0 && !loaded {
			if debugReqs {
				_, found := c.outgoing.Load(client.MuxID)
				fmt.Println(client.MuxID, c.String(), "Connection.newMuxClient: RELOADED MUX. loaded:", loaded, "found:", found)
			}
			return client, nil
		}
		(retry with a different ID omitted)

The entry is loaded like this:

	v, ok := c.outgoing.Load(m.MuxID)
	if !ok {
		if debugReqs {
			fmt.Println(m.MuxID, c.String(), "Got response for unknown mux")
		}
		return
	}

The delete also adds a Load to see if what we are deleting exists:

		if debugReqs {
			_, ok := c.outgoing.Load(client.MuxID)
			fmt.Println(client.MuxID, c.String(), "Connection.Request: DELETING MUX. Exists:", ok)
		}
		c.outgoing.Delete(client.MuxID)

The entries are created pretty much in order, but deletes are somewhat less ordered.

@puzpuzpuz
Copy link
Owner

Good news. Looks like I'm able to reproduce the issue. Looking into the bug now: the bugfix may take some time, so please be patient. Once more, thanks a lot for reporting the issue!

@klauspost
Copy link
Author

Thanks. I am not looking for a quick fix, since the workaround works fine and it isn't really critical to the current code path, as the CPU usage of other parts are much higher.

Just thought it might be helpful to report. And TBH if someone had come with a similar claim with one of my packages, I'd 90% have dismissed it as "problem is somewhere in your code".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants