
Mem efficiency, make full use of client struct memory for reply buffers #8968

Merged
merged 6 commits into unstable from client-buf Jun 8, 2021

Conversation


@ShooterIT ShooterIT commented May 20, 2021

When we allocate a client struct with a 16k reply buffer, the allocator may actually give us 20k.
This commit makes use of that extra space.
Additionally, it tries to store whatever it can from the reply into the static 'buf' before
allocating a new node for the reply list.

Suggested by @oranagra in #8966 (review)

Similar changes (can all be mentioned in the release notes together): #8975, #8966
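In rough terms, the first part of the change amounts to the sketch below. This is hedged: zmalloc_usable_size() is the real zmalloc helper that reports how many bytes the allocator actually returned, but the field name buf_usable_size and its exact placement follow the discussion in this thread rather than a verified copy of the merged code.

```c
/* Hedged sketch, assuming Redis internals (client struct, zmalloc,
 * PROTO_REPLY_CHUNK_BYTES). buf_usable_size is the new per-client field
 * the discussion below refers to. */
client *createClient(connection *conn) {
    client *c = zmalloc(sizeof(client)); /* allocator may round 16k of 'buf' up to 20k */

    /* Treat everything from 'buf' to the end of the real allocation as
     * reply-buffer space, instead of the fixed PROTO_REPLY_CHUNK_BYTES. */
    c->buf_usable_size = zmalloc_usable_size(c) - offsetof(client, buf);

    /* ... the rest of the usual field initialization ... */
    return c;
}
```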

@oranagra

so there are 3 changes here:

  1. make use of the internal fragmentation space of the client struct.
    i'm not certain the approach you took is the cleanest one. i imagined using an empty array [] in the struct and a specially crafted call to malloc (like we do for clientReplyBlock). i think your approach may actually be cleaner, since other places that allocate a client struct without adding anything to the malloc will still have something. but on the other hand, they don't currently init the new size variable you added, which may cause issues (i guess it's dead code currently).
    maybe it's time to finally unify all the places that allocate a client struct and init all its fields into a common function.
  2. because of that change, there's a chance not all client struct buffers will be the same size, and copyClientOutputBuffer becomes more complicated; however, the new code there is likely to always be dead code, so it's untested.
    maybe we can change it in a way that more of it will be tested? like write what we can into the initial static buf, and / or somehow re-use existing code in order to add a new reply list node (see the sketch after this list)?
  3. this is actually unrelated to the change in this PR (right?): when attempting to write to the static client buffer, we write what we can if there's room for something, instead of just giving up and putting the whole thing in the reply list.
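In code terms, the "re-use existing code" idea from point 2 could look roughly like this. It is a sketch only, not the merged diff: addReplyProto, listEmpty, listRewind and clientReplyBlock are real Redis internals, but replaying the list through the append path this way is an illustration of the idea.

```c
/* Instead of a raw memcpy between static buffers that may now differ in
 * size, replay src's pending output through dst's normal append path,
 * which already spills into the reply list when dst->buf runs out of
 * room, so the copy exercises tested code. */
void copyClientOutputBuffer(client *dst, client *src) {
    listEmpty(dst->reply);   /* drop whatever dst had pending */
    dst->bufpos = 0;
    dst->sentlen = 0;
    dst->reply_bytes = 0;

    /* src's stream order is: static buf first, then the reply list. */
    addReplyProto(dst, src->buf, src->bufpos);

    listIter li;
    listNode *ln;
    listRewind(src->reply, &li);
    while ((ln = listNext(&li)) != NULL) {
        clientReplyBlock *blk = listNodeValue(ln);
        addReplyProto(dst, blk->buf, blk->used);
    }
}
```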


@yoav-steinberg yoav-steinberg left a comment

LGTM, two minor comments.

yoav-steinberg previously approved these changes May 20, 2021

@yoav-steinberg yoav-steinberg left a comment

Added some more minor thoughts.

@ShooterIT ShooterIT commented May 21, 2021

Hi @oranagra

  1. make use of the internal fragmentation space of the client struct.

I am not sure I understand you. Firstly, AFAIK, the static buf is necessary because we don't want to frequently allocate/free memory to reply to clients.

Regarding your point 2, I refactored it to make it easier to test.

3. this is actually unrelated to the change in this PR (right?)

I think it also belongs to this PR, because previously we did not use buf when it had room but couldn't store one entire bulk. In the Redis protocol we have many *x\r\n and $x\r\n headers, so always putting the reply into buf when possible gives us a chance to save one clientReplyBlock. For example, suppose there is 9k of space left in buf and we need to reply with multiple bulks, one of which is a 20k bulk reply (one jemalloc size class :)). Before this commit we might create two clientReplyBlocks: first we put *x\r\n into buf, then we create a big clientReplyBlock to store the 20k bulk, and then we still need to create a 16k block to store \r\n or the other small bulks. But now we put 9k of the 20k bulk reply into buf, and then create only one clientReplyBlock to store the rest.
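The behavior described here boils down to something like the following hedged sketch. The real function is _addReplyToBuffer() in networking.c; the bytes-consumed return convention reflects a reading of the change discussed in this thread rather than a verified copy of the merged code, and buf_usable_size is the new field from the PR.

```c
/* Copy as much of the reply as fits into the static buffer and report
 * how much was consumed; the caller sends the rest to the reply list. */
size_t _addReplyToBuffer(client *c, const char *s, size_t len) {
    size_t available = c->buf_usable_size - c->bufpos;

    /* Replies are dropped for clients marked to be closed anyway. */
    if (c->flags & CLIENT_CLOSE_AFTER_REPLY) return len;

    /* Once something is in the reply list, keep appending there so the
     * output stream stays in order. */
    if (listLength(c->reply) > 0) return 0;

    /* New behavior: copy the prefix that fits instead of giving up when
     * the whole reply doesn't fit. */
    size_t reply_len = len > available ? available : len;
    memcpy(c->buf + c->bufpos, s, reply_len);
    c->bufpos += reply_len;
    return reply_len; /* caller handles the remaining len - reply_len bytes */
}
```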

@yoav-steinberg

I think it also belongs to this PR, because previously we did not use buf when it had room but couldn't store one entire bulk. [...]

@ShooterIT now I see what you mean. buf is almost always used for the protocol headers ($, *, etc.), so two system calls are the norm... I guess, in this case, the optimization is worth it.

@oranagra

regarding the 1,2,3 in my previous comment:

  1. i didn't mean we want to eliminate the static buffer... it is very much needed in order to avoid mallocs and frees for small replies, and also to remain more cache friendly. i'll discuss that topic below.
  2. change copyClientOutputBuffer so that it uses existing code and doesn't have a big pile of dead code. i see you found an elegant solution.
  3. i thought this may be a topic for a different PR, but it is somewhat related, so we can keep it here.

what i meant was that instead of declaring the last member of the client struct as char buf[PROTO_REPLY_CHUNK_BYTES];, we could declare it as char buf[];, and then when we malloc the client struct, we do zmalloc(sizeof(client)+PROTO_REPLY_CHUNK_BYTES).
the result is similar: accessing c->buf[offset] computes an offset from the client struct (there's no pointer dereference).

the implication of the change i suggested is that you can't allocate client structs on the stack, and that anywhere that allocates one has to be changed.
but in some way the approach you took also needs a change in every place that allocates a client struct, since you only initialized your new field in one allocation.
bottom line, i think we can keep your approach, but we do need to init the new field in the other places, or use this opportunity to refactor them all to use the same createClient.
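Concretely, the char buf[] variant described above would look something like this. It is a sketch of the alternative that was not merged; createClientAlt is a hypothetical name, and buf_usable_size mirrors the field this PR adds.

```c
/* Sketch of the flexible-array-member alternative (not what was merged):
 * size 'buf' at allocation time, the same trick clientReplyBlock uses. */
typedef struct client {
    /* ... every other field stays as-is ... */
    size_t buf_usable_size; /* bytes actually available after 'buf' */
    char buf[];             /* flexible array member, sized by zmalloc below */
} client;

client *createClientAlt(void) { /* hypothetical allocation site */
    client *c = zmalloc(sizeof(client) + PROTO_REPLY_CHUNK_BYTES);
    c->buf_usable_size = zmalloc_usable_size(c) - sizeof(client);
    return c;
}
/* Trade-off, as noted above: such a client can no longer live on the
 * stack, and every allocation site must use this pattern. */
```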

@oranagra oranagra added the state:to-be-merged and release-notes labels May 27, 2021
@yoav-steinberg yoav-steinberg left a comment

LGTM.

@oranagra oranagra changed the title Make full use of 'buf' of client struct Mem efficiency, make full use of client struct memory for reply buffers Jun 8, 2021
@oranagra oranagra merged commit c396fd9 into redis:unstable Jun 8, 2021
@ShooterIT ShooterIT deleted the client-buf branch June 8, 2021 13:24
JackieXie168 pushed a commit to JackieXie168/redis that referenced this pull request Sep 8, 2021
Mem efficiency, make full use of client struct memory for reply buffers (redis#8968)
