-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
http: various performance improvements #10558
Conversation
exports.chunkExpression = /chunk/i; | ||
exports.continueExpression = /100-continue/i; | ||
exports.chunkExpression = /(?:^|\W)chunked(?:$|\W)/i; | ||
exports.continueExpression = /(?:^|\W)100-continue(?:$|\W)/i; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why those changes? are they faster, or safer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more explicit and therefore at least a little safer and it's consistent with the other regexps we use in _http_server.js when searching header values that can be comma-separated lists.
I did not measure the performance of this particular change on its own.
this.corkedRequestsFree = new CorkedRequest(this); | ||
var corkReq = { next: null, entry: null, finish: undefined }; | ||
corkReq.finish = onCorkedFinish.bind(undefined, corkReq, this); | ||
this.corkedRequestsFree = corkReq; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this giving any perf increase? The major benefit is to avoid the creation of CorkedRequest
, which was introduced to leverage hidden classes and the like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This is similar to what was done awhile back for process.nextTick()
request objects, changed from new Foo()
to plain object.
encoding = 'buffer'; | ||
chunk = newChunk; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clever! nice catch!
encoding: encoding, | ||
callback: cb, | ||
next: null | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can probably reduce a bit the code footprint here by using:
{ chunk, encoding, callback: cb, next: null }
Not sure if this functions was inlineable before and/or this change remove that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that, but wasn't sure if that would cause compatibility problems for readable-stream
users or if the babel transforms took care of that or what...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think so, babel can/will take care of that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed.
@@ -113,7 +106,9 @@ function WritableState(options, stream) { | |||
|
|||
// allocate the first CorkedRequest, there is always | |||
// one allocated and free to use, and we maintain at most two | |||
this.corkedRequestsFree = new CorkedRequest(this); | |||
var corkReq = { next: null, entry: null, finish: undefined }; | |||
corkReq.finish = onCorkedFinish.bind(undefined, corkReq, this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you please update the build script in readable-stream
when this lands too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -129,8 +129,12 @@ function ClientRequest(options, cb) { | |||
self._storeHeader(self.method + ' ' + self.path + ' HTTP/1.1\r\n', | |||
options.headers); | |||
} else if (self.getHeader('expect')) { | |||
if (self._header) { | |||
throw new Error('Can\'t render headers after they are sent to the ' + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This additional throw probably makes this PR semver-major?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't, I just had to move the check from the old _renderHeaders()
to the appropriate call sites since that function no longer exists. I tried just moving it to _storeHeader()
but that caused problems IIRC.
@@ -218,8 +222,11 @@ ClientRequest.prototype._finish = function _finish() { | |||
}; | |||
|
|||
ClientRequest.prototype._implicitHeader = function _implicitHeader() { | |||
if (this._header) { | |||
throw new Error('Can\'t render headers after they are sent to the client'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As does this one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It shouldn't, I just had to move the check from the old _renderHeaders()
to the appropriate call sites since that function no longer exists. I tried just moving it to _storeHeader()
but that caused problems IIRC.
var low = false; | ||
while (true) { | ||
switch (field) { | ||
case 'Content-Type': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I surprisingly see Content-type
and Content-length
in the wild a ton. It may be useful to add those as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to stop somewhere, plus you can only have a switch statement so large before the function gets deopt'ed (I think this may still be the case anyway).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it not faster (and more elegant imho) to have a lookup object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From my testing it isn't faster, whether with a plain object or Map
.
field = entry[0]; | ||
value = entry[1]; | ||
|
||
if (value instanceof Array) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it safe to use instanceof here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be, since we're the ones storing header values in this._headers
. That's also the reason why we can use for-in instead of Object.keys()
(slight speedup there) when iterating over this._headers
.
@@ -388,7 +419,10 @@ OutgoingMessage.prototype.getHeader = function getHeader(name) { | |||
|
|||
if (!this._headers) return; | |||
|
|||
return this._headers[name.toLowerCase()]; | |||
var entry = this._headers[name.toLowerCase()]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we not use similar logic as in matchKnownFields
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't recall specifically if I tested it in here or not, but in some places such functions didn't seem to perform as well for whatever reason.
if (obj) { | ||
var keys = Object.keys(obj); | ||
for (var i = 0; i < keys.length; i++) { | ||
var k = keys[i]; | ||
k = keys[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
did this change actually make a perf impact?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It wasn't about performance but easier readability and because I check if the loop was entered below by checking k
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah ok, that makes sense
var fixed = 'C'.repeat(20 * 1024); | ||
var storedBytes = Object.create(null); | ||
var storedBuffer = Object.create(null); | ||
var storedUnicode = Object.create(null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can these be Map
s instead? Also, let
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK Map
is still slow and http benchmarks already take a very long time as it is.
As far as converting to ES6 goes, that's for a separate PR I think. All I was doing in this part here was the changing style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK
Map
is still slow and http benchmarks already take a very long time as it is.
That is why we have https://github.com/nodejs/node/blob/master/benchmark/es/map-bench.js - tl;dr, it's faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Map
performance definitely appears to have improved significantly as of late.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, but I seem to recall some issue with that benchmark. I think there was discussion in some issue/PR awhile ago about it?
@mscdex ... looks like this needs a rebase |
/cc @nodejs/ctc I'm now marking this as semver-major because I just realized I forgot to run citgm and I just found out that there are apparently users (at least on github) that are using So far the only instance on citgm where I ran into this is @ChALkeR Could you find out how many other modules on npm may be affected? I really apologize for all of this.... |
I've now submitted a PR that should help resolve the main use cases for end users accessing Since IIRC we don't have an official policy on undocumented, "private" (underscore-prefixed) properties yet, it would be good to hear what others think should be the way forward on this. Some ideas:
As far as the now removed |
considering that |
PR-URL: nodejs#10558 Reviewed-By: Matteo Collina <matteo.collina@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Evan Lucas <evanlucas@me.com>
By enforcing the statusCode to be an SMI, it helps a bit performance-wise when looking up the associated statusMessage. PR-URL: nodejs#10558 Reviewed-By: Matteo Collina <matteo.collina@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Evan Lucas <evanlucas@me.com>
This commit implements two optimizations when working with headers: * Avoid having to explicitly "render" headers and separately store the original casing for header names. * Match special header names using a single regular expression instead of testing one regular expression per header name. PR-URL: nodejs#10558 Reviewed-By: Matteo Collina <matteo.collina@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Evan Lucas <evanlucas@me.com>
PR-URL: nodejs#10558 Reviewed-By: Matteo Collina <matteo.collina@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Evan Lucas <evanlucas@me.com>
PR-URL: nodejs#10558 Reviewed-By: Matteo Collina <matteo.collina@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Evan Lucas <evanlucas@me.com>
PR-URL: nodejs#10558 Reviewed-By: Matteo Collina <matteo.collina@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Evan Lucas <evanlucas@me.com>
Without this, the http benchmarker would report "strange output" if wrk or any other http client reported 0 requests per second, which is a valid result. PR-URL: nodejs#10558 Reviewed-By: Matteo Collina <matteo.collina@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Evan Lucas <evanlucas@me.com>
This is similar to a change made awhile back for storing process.nextTick() requests. PR-URL: nodejs#10558 Reviewed-By: Matteo Collina <matteo.collina@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Evan Lucas <evanlucas@me.com>
PR-URL: nodejs#10558 Reviewed-By: Matteo Collina <matteo.collina@gmail.com> Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Evan Lucas <evanlucas@me.com>
Any feedback from additional @nodejs/ctc members? Especially on any of the solutions I've proposed? |
That sounds like a nice enough solution. Warn in v8.x and remove in v9.x? |
So to give a little detail here from Express side, the @mscdex made some PRs and we are working to iron out some edge cases, but when Express does release a version that does not reference My thoughts are that if 8.x is going to become the next LTS, it would be really nice if the warning / removal started in a non-LTS version based on the rate it seems people upgrade Express. From the past, even a warning will end up having a lot of fallout from confused users all over the place, so just brainstorming how can we have the best chance for the change to go out unnoticed by the vast majority of the users. |
Alright, I have proposed a getter-based solution in #10941. |
I am still +1 for reverting fwiw. I think we need to investigate better how this impacts the ecosystem especially considering that citgm is broken with this change... |
@evanlucas If at least one other @nodejs/ctc member LGTM's #10941 then it shouldn't be an issue anymore. |
These commits bring 3-11% performance increase (this is on top of the performance improvements made recently in #10445, #10443, and #6533). The performance increases are greater when you start setting/adding more headers than the http benchmarks currently use (just 2 currently).
When using the 'simple' http benchmark, I had to reduce the http client benchmarker duration by half (5 seconds) and reduce the number of runs to 10 (from the default of 30) to get the results back in a reasonable amount of time, especially with the newly added benchmark parameter. This allowed me to finish benchmarking in a little over 4.5 hours, whereas before with the original duration and 30 runs it was still running after 18 hours (when it technically should have finished -- even taking into account overhead since a single run doesn't last exactly the duration specified).
With that being said, here are the results with those modified benchmarking settings:
When adding 3 custom headers in the 'setHeaderWH' case for example, you can see a greater increase:
/cc @nodejs/http
/cc @nodejs/streams for stream changes
CI: https://ci.nodejs.org/job/node-test-pull-request/5651/CI: https://ci.nodejs.org/job/node-test-pull-request/5652/
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passesAffected core subsystem(s)