This issue was moved to a discussion.

Performance comparison (plus Brotli vs Gzip compression) #37

Closed
proddy opened this issue Dec 24, 2023 · 18 comments
@proddy

proddy commented Dec 24, 2023

More of a discussion than a project issue.

One thing I'm planning to play with is using Brotli instead of Gzip on HTTPS. It's apparently faster, and it would be great to use this library's performance tests to benchmark it. I'm hoping it's as simple as adding response.addHeader("Content-Encoding", "br") and compressing the files using zlib.brotliCompressSync() instead of zlib.gzipSync().
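A server that stores both Brotli and Gzip copies of a file would also have to decide which Content-Encoding to send, since browsers generally advertise `br` in Accept-Encoding only on secure connections. A minimal standalone sketch of that decision follows. The names `acceptsEncoding` and `pickContentEncoding` are hypothetical and not part of PsychicHttp, and the parsing is simplified (case-sensitive, q-values ignored).

```cpp
#include <string>

// Hypothetical helper: returns true if `token` is listed as a coding in an
// Accept-Encoding header value. Simplified on purpose: parameters such as
// ";q=0.8" are stripped and ignored, so "br;q=0" would still count as accepted.
bool acceptsEncoding(const std::string &acceptEncoding, const std::string &token) {
    std::size_t pos = 0;
    while (pos <= acceptEncoding.size()) {
        std::size_t comma = acceptEncoding.find(',', pos);
        if (comma == std::string::npos) comma = acceptEncoding.size();

        // Take one comma-separated item and drop any ";param" suffix.
        std::string item = acceptEncoding.substr(pos, comma - pos);
        item = item.substr(0, item.find(';'));

        // Trim surrounding whitespace before comparing.
        const std::size_t first = item.find_first_not_of(" \t");
        const std::size_t last = item.find_last_not_of(" \t");
        if (first != std::string::npos &&
            item.substr(first, last - first + 1) == token) {
            return true;
        }
        pos = comma + 1;
    }
    return false;
}

// Hypothetical helper: prefer Brotli, fall back to Gzip, else send uncompressed.
const char *pickContentEncoding(const std::string &acceptEncoding) {
    if (acceptsEncoding(acceptEncoding, "br")) return "br";
    if (acceptsEncoding(acceptEncoding, "gzip")) return "gzip";
    return "identity";
}
```

The returned string would then be the value passed to response.addHeader("Content-Encoding", ...), with the matching pre-compressed file served as the body.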

I'll report back with some test results.

@zekageri
Contributor

Well, my files are already gzipped with webpack, but it is interesting for sure!

@proddy proddy changed the title Brotli vs Gzip Brotli vs Gzip compression Dec 24, 2023
@proddy
Author

proddy commented Dec 24, 2023

I have a huge amount of web pages (220KB), all stored in flash memory, with 60+ endpoints. That's mainly because my app is translated into 8 languages. Brotli's compression is better, so anything that saves on precious ESP32 memory is wonderful.

@zekageri
Contributor

But do you compress it on the fly?

I also have a ridiculous number of endpoints, and my pages are also huge. I don't store separate pages per language; I chose a different approach. I have language JSON files for the different pages stored in folders/page. This lets me store a single page and multiple language files, which I believe is a lot more resource friendly. I can also edit any language file without modifying the HTML.

@zekageri
Contributor

Btw my app is translated to 4 languages.

@proddy
Author

proddy commented Dec 24, 2023

I use dynamic translations on the web side to 10 languages using typesafe-i18n. The backend code is also translated using a custom library, all in real-time. The web code is all reactJS/typescript with yarn/vite bundling. You can see a live demo at https://ems-esp.derbyshire.nl, albeit an older version.

I've been using AsyncTCP and AsyncWebServer for years and always wanted to move away from Arduino and go to native IDF. Using PsychicHttp is the first step and I'm super excited. The MQTT part has already been migrated to espMqttClient.

@zekageri
Contributor

Wow. So I assume it is an SPA and that's why it is so fast. Does this demo run on an ESP?

@proddy
Author

proddy commented Dec 25, 2023

Not on an ESP! I'm cheating and the demo is hosted on Cloudflare, using CF Pages for the web and CF Workers for the API backend, which mimics the data from an ESP32. That's why it's so quick. Cloudflare is excellent, and free too.

If you're interested, the code is in https://github.com/emsesp/EMS-ESP32. The PsychicHttp port is in the https_36 branch, which I'll upload in the next few days. I've almost finished the port.

@proddy
Author

proddy commented Dec 29, 2023

@hoeken I'd like to run some benchmarking too. Do you have an automated way I can steal that converts all the log data from the autocannon scripts into a single /benchmark/comparison.ods file?

@hoeken
Owner

hoeken commented Dec 29, 2023 via email

@proddy proddy changed the title Brotli vs Gzip compression Performance comparison (plus Brotli vs Gzip compression) Dec 30, 2023
@proddy
Author

proddy commented Dec 30, 2023

I'm doing some A/B testing with two identical ESP32s, one running the same code with AsyncWebServer and the other with PsychicHttp. Then I used autocannon-ui (which has autocannon and a compare function integrated) to benchmark the two.

With autocannon I used a single connection, 1 worker/pipeline and 10 second duration (which are the defaults) to a single URI endpoint /api/system/command which returns a 260 byte JSON object.

image

Results:

#1 = AsyncWebServer
#2 = PsychicHttp

image

Initial observations:

  • Psychic seems to be 2-3x slower when returning JSON from a GET. The code is very simple and follows the examples in this repo using PsychicJsonResponse. I also notice this when typing the URI into a browser and analyzing the network response and timings via the browser's dev tools. Here the "waiting for server response" time is much higher with PsychicHttp.
  • During the load test, PsychicHttp makes 62 calls and returns Status Code 200 62 times. AsyncWebServer makes one call and returns a single 200. I think this is because AsyncWebServer is closing the connection on every response:

image

There is no close with PsychicHttp.

I'll keep digging, but any ideas/thoughts welcome.....

@hoeken
Owner

hoeken commented Dec 30, 2023

@proddy great to see some more benchmarking. I have done pretty much zero work on optimization, so hopefully we have plenty of room to speed things up. That being said, I don't really know the toolchains, etc for doing code profiling so I would love some help. I would gladly accept PRs as well :)

One thing I noticed is that the Async test basically just makes 1 request and quits. The http headers + 260b response matches up with a total of 387b transferred. That is probably skewing things a bit. Is there a setting in autocannon to have it reconnect?

As for potential optimizations, it might be the URL matching? You could set server.config.uri_match_fn = NULL; which would switch back to the basic strcmp() URL matching. It's set to wildcard matching right now.

There's also probably room for optimization around the endpoint / client / handler "arrays". Right now it's using std::list as it made things very easy to implement, but it seems it's not the fastest option out there.
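To make the difference between the two matching modes concrete, here is a simplified standalone sketch. It is not the ESP-IDF implementation: `matchExact` and `matchTrailingWildcard` are hypothetical names, and the wildcard version only handles a trailing `*`.

```cpp
#include <cstddef>
#include <cstring>

// Exact matching: a single strcmp() against the registered endpoint.
bool matchExact(const char *pattern, const char *uri) {
    return std::strcmp(pattern, uri) == 0;
}

// Simplified wildcard matching (assumption: only a trailing '*' is supported).
// "/api/*" matches any URI that starts with "/api/"; patterns without a
// trailing '*' fall back to an exact comparison.
bool matchTrailingWildcard(const char *pattern, const char *uri) {
    const std::size_t len = std::strlen(pattern);
    if (len > 0 && pattern[len - 1] == '*') {
        return std::strncmp(pattern, uri, len - 1) == 0;
    }
    return std::strcmp(pattern, uri) == 0;
}
```

In this sketch both modes cost one short string comparison per registered endpoint, so any measurable difference would have to come from the number of endpoints checked per request.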

@proddy
Author

proddy commented Dec 30, 2023

Thanks @hoeken for the response (no pun intended!). I know these are busy days.

I played with uri_match_fn and with forcefully closing each connection as AsyncWS does, but it makes no difference. The response is still twice as slow, also for static HTML. The std::list code for the handlers is fine and fast enough. I've done a lot of work with queues and home-built linked lists and always go back to std::list as it's just very solid and quick.

To rule out any quirks in my code I'll use your examples and run some benchmarking to see if the results are similar.

Also curious if others are seeing anything similar?

@hoeken
Owner

hoeken commented Dec 30, 2023

Puns are definitely allowed, especially since the project name is a play on ESP. :)

If you didn't notice, there is code in the /benchmark directory for both psychichttp and espasync (and arduinomongoose). If you want to dig into the benchmark stuff or test your own code feel free to add to that, just try to keep code identical between the different sketches.

@hoeken
Owner

hoeken commented Dec 30, 2023

I have this code in the wifi setup on the benchmarks, maybe see if that has any effect?

  WiFi.setSleep(false);
  WiFi.useStaticBuffers(true);

@proddy
Author

proddy commented Dec 30, 2023

LOL, I had to google that. https://en.wikipedia.org/wiki/Psychic

I have seen your benchmark code for the various libs and will extend that and run some more stress tests. I really want PsychicHttp to knock AsyncWebServer and others out of the park.

@zekageri
Contributor

I'm more of an observer right now because of the holidays, but my web server definitely feels slower, and sometimes I don't get all the files loaded on the front-end side. I played with the config object without success.

@Chris--A
Contributor

Chris--A commented Jan 2, 2024

With regard to PsychicJsonResponse, the latency might be attributed to the precalculation of the output size:

size_t length = getLength();

This is to determine if it can be sent in a single go, or chunked. It might be worth removing the length check and testing with chunked send only. If the output ends up being smaller than the internal buffer, there is only one network send done anyway.
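The single-send behaviour for small payloads can be illustrated with a standalone sketch of a fixed-size write buffer. `ChunkBuffer` and its `sendChunk` callback are hypothetical stand-ins and not PsychicHttp classes. The sketch only counts data sends and ignores HTTP chunk framing.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <utility>

// Hypothetical fixed-size write buffer. `sendChunk` stands in for whatever
// call actually puts bytes on the network.
template <std::size_t N>
class ChunkBuffer {
public:
    using SendFn = std::function<void(const char *data, std::size_t len)>;

    explicit ChunkBuffer(SendFn sendChunk) : send_(std::move(sendChunk)) {}

    // Append bytes, sending a chunk each time the buffer becomes full.
    void write(const char *data, std::size_t len) {
        while (len > 0) {
            const std::size_t room = N - used_;
            const std::size_t n = len < room ? len : room;
            std::memcpy(buf_ + used_, data, n);
            used_ += n;
            data += n;
            len -= n;
            if (used_ == N) flush();
        }
    }

    // Send whatever is still buffered; call once at the end of the response.
    void flush() {
        if (used_ > 0) {
            send_(buf_, used_);
            used_ = 0;
        }
    }

private:
    SendFn send_;
    char buf_[N];
    std::size_t used_ = 0;
};
```

With a 512-byte buffer, a 260-byte JSON body is written without ever filling the buffer, so the only send happens in the final flush() and no separate pass to measure the length is needed.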

I'm testing other changes that may improve the PsychicStreamResponse I submitted a PR for (#45). The changes could potentially speed up PsychicJsonResponse also.

Additionally, the stream response will allow you to do a chunked-only send, with no length check, and a static JSON buffer:

server.on("/api/*", [](PsychicRequest *request) {

  // Fixed-capacity JSON document (ArduinoJson 6), so no heap allocation
  StaticJsonDocument<512> doc;
  doc["success"] = true;

  // Only URLs ending in "system" return heap statistics
  if(request->url().endsWith("system")){
    doc["FreeHeap"] = ESP.getFreeHeap();
    doc["MinFreeHeap"] = ESP.getMinFreeHeap();
    doc["MaxAllocHeap"] = ESP.getMaxAllocHeap();
    doc["HeapSize"] = ESP.getHeapSize();
  }else{
    doc["success"] = false;
  }

  // Serialize straight into the response stream: chunked send, no length check
  PsychicStreamResponse response(request, "application/json");
  response.beginSend();
  serializeJson(doc, response);
  return response.endSend();
});

@proddy
Author

proddy commented Jan 2, 2024

Thanks @Chris--A - I'll look into optimizing Json using your PR.

I used the benchmark code to performance test PsychicHttp against my heavily tweaked versions of AsyncWebServer and AsyncTCP, and both the test JsonResponse call to /api?foo=bar and alien.png are 50-70% slower with PsychicHttp. Maybe I'm not comparing apples with apples here. I need to dig deeper into the IDF code.

--edit--

Forcing chunking didn't make a difference. I'm starting to think the performance hit is not in PsychicHttp but in the async TCP lwIP stuff.

Repository owner locked and limited conversation to collaborators Aug 10, 2024
@hoeken hoeken converted this issue into discussion #142 Aug 10, 2024
