Improve performance of Discover with large fields #11457
Some additional details: I am using the latest Chrome on a newish MacBook Pro. Is there a non-minified version of Kibana I can download to test? Right now the Chrome profiler attributes everything to commons.bundle.js, so it isn't very helpful. :) Chances are my specific instance has a few of these larger docs showing up in Discover, causing the extra slowness per full page load. If you want specifics, my examples of big messages tend to be giant SQL queries with megabytes' worth of CSV ids: `DELETE FROM foo WHERE id IN ( 12345,54331,968574,.... )` or similar.
@msporleder-work the best way to get the un-minified source would be to clone the repo from GitHub and start up Kibana in dev mode. I'll try indexing lots of large docs tomorrow and see how that affects things on my machine.
Adding more docs (unsurprisingly) slows things down, probably linearly. With 50 docs (100 MB total), Discover took at least 5 minutes to load; I stopped watching at one point. I'm surprised it didn't crash. I don't think there will be any quick fix for this amount of data. @weltenwort Something to think about as you ponder the doc table refactor. @msporleder-work can you tell us a bit more about your use case? Do you need to see those giant fields in their entirety, or do you just need to search on them? I'm trying to think of other ways you could accomplish your goals.
I can probably accomplish my goals and keep stability by figuring out a way for Logstash to truncate the fields to < 256 KB. If anyone is interested, my use case for these giant entries is streaming in MySQL's slow.log, one entry per SQL statement. This lets me quickly count slow queries per server/cluster and point analysts/devs to a Kibana query like `host:"^warehouse" AND source:"slow.log"` (or whatever) to get a nice list of all the queries we need to fix. For whatever reason our queries tend to get big.
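For reference, Logstash ships a `truncate` filter plugin that can cap field sizes declaratively. A minimal sketch of the same idea as a hypothetical pre-indexing step in TypeScript (the 256 KB limit and the byte-based truncation are assumptions, not the commenter's actual pipeline):

```typescript
// Hypothetical pre-indexing step that caps a string field at 256 KB,
// similar in spirit to Logstash's `truncate` filter plugin.
const MAX_FIELD_BYTES = 256 * 1024;

function truncateField(doc: Record<string, unknown>, field: string): void {
  const value = doc[field];
  if (typeof value !== 'string') return;
  const bytes = new TextEncoder().encode(value);
  if (bytes.length <= MAX_FIELD_BYTES) return;
  // Slicing bytes may cut a multi-byte character in half; TextDecoder
  // replaces the partial trailing character with U+FFFD rather than throwing.
  doc[field] = new TextDecoder().decode(bytes.slice(0, MAX_FIELD_BYTES));
}
```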
@Bargs thanks for including such comprehensive instructions to reproduce the effect. I will try to diagnose whether the bottleneck is the loading/processing or the rendering; I suspect all of the above 😉 Based on my intuition, there are several improvements we should consider.
I would be very motivated to tackle those as soon as I have completed the next stage of the context view (I've already started on the React/Redux aspect on the side).
As an aside, we need to make sure to communicate with @kobelb about this because he may be using the available field list in CSV export.
@Bargs thanks for looping me in on this! Obviously, I don't want the needs of any sharing integration to impose limitations on how you implement Discover. I was planning on utilizing the list of available fields.
Actually, I was not quite correct: the field caps API does not provide the set of available columns. But the point still stands: we should get the information in a more scalable way than iterating client-side. @kobelb I agree, it sounds like that would be a subset of what Discover needs anyway.
@weltenwort yeah... it's not exactly what we want, but it was the closest thing @Bargs and I were able to find. Depending on the order in which results are returned, it's possible that certain columns could be missed.
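For context, here is roughly how the field capabilities API can be queried. It reports per-field mapping capabilities (type, searchable, aggregatable), not which fields are actually populated in matching documents, which is why it falls short here. A sketch assuming a local cluster and a `logstash-*` index pattern (both illustrative), in a runtime with global `fetch`:

```typescript
// Query the field capabilities API (_field_caps). It describes mapped
// fields, not per-document presence, so it cannot tell us which columns
// actually appear in the hits a search returns.
async function listMappedFields(
  host = 'http://localhost:9200', // assumption: local dev cluster
  index = 'logstash-*'            // assumption: illustrative index pattern
): Promise<string[]> {
  const res = await fetch(`${host}/${index}/_field_caps?fields=*`);
  if (!res.ok) throw new Error(`field_caps failed: ${res.status}`);
  const body = await res.json();
  return Object.keys(body.fields); // mapped field names only
}
```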
Just noticed this behavior: Kibana loads all data into the client for the first 500 entries regardless of whether or not the columns are currently shown in the table. If I have a field with a 1 MB value, this causes a 500 MB response even though it's not shown in the table. I would love Kibana to only load the fields that are displayed and lazy-load the full documents on expansion.
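A partial version of that behavior is possible today with Elasticsearch source filtering, which restricts which fields come back in `_source`. A sketch of the idea, assuming a local cluster; the index pattern, sort field, and column names are illustrative:

```typescript
// Fetch the first 500 hits but only the currently displayed columns,
// so a 1 MB `message` field is never transferred unless it is shown.
async function searchDisplayedColumns(
  columns: string[] // e.g. ['@timestamp', 'host']
) {
  const res = await fetch('http://localhost:9200/logstash-*/_search', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      size: 500,
      _source: columns, // source filtering: include only these fields
      sort: [{ '@timestamp': { order: 'desc' } }],
    }),
  });
  const body = await res.json();
  return body.hits.hits;
}
```

The expanded-document view could then lazy-load the full `_source` with a second request by `_id`, which matches the lazy-loading behavior requested above.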
Kibana version: 5.3.1
Elasticsearch version: 5.3.1
Description of the problem including expected versus actual behavior:
Based on the conversation in #7755 (comment).
Documents with large fields cause sluggishness in various parts of the Discover UI. The initial render is slow, opening and closing individual documents is slow, and switching between the Table/JSON tabs is slow. There are probably other slow areas as well. #9014 improved things quite a bit; prior to that PR, Discover couldn't even load a 1 MB doc without crashing the browser. However, we should still try to improve things further.
Steps to reproduce:
Here's a demonstration of what happens on my machine when I load a doc with a 2.3MB message field.
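To reproduce something similar locally, one can index a single document with a multi-megabyte field. A minimal sketch, assuming a local Elasticsearch 5.x cluster at `localhost:9200`; the index and type names are arbitrary:

```typescript
// Index one document with a ~2.3 MB `message` field, then point a Kibana
// index pattern at `bigdocs` and open Discover. Host, index, and type
// names are assumptions for illustration.
async function indexBigDoc(megabytes = 2.3): Promise<void> {
  const message = 'x'.repeat(Math.round(megabytes * 1024 * 1024));
  const res = await fetch('http://localhost:9200/bigdocs/doc/1', {
    method: 'PUT',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ '@timestamp': new Date().toISOString(), message }),
  });
  if (!res.ok) throw new Error(`indexing failed: ${res.status}`);
}
```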