Improve performance of Discover with large fields #11457
Some additional details: I am using the latest Chrome on a newish MacBook Pro. Is there a non-minified version of Kibana I can download to test? Right now the Chrome profiler attributes everything to commons.bundle.js, so it isn't very helpful. :) Chances are my specific instance has a few of these larger docs showing up in Discover, causing the extra slowness per full page load. If you want specifics, my examples of big messages tend to be giant SQL queries with megabytes' worth of CSV ids: `DELETE FROM foo WHERE id IN ( 12345,54331,968574,.... )` or similar.
@msporleder-work the best way to get the un-minified source would be to clone the repo from GitHub and start up Kibana in dev mode. I'll try indexing lots of large docs tomorrow and see how that affects things on my machine.
Adding more docs (unsurprisingly) slows things down, probably linearly. With 50 docs (100 MB total), Discover took at least 5 minutes to load; I stopped watching at one point. I'm surprised it didn't crash. I don't think there will be any quick fix for this amount of data. @weltenwort Something to think about as you ponder the doc table refactor. @msporleder-work can you tell us a bit more about your use case? Do you need to see those giant fields in their entirety, or do you just need to search on them? I'm trying to think of other ways you could accomplish your goals.
I can probably accomplish my goals and keep stability by figuring out a way for Logstash to truncate the fields to < 256 KB. If anyone is interested, my use case for these giant entries is streaming in MySQL's slow.log, one entry per SQL statement. This lets me quickly count slow queries per server/cluster and point analysts/devs to a Kibana query like `host:"^warehouse" AND source:"slow.log"` (or whatever) to get a nice list of all the queries we need to fix. For whatever reason our queries tend to get big.
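For reference, Logstash ships a `truncate` filter plugin that can cap field sizes declaratively. A minimal sketch of the same idea as a hypothetical pre-indexing step in TypeScript (the 256 KB limit and the byte-based truncation are assumptions, not the commenter's actual pipeline):

```typescript
// Hypothetical pre-indexing step that caps a string field at 256 KB,
// similar in spirit to Logstash's `truncate` filter plugin.
const MAX_FIELD_BYTES = 256 * 1024;

function truncateField(doc: Record<string, unknown>, field: string): void {
  const value = doc[field];
  if (typeof value !== 'string') return;
  const bytes = new TextEncoder().encode(value);
  if (bytes.length <= MAX_FIELD_BYTES) return;
  // Slicing bytes may cut a multi-byte character in half; TextDecoder
  // replaces the partial trailing character with U+FFFD rather than throwing.
  doc[field] = new TextDecoder().decode(bytes.slice(0, MAX_FIELD_BYTES));
}
```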
@Bargs thanks for including such comprehensive instructions to reproduce the effect. I will try to diagnose whether the bottleneck is the loading/processing or the rendering; I suspect all of the above 😉 Based on my intuition, there are several improvements we should consider.
I would be very motivated to tackle those as soon as I have completed the next stage of the context view (I've already started on the React/Redux aspect on the side).
As an aside, we need to make sure to communicate with @kobelb about this because he may be using the available field list in CSV export.
@Bargs thanks for looping me in on this! Obviously, I don't want the needs of any sharing integration to impose limitations on how you implement Discover. I was planning on utilizing the list of available fields.
Actually, I was not quite correct: the field caps API does not provide the set of available columns. But the point still stands: we should get the information in a more scalable way than iterating client-side. @kobelb I agree, it sounds like that would be a subset of what Discover needs anyway.
@weltenwort yeah... it's not exactly what we want, but it was the closest thing @Bargs and I were able to find. Depending on the order in which results are returned, it's possible that certain columns could be missed.
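For context, here is roughly how the field capabilities API can be queried. It reports per-field mapping capabilities (type, searchable, aggregatable), not which fields are actually populated in matching documents, which is why it falls short here. A sketch assuming a local cluster and a `logstash-*` index pattern (both illustrative), in a runtime with global `fetch`:

```typescript
// Query the field capabilities API (_field_caps). It describes mapped
// fields, not per-document presence, so it cannot tell us which columns
// actually appear in the hits a search returns.
async function listMappedFields(
  host = 'http://localhost:9200', // assumption: local dev cluster
  index = 'logstash-*'            // assumption: illustrative index pattern
): Promise<string[]> {
  const res = await fetch(`${host}/${index}/_field_caps?fields=*`);
  if (!res.ok) throw new Error(`field_caps failed: ${res.status}`);
  const body = await res.json();
  return Object.keys(body.fields); // mapped field names only
}
```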
Just noticed this behavior: Kibana loads all data into the client for the first 500 entries regardless of whether or not the columns are currently shown in the table. If I have a field with a 1 MB value, this causes a 500 MB response even though it's not shown in the table. I would love Kibana to only load the fields that are displayed and lazy-load the full documents on expansion.
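A partial version of that behavior is possible today with Elasticsearch source filtering, which restricts which fields come back in `_source`. A sketch of the idea, assuming a local cluster; the index pattern, sort field, and column names are illustrative:

```typescript
// Fetch the first 500 hits but only the currently displayed columns,
// so a 1 MB `message` field is never transferred unless it is shown.
async function searchDisplayedColumns(
  columns: string[] // e.g. ['@timestamp', 'host']
) {
  const res = await fetch('http://localhost:9200/logstash-*/_search', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      size: 500,
      _source: columns, // source filtering: include only these fields
      sort: [{ '@timestamp': { order: 'desc' } }],
    }),
  });
  const body = await res.json();
  return body.hits.hits;
}
```

The expanded-document view could then lazy-load the full `_source` with a second request by `_id`, which matches the lazy-loading behavior requested above.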
Kibana version: 5.3.1
Elasticsearch version: 5.3.1
Description of the problem including expected versus actual behavior:
Based on the conversation in #7755 (comment).
Documents with large fields cause sluggishness in various parts of the Discover UI. The initial render is slow, opening and closing individual documents is slow, and switching between the Table/JSON tabs is slow. There are probably other slow areas as well. #9014 improved things quite a bit; prior to that PR, Discover couldn't even load a 1 MB doc without crashing the browser. However, we should still try to improve things further.
Steps to reproduce:
Here's a demonstration of what happens on my machine when I load a doc with a 2.3MB message field.
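To reproduce something similar locally, one can index a single document with a multi-megabyte field. A minimal sketch, assuming a local Elasticsearch 5.x cluster at `localhost:9200`; the index and type names are arbitrary:

```typescript
// Index one document with a ~2.3 MB `message` field, then point a Kibana
// index pattern at `bigdocs` and open Discover. Host, index, and type
// names are assumptions for illustration.
async function indexBigDoc(megabytes = 2.3): Promise<void> {
  const message = 'x'.repeat(Math.round(megabytes * 1024 * 1024));
  const res = await fetch('http://localhost:9200/bigdocs/doc/1', {
    method: 'PUT',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ '@timestamp': new Date().toISOString(), message }),
  });
  if (!res.ok) throw new Error(`indexing failed: ${res.status}`);
}
```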