Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Database performance? #171

Closed
Alystrasz opened this issue Jan 7, 2020 · 19 comments
Closed

Database performance? #171

Alystrasz opened this issue Jan 7, 2020 · 19 comments

Comments

@Alystrasz
Copy link

Hello guys,

Pushing lots of Location entities on my gost instance (a bit more than 2 million), the system starts to behave strangely:

  • some of the requests run normally;
  • others take ages, or even throw 502/504 errors.
# Example from my instance (running the Docker image)
http://indoor-analytics.lille.inria.fr/v1.0/Things(1129)/Locations [resolves in 15ms]
http://indoor-analytics.lille.inria.fr/v1.0/Things(1130)/Locations [resolves in ~17s]

Maybe related to #152 ?
Cheers!

@rouvoy
Copy link

rouvoy commented Jan 17, 2020

This issue is critical for us and question the relevance of GOST for our projet, do you have any insight on the underlying reason?

@bertt
Copy link
Contributor

bertt commented Jan 17, 2020

Hi, whats the difference between Things(1130)/Locations and Things(1129)/Locations?

@Alystrasz
Copy link
Author

Hi,
Not much about the entities themselves, except the request time to get them:

  • Things(1129)/Locations takes 15 milliseconds to resolve;
  • Things(1130)/Locations takes 17 seconds to resolve.

@bertt
Copy link
Contributor

bertt commented Jan 17, 2020

but I guess 1130 has many (historicall)locations (how many?)

@Alystrasz
Copy link
Author

@bertt
Copy link
Contributor

bertt commented Jan 17, 2020

one way to investigate is to get sql database access and reproduce the 1130 behaviour (using a query containing thing, location, historicallocation). Maybe an index is missing somewhere.

@bertt
Copy link
Contributor

bertt commented Jan 20, 2020

I've added a document to get to the performed sql queries quickly, see https://github.com/gost/docs/blob/master/gost_debug_sql_queries.md

So for the query 'Things(1)/Locations' the query is:

SELECT A_location.location_id AS A_location_id, A_location.location_name AS A_location_name, A_location.location_description AS A_location_description, A_location.location_encodingtype AS A_location_encodingtype, A_location.location_geojson AS A_location_geojson FROM (SELECT location.id AS location_id, location.name AS location_name, location.description AS location_description, location.encodingtype AS location_encodingtype, location.geojson::text AS location_geojson FROM v1.location WHERE (SELECT thing.id AS thing_id FROM v1.thing INNER JOIN v1.thing_to_location ON thing.id = thing_to_location.thing_id AND location.id = thing_to_location.location_id WHERE thing.id = 1) IS NOT NULL ORDER BY location_id DESC LIMIT 2 OFFSET 0) AS A_location

@Alystrasz
Copy link
Author

Thanks for the GOST_LOG_VERBOSE_FLAG exposition, saves us some time!

The query on the database for http://indoor-analytics.lille.inria.fr/v1.0/Things(1130)/Locations is :

SELECT A_location.location_id AS A_location_id, 
               A_location.location_name AS A_location_name, 
               A_location.location_description AS A_location_description, 
               A_location.location_encodingtype AS A_location_encodingtype, 
               A_location.location_geojson AS A_location_geojson 
               FROM (SELECT location.id AS location_id, 
                                           location.name AS location_name, 
                                           location.description AS location_description, 
                                           location.encodingtype AS location_encodingtype, 
                                           location.geojson::text AS location_geojson 
                                           FROM v1.location  
                                           WHERE (SELECT thing.id AS thing_id FROM v1.thing 
                                                          INNER JOIN v1.thing_to_location 
                                                          ON thing.id = thing_to_location.thing_id AND location.id = thing_to_location.location_id 
                                                          WHERE thing.id = 1130) 
                                           IS NOT NULL 
                                           ORDER BY location_id DESC 
                                           LIMIT 2 OFFSET 0
               ) AS A_location

Here's the output for the explain analyse on this request :

                                                                         QUERY PLAN                                                                         
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..26.18 rows=2 width=62) (actual time=0.284..19443.086 rows=1 loops=1)
   ->  Index Scan Backward using location_pkey on location  (cost=0.43..29655620.18 rows=2303788 width=62) (actual time=0.283..19443.083 rows=1 loops=1)
         Filter: ((SubPlan 1) IS NOT NULL)
         Rows Removed by Filter: 2316096
         SubPlan 1
           ->  Nested Loop  (cost=0.70..12.75 rows=1 width=8) (actual time=0.007..0.007 rows=0 loops=2316097)
                 ->  Index Only Scan using thing_pkey on thing  (cost=0.27..4.29 rows=1 width=8) (actual time=0.002..0.003 rows=1 loops=2316097)
                       Index Cond: (id = 1130)
                       Heap Fetches: 0
                 ->  Index Scan using fki_location_1 on thing_to_location  (cost=0.43..8.45 rows=1 width=8) (actual time=0.003..0.003 rows=0 loops=2316097)
                       Index Cond: (location.id = location_id)
                       Filter: (thing_id = 1130)
                       Rows Removed by Filter: 1
 Planning time: 0.489 ms
 Execution time: 19443.163 ms
(15 lignes)

We believe the problem hides within the nested loop; maybe an index issue (regarding location.id = location_id) ?

@rouvoy
Copy link

rouvoy commented Jan 21, 2020

Our preliminary investigations confirm that using LIMIT may kill the performance of the query.

On a short term, can you consider only including it in the query when needed? This would avoid triggering the Index Scan Backward which induces a processing overhead that can be avoided.

On a longer term, it looks like the nested queries generated by the query builder would benefit from being simplified (by preferring joins over nested select) to drastically reduce the execution time.

@bertt
Copy link
Contributor

bertt commented Jan 22, 2020

ok, it would be helpful to have a small reproducible case, is there something available? Script should contain http post thing, http post locations in a loop and the to be optimized test query (http get).

@rouvoy
Copy link

rouvoy commented Jan 22, 2020

Sure, we can share the script we use to load the GeoLife dataset but, given the problem occurs when increasing the number of registered objects and locations, it may take some time to execute (although it would be a good case for benchmarking GOST).

@Alystrasz
Copy link
Author

You can download the scripts we used to upload GeoLife stuff into gost here : https://cloud.remyraes.com/s/yxp43RJmieqYpx7 (password is Geolife-STA-0)

We only uploaded locations for the thing n°153 of the dataset, which contains 2024 paths and took around 19h47 to upload to gost with this script.

(In case you're wondering, frost-bridge is a small sensorthings wrapper we use to create entities on our VM)

@bertt
Copy link
Contributor

bertt commented Jan 23, 2020

ok I can run download/import, but it's unclear to me where to specify the target SensorThings server.

@Alystrasz
Copy link
Author

By default, it targets our VM; however the function to change the target server is not exposed publicly, I will come back to you when it's the case!

@Alystrasz
Copy link
Author

I have to refactor the way we handle configurations within the library.

I still added a method for you to be able to target any STA server, using this code sample:

await Settings.setTargetServer('localhost', '/v1.0/', 8080);

using the 1.0.25 version of frost-bridge.


I uploaded there (password: Geolife-STA-1) the geolife script we used with a localhost config that you can edit to suit your needs (importDataset.js:101).

@bertt
Copy link
Contributor

bertt commented Jan 30, 2020

I've succesfully executed download/import (for nr 153), tool output:

$ npm run dataset:import 153

Thing n°153 | 2024/2024 files | 2159018/2159018 locations imported | 55:52

In the database I see now 2156994 locations, 3 things (1,2,3), 2156994 historicallocation. Only thing_id =3 has locations.

Some remarks:

  • Import is far quicker than the 19h47 reported (0h55)

  • Some locations are missing? 2159018 - 2156994 = 2024 location 'missing'

@bertt
Copy link
Contributor

bertt commented Jan 30, 2020

Perf testing, requesting /v1.0/Things(3)/Locations gives:

time="2020-01-30T10:20:53Z" level=info msg="Started GOST HTTP Server on :8080" package=gost.server.http
time="2020-01-30T10:21:40Z" level=debug msg="GET start: /v1.0/Things(3)/Locations" package=gost.server.http
time="2020-01-30T10:21:40Z" level=debug msg="constructing select query" elapsed=1.594726ms package=gost.server.database.postgis
time="2020-01-30T10:21:40Z" level=debug msg="execute select query: SELECT A_location.location_id AS A_location_id, A_location.location_name AS A_location_name, A_location.location_description AS A_location_description, A_location.location_encodingtype AS A_location_encodingtype, A_location.location_geojson AS A_location_geojson FROM (SELECT location.id AS location_id, location.name AS location_name, location.description AS location_description, location.encodingtype AS location_encodingtype, location.geojson::text AS location_geojson FROM v1.location  WHERE (SELECT thing.id AS thing_id FROM v1.thing INNER JOIN v1.thing_to_location ON thing.id = thing_to_location.thing_id AND location.id = thing_to_location.location_id WHERE thing.id = 3) IS NOT NULL ORDER BY location_id DESC LIMIT 2 OFFSET 0) AS A_location " elapsed=17.62085ms package=gost.server.database.postgis
time="2020-01-30T10:21:40Z" level=debug msg="GET done: /v1.0/Things(3)/Locations" elapsed=20.262687ms package=gost.server.http
time="2020-01-30T10:24:06Z" level=debug msg="GET start: /v1.0/Things(3)/Locations" package=gost.server.http
time="2020-01-30T10:24:06Z" level=debug msg="constructing select query" elapsed="185.966µs" package=gost.server.database.postgis
time="2020-01-30T10:24:06Z" level=debug msg="execute select query: SELECT A_location.location_id AS A_location_id, A_location.location_name AS A_location_name, A_location.location_description AS A_location_description, A_location.location_encodingtype AS A_location_encodingtype, A_location.location_geojson AS A_location_geojson FROM (SELECT location.id AS location_id, location.name AS location_name, location.description AS location_description, location.encodingtype AS location_encodingtype, location.geojson::text AS location_geojson FROM v1.location  WHERE (SELECT thing.id AS thing_id FROM v1.thing INNER JOIN v1.thing_to_location ON thing.id = thing_to_location.thing_id AND location.id = thing_to_location.location_id WHERE thing.id = 3) IS NOT NULL ORDER BY location_id DESC LIMIT 2 OFFSET 0) AS A_location " elapsed="902.895µs" package=gost.server.database.postgis
time="2020-01-30T10:24:06Z" level=debug msg="GET done: /v1.0/Things(3)/Locations" elapsed=1.399727ms package=gost.server.http

So first two requests within 20 ms. Any more ideas how to reproduce the behaviour you are seeing?

@Alystrasz
Copy link
Author

I used the tool to import the n°153 on a new gost instance in localhost, it created one single thing and imported all locations in 55 mins 52 secs too! Maybe excessive importation time is due to network stuff or VM configuration, will investigate about this.

(regarding the 2024 "missing" locations, due to 2024 also being the number of files, I highly suspect an error in the counter rather than in the importation process)


Requesting /v1.0/Things(1)/Locations now seems to behave normally:

time="2020-01-30T13:49:44Z" level=debug msg="GET start: /v1.0/Things(1)/Locations" package=gost.server.http
time="2020-01-30T13:49:44Z" level=debug msg="constructing select query" elapsed=1.405835ms package=gost.server.database.postgis
time="2020-01-30T13:49:44Z" level=debug msg="execute select query: SELECT A_location.location_id AS A_location_id, A_location.location_name AS A_location_name, A_location.location_description AS A_location_description, A_location.location_encodingtype AS A_location_encodingtype, A_location.location_geojson AS A_location_geojson FROM (SELECT location.id AS location_id, location.name AS location_name, location.description AS location_description, location.encodingtype AS location_encodingtype, location.geojson::text AS location_geojson FROM v1.location  WHERE (SELECT thing.id AS thing_id FROM v1.thing INNER JOIN v1.thing_to_location ON thing.id = thing_to_location.thing_id AND location.id = thing_to_location.location_id WHERE thing.id = 1) IS NOT NULL ORDER BY location_id DESC LIMIT 2 OFFSET 0) AS A_location " elapsed=11.508319ms package=gost.server.database.postgis
time="2020-01-30T13:49:44Z" level=debug msg="GET done: /v1.0/Things(1)/Locations" elapsed=13.49516ms package=gost.server.http

time="2020-01-30T13:53:06Z" level=debug msg="GET start: /v1.0/Things(1)/Locations" package=gost.server.http
time="2020-01-30T13:53:06Z" level=debug msg="constructing select query" elapsed="161.724µs" package=gost.server.database.postgis
time="2020-01-30T13:53:06Z" level=debug msg="execute select query: SELECT A_location.location_id AS A_location_id, A_location.location_name AS A_location_name, A_location.location_description AS A_location_description, A_location.location_encodingtype AS A_location_encodingtype, A_location.location_geojson AS A_location_geojson FROM (SELECT location.id AS location_id, location.name AS location_name, location.description AS location_description, location.encodingtype AS location_encodingtype, location.geojson::text AS location_geojson FROM v1.location  WHERE (SELECT thing.id AS thing_id FROM v1.thing INNER JOIN v1.thing_to_location ON thing.id = thing_to_location.thing_id AND location.id = thing_to_location.location_id WHERE thing.id = 1) IS NOT NULL ORDER BY location_id DESC LIMIT 2 OFFSET 0) AS A_location " elapsed="687.276µs" package=gost.server.database.postgis
time="2020-01-30T13:53:06Z" level=debug msg="GET done: /v1.0/Things(1)/Locations" elapsed=1.072188ms package=gost.server.http

We are now investigating towards VM configuration and SQL queries optimisation.

@bertt
Copy link
Contributor

bertt commented Jan 31, 2020

ok, i'm closing this issue. If there is a reproducible performance issue you can open a new one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants