Processing large volumes of TraFF messages massively slows down Navit #815

Closed

mvglasow opened this issue Jul 23, 2019 · 5 comments

@mvglasow
Contributor

So far, the only real-world use case for traffic in Navit relied on TMC. Even at busy times, there would usually be no more than around 150 active messages, most of which would expire after an hour. This does cause some delays on startup, which can be annoying but is still manageable.

Now I have built another source which retrieves its messages from the Internet, delivering as many as 300 messages in a single batch and keeping some 600 messages active at a time. (The volume can be limited by enabling only the sources the user requires, but these figures are typical for Poland and Lithuania combined, which is what is needed when traveling in the border area.)

With that volume of data, Navit slows down to the point of being unusable for several minutes after startup. I expect similar effects whenever a new feed is received, unless most of its messages are already in the cache with an unchanged location.

I see the following ways to address this:

  • Cut down on the time needed for location matching; this reduces the time to parse a location from scratch (both when receiving a new location and when reading cached messages):
    • Reduce the geographic area of the map to the minimum needed: less data to read from the map, smaller route graph to examine (see the sketch after this list)
    • Ignore certain way types (e.g. if the road class is MOTORWAY, we may consider ramps as well as trunk and primary roads, but not service roads): smaller route graph to examine
    • Examine only the tile levels needed for that particular way type: less data to read from the map, smaller route graph to examine
  • Cache the traffic map, with references to the map from which the data was obtained and to the message. On startup, check which messages have expired, and drop ways no longer associated with an active message. If ways are no longer present in the map from which they were taken, or their geometry has changed, re-parse the locations of the associated messages. This would speed up startup (unless the map has changed) but would not help with new locations.
  • Implement multithreading: message processing would run in a separate thread while other things could happen on the main thread. This is currently being worked on. However, every thread will need to access map data (and these operations could result in threads locking each other up), so there are still some big question marks. Also, multithreading will not eliminate all performance issues by itself, so the first two items will still be beneficial.
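
To illustrate the first point, below is a minimal sketch of computing a padded search rectangle from a location's points; only map data inside that rectangle then needs to be read and fed into the route graph. The types and names are made up for this illustration and are not Navit's actual API, and in real code the padding would be converted from meters to projected units.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical types for this sketch, not Navit's own API:
 * a projected coordinate and a rectangle (left-upper / right-lower corner). */
struct pcoord { int x, y; };
struct prect  { struct pcoord lu, rl; };

/*
 * Compute the enclosing rectangle of all points of a location and pad it
 * by `padding` projected units. Only map data inside this rectangle is
 * then read and fed into the route graph used for matching.
 */
static struct prect
location_search_rect(const struct pcoord *pts, size_t count, int padding)
{
    struct prect r = { { INT_MAX, INT_MIN }, { INT_MIN, INT_MAX } };
    size_t i;

    for (i = 0; i < count; i++) {
        if (pts[i].x < r.lu.x) r.lu.x = pts[i].x;
        if (pts[i].x > r.rl.x) r.rl.x = pts[i].x;
        if (pts[i].y > r.lu.y) r.lu.y = pts[i].y;
        if (pts[i].y < r.rl.y) r.rl.y = pts[i].y;
    }
    r.lu.x -= padding; r.lu.y += padding;
    r.rl.x += padding; r.rl.y -= padding;
    return r;
}
```

Shrinking this rectangle cuts both the amount of map data read and the size of the route graph built from it.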
@mvglasow
Copy link
Contributor Author

Tested with a feed of 443 messages (with events relevant for Navit so they will not get discarded immediately). Original processing time (on startup) is around 3 minutes.

The geographic area was originally the enclosing rectangle of the endpoints padded by 1000 m. Reducing the padding to 100 m shaves somewhere around 15 seconds off the processing time for the whole feed. The generous padding is only necessary to match locations with LOW_RES fuzziness (which is the case for TMC but not here); I therefore now use 1000 m for LOW_RES and 100 m otherwise. Some padding is still needed to allow for bends in the road.
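
For reference, the padding rule boils down to something like the sketch below; the enum and function names are assumptions made for the illustration, not the identifiers actually used in the code.

```c
/* Hypothetical names for this sketch; TraFF defines LOW_RES among its
 * fuzziness values, but the actual identifiers in the code may differ. */
enum fuzziness { FUZZINESS_NONE, FUZZINESS_LOW_RES };

/* Padding around the enclosing rectangle of a location's points, in meters. */
static int
location_padding_m(enum fuzziness f)
{
    /* LOW_RES locations (e.g. TMC) may be noticeably off the actual road,
     * so keep the generous 1000 m; otherwise 100 m is enough to allow for
     * bends in the road between the given points. */
    return f == FUZZINESS_LOW_RES ? 1000 : 100;
}
```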

Constraining the search by tile level (the one which corresponds to the road level below the one indicated) had no measurable effect.

Constraining the route graph to certain way types (no more than one level below the target road level if that is MOTORWAY, TRUNK or PRIMARY) dramatically improved performance, with matching now taking only some 50% of the time. (Higher road types are always considered; the bulk of ways, which are typically local roads, service roads and paths, are the ones excluded. Also note that most messages indicated a road class; this change will not improve performance for messages without one.)
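
The filter amounts roughly to the rule sketched below, assuming generic road-class constants ordered from most to least important; the real code works on Navit item types and also admits ramps, which are not modeled here.

```c
/* Hypothetical road classes for this sketch, ordered from most to least
 * important; not the item types used in the actual code. */
enum road_class {
    ROAD_CLASS_MOTORWAY,
    ROAD_CLASS_TRUNK,
    ROAD_CLASS_PRIMARY,
    ROAD_CLASS_SECONDARY,
    ROAD_CLASS_TERTIARY,
    ROAD_CLASS_OTHER
};

/*
 * Decide whether a way of class `way` is admitted into the route graph for a
 * location whose indicated road class is `target`: higher classes are always
 * admitted, lower classes only down to one level below the target, and no
 * filtering happens at all unless the target is MOTORWAY, TRUNK or PRIMARY.
 */
static int
route_graph_admit(enum road_class target, enum road_class way)
{
    if (target > ROAD_CLASS_PRIMARY)
        return 1;              /* no road-class filtering for minor roads */
    return way <= target + 1;  /* target class and better, plus one level below */
}
```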

Altogether, the 443-message feed which previously took some three minutes to process now takes some 80 seconds. That is still a lot, but much better than before.

@mvglasow
Contributor Author

Another approach to improving performance would be lazy matching: locations are matched to the map only when a part of the map is queried that overlaps with their search rectangle (the enclosing rectangle of all points plus padding). When a location is added, a redraw is triggered anyway, which results in a map query for the area currently being displayed; we would additionally need to check whether a route is active and “tickle” its map rectangles on the traffic map so that those locations get resolved.
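
In rough terms, such a lazy-matching hook could look like the sketch below; `struct traffic_message`, `rect_overlaps` and the field names are assumptions made for illustration, not the actual data structures.

```c
#include <stddef.h>

/* Same hypothetical rectangle type as in the earlier sketch. */
struct pcoord { int x, y; };
struct prect  { struct pcoord lu, rl; };

/* A traffic message with its precomputed search rectangle (enclosing
 * rectangle of all points plus padding) and a flag for lazy matching. */
struct traffic_message {
    struct prect search_rect;
    int resolved;
    /* ... */
};

/* Two rectangles overlap if neither lies entirely beside or above the other
 * (lu is the left-upper corner, rl the right-lower corner). */
static int
rect_overlaps(const struct prect *a, const struct prect *b)
{
    return a->lu.x <= b->rl.x && b->lu.x <= a->rl.x &&
           a->rl.y <= b->lu.y && b->rl.y <= a->lu.y;
}

/* Called whenever a rectangle of the traffic map is queried (e.g. for a
 * redraw or for the active route); only overlapping, still-unresolved
 * locations are matched at that point. */
static void
traffic_map_rect_queried(struct traffic_message **msgs, size_t count,
                         const struct prect *queried)
{
    size_t i;
    for (i = 0; i < count; i++) {
        if (!msgs[i]->resolved &&
            rect_overlaps(&msgs[i]->search_rect, queried)) {
            /* ...match the location against the map here... */
            msgs[i]->resolved = 1;
        }
    }
}
```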

@mvglasow
Contributor Author

…and for the test interface, we need to match all locations if we are to export them to GPX.
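
For reference, a minimal GPX export of the matched points could look like the following sketch (coordinates assumed to already be converted to degrees; this is not the actual test-interface code).

```c
#include <stdio.h>
#include <stddef.h>

/* Geographic coordinate in degrees. */
struct geo_coord { double lat, lon; };

/* Dump the points of one matched location as a single GPX track so the
 * result can be inspected in any GPX viewer. */
static void
write_gpx(FILE *out, const char *name, const struct geo_coord *pts, size_t count)
{
    size_t i;
    fprintf(out, "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
    fprintf(out, "<gpx version=\"1.1\" creator=\"traffic-test\">\n");
    fprintf(out, " <trk><name>%s</name><trkseg>\n", name);
    for (i = 0; i < count; i++)
        fprintf(out, "  <trkpt lat=\"%.6f\" lon=\"%.6f\"/>\n", pts[i].lat, pts[i].lon);
    fprintf(out, " </trkseg></trk>\n</gpx>\n");
}
```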

@mvglasow
Contributor Author

#822 implements lazy location matching. There is no “tickling” of map rectangles; instead, new messages have their location expanded immediately when they are found to overlap with the selection of the active route.

@mvglasow
Contributor Author

mvglasow commented Aug 11, 2019

In the meantime, #822 has been extended to cache items upon resolving a location, and to restore these items on startup if they still correspond to map items (the latter may not be the case if the map has changed). If cached items are found to be stale (i.e. map data has changed since), they are discarded and will be regenerated from scratch the next time they are needed.
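
Conceptually, the restore-on-startup check behaves like the sketch below; the structure and the two map lookups are assumptions made for illustration, not the actual #822 code.

```c
#include <stddef.h>

/* A cached map item: the id of the map item a resolved location segment was
 * derived from, plus a hash of that item's geometry at caching time. */
struct cached_item {
    long long item_id;
    unsigned  geom_hash;
    int       valid;
};

/* Hypothetical lookups against the current map, assumed for this sketch. */
extern int      map_item_exists(long long item_id);
extern unsigned map_item_geom_hash(long long item_id);

/*
 * On startup, mark each cached item as valid or stale. Stale items (the map
 * item is gone or its geometry has changed) are dropped and regenerated from
 * scratch the next time the corresponding location is needed.
 */
static void
validate_cached_items(struct cached_item *items, size_t count)
{
    size_t i;
    for (i = 0; i < count; i++)
        items[i].valid = map_item_exists(items[i].item_id) &&
                         map_item_geom_hash(items[i].item_id) == items[i].geom_hash;
}
```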

That implements almost all of the proposed features:

  • Minimize map reads to build the route graph for a traffic location
  • Minimize the number of segments in the route graph for a traffic location
  • Resolve locations from scratch only when they are needed
  • Cache resolved locations and reuse them as long as the location and the map data have not changed

The only remaining feature is multithreading, which is a separate issue (#760). Therefore this can be closed as soon as #822 is sufficiently tested and merged.
