
Improve loadRecords performance in SQLiteNormalizedCache #1519

Conversation

@bharath2020 (Contributor) commented Nov 15, 2020

Motivation:

We noticed that the performance of loadRecords on SQLiteNormalizedCache degrades, especially when a query field has arguments, which makes the CacheKey strings longer. In our measurements, loading a nominal 200 records from the SQLite store takes around 4.2 seconds, and the fix in this PR decreases the time consumed by loadRecords to about 0.8 seconds, more than a 4x improvement.

Here is a screenshot of the Time Profiler trace indicating the time spent in the == infix operator:

[Screenshot: Time Profiler trace, Nov 13 2020]

What is the root cause of the performance issue?

To answer that, let us take a look at the existing code in the loadRecords method:

let recordsOrNil: [Record?] = keys.map { key in                        // Loop over the N requested keys
    if let recordIndex = records.firstIndex(where: { $0.key == key }) { // Linear scan over N records for each key
        return records[recordIndex]
    }
    return nil
}

Here are a few problems with the above code:

  1. The first two lines of the above code have O(N^2) complexity, and that is before accounting for the cost of each key comparison using ==.
  2. The stack trace from the Time Profiler in the attached screenshot shows that the key comparison does not use a hash; each pair of strings is compared for equality character by character. This multiplies the O(N^2) lookups by another factor proportional to the length of the CacheKey.

The performance degradation is most evident in queries whose fields take arguments, as in the example below:

type Query {
  searchEmployees(organizationId: String!): [Employee]
}

For the query searchEmployees(organizationId: "<16 char UUID>"), a sample CacheKey for a record would be QUERY_ROOT.searchEmployees(organizationId: "<16 char UUID>").<EmployeeField>. Comparing such keys becomes even more time-consuming as the CacheKey length grows, which is exactly what happened in our case.
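
To make the cost concrete, here is a minimal standalone sketch (not Apollo code; the record type, key format, and record count are illustrative stand-ins, scaled to 5,000 records so the difference shows up in a quick run) comparing the linear scan above against a hash-based index for long, argument-bearing keys:

import Foundation

// Illustrative stand-in; Apollo's Record/CacheKey types are not used here.
struct FakeRecord { let key: String }

let argument = UUID().uuidString
let records: [FakeRecord] = (0..<5_000).map { index in
    FakeRecord(key: "QUERY_ROOT.searchEmployees(organizationId: \"\(argument)\").employee\(index)")
}
let keys = records.map { $0.key }.shuffled()

// O(N^2): a linear scan per requested key, each comparing long strings for equality.
var start = Date()
let viaScan: [FakeRecord?] = keys.map { key in
    records.first(where: { $0.key == key })
}
print("linear scan:", Date().timeIntervalSince(start), "found:", viaScan.compactMap { $0 }.count)

// O(N): build a key-to-index dictionary once, then do hash-based lookups.
start = Date()
let indexByKey = Dictionary(uniqueKeysWithValues: records.enumerated().map { ($1.key, $0) })
let viaIndex: [FakeRecord?] = keys.map { key in
    indexByKey[key].map { records[$0] }
}
print("index lookup:", Date().timeIntervalSince(start), "found:", viaIndex.compactMap { $0 }.count)

The absolute numbers will vary by machine; the point is only that the scan cost grows quadratically with the number of records while the index approach grows linearly.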

Solution:

// Map each CacheKey to the index of its first matching record;
// iterating the indices in reverse means the earliest index wins for duplicate keys.
let recordsIndexMap = records.indices.reversed().reduce(into: [:]) { resultMap, index in
    resultMap[records[index].key, default: index] = index
}

// One hash-based lookup per requested key; nil where no record matches.
let recordsOrNil: [Record?] = keys.map { key in
    recordsIndexMap[key].flatMap { records[$0] }
}

This reduces the complexity of the snippet to O(N) in two steps:

  • Convert the list of records [Record] returned by the selectRecords method into a dictionary recordsIndexMap whose keys are CacheKeys and whose values are the indices of the corresponding records in [Record]. This gives an O(1) lookup of the matching Record for a given CacheKey.
  • Construct recordsOrNil by looking up each requested key in recordsIndexMap, making the entire snippet O(N) and reducing the overall complexity from O(N^2) to O(N) (see the sketch below).
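
For completeness, here is a small self-contained sketch of the same index-map technique (using a stand-in record type rather than the real Apollo Record), showing that the output preserves the order of the requested keys, returns nil for misses, and prefers the first occurrence of a duplicate key:

typealias CacheKey = String

// Stand-in for Apollo's Record; only the key matters for ordering.
struct StubRecord { let key: CacheKey; let name: String }

func order(_ records: [StubRecord], by keys: [CacheKey]) -> [StubRecord?] {
    // Map each CacheKey to the index of its first occurrence in `records`.
    // Iterating the indices in reverse and overwriting means the earliest index wins.
    let recordsIndexMap = records.indices.reversed().reduce(into: [CacheKey: Int]()) { resultMap, index in
        resultMap[records[index].key] = index
    }
    // One O(1) hash lookup per requested key; nil where no record matches.
    return keys.map { key in
        recordsIndexMap[key].map { records[$0] }
    }
}

let fetched = [
    StubRecord(key: "QUERY_ROOT", name: "root"),
    StubRecord(key: "Employee:1", name: "first"),
    StubRecord(key: "Employee:1", name: "duplicate"),
]
let ordered = order(fetched, by: ["Employee:1", "Employee:2", "QUERY_ROOT"])
print(ordered.map { $0?.name ?? "nil" })  // ["first", "nil", "root"]

The dictionary build is O(N) and each lookup is O(1) on average, so each key is hashed once rather than compared against up to N other keys.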

@bharath2020 (Contributor, Author)

Looks like the tests are a bit flaky. Also, I do not have permission to re-run them. Please advise.

@designatednerd (Contributor)

Yes, the tests are extremely flaky - I'll kick them.

That's awesome that it seems like it helps performance a bunch - can you explain a bit about why this improves performance? Would also love to hear thoughts from @martijnwalraven, who's spent the last several weeks poking around in the cache.

@bharath2020 (Contributor, Author)

@designatednerd I have updated the original PR description (so future readers find it in one place) to explain the root cause of the problem and how I solved it. Hope it explains things well.

Co-authored-by: TizianoCoroneo <tizianocoroneo@me.com>
@bharath2020 force-pushed the fix/load-records-performance-issue branch from fd31996 to 293dd3b on November 16, 2020 at 14:07
@designatednerd (Contributor)

Awesome, thanks so much for adding the detailed explanation! Again, going to defer to @martijnwalraven for final approval but the explanation helps a ton.

@martijnwalraven (Contributor)

@bharath2020 Good catch! This looks all right to me, so I think we should merge it.

I'll be working on larger changes to query execution and caching though, and I think we may be able to eliminate this step completely. The only reason the returned records have to be ordered by the queried keys is because that is currently a requirement of the way DataLoader batch loads records. And since I'll be overhauling data loading, this seems like a good time to change that.

There are also a number of other performance issues that I hope to fix by these changes. If you have any time to work on this, it would be great to have more performance tests in place to test the changes against. I recently added a first test case for parsing, but we have no realistic tests of data loading. And to be honest, 0.8 seconds still seems pretty high to load 200 records from a local database, so I suspect there is more we can do.
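
As a rough illustration of the shape such a test could take (this is not Apollo test code; the record type, key format, and sizes are stand-ins, and a real test would seed a SQLiteNormalizedCache fixture and call loadRecords instead), an XCTest measure block over the key-ordering step might look like:

import Foundation
import XCTest

final class LoadRecordsOrderingPerformanceTests: XCTestCase {
    // Stand-in for Apollo's Record; only the key matters here.
    struct StubRecord { let key: String }

    func testOrderingManyRecordsByLongKeys() {
        let argument = UUID().uuidString
        let records = (0..<200).map { index in
            StubRecord(key: "QUERY_ROOT.searchEmployees(organizationId: \"\(argument)\").employee\(index)")
        }
        let keys = records.map { $0.key }.shuffled()

        // measure reports the average time; a baseline can be recorded in Xcode
        // so that future regressions fail the test.
        measure {
            let indexByKey = Dictionary(uniqueKeysWithValues: records.enumerated().map { ($1.key, $0) })
            let ordered: [StubRecord?] = keys.map { key in indexByKey[key].map { records[$0] } }
            XCTAssertEqual(ordered.count, keys.count)
        }
    }
}

A more realistic version would seed an on-disk store with a few thousand records and measure loadRecords end to end, which would also capture the SQLite and record-deserialization costs that this microbenchmark leaves out.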

@bharath2020 (Contributor, Author) commented Nov 17, 2020

> @bharath2020 Good catch! This looks all right to me, so I think we should merge it.

Thanks.

> There are also a number of other performance issues that I hope to fix by these changes. If you have any time to work on this, it would be great to have more performance tests in place to test the changes against.

I will see what I can do.

> And to be honest, 0.8 seconds still seems pretty high to load 200 records from a local database, so I suspect there is more we can do.

Definitely, 0.8 seconds is still too high. I did follow up after the fix to see where the time is being spent, but at this point it is still not clear. I will keep looking into it actively, as it is something of a blocker for us.

@bharath2020 (Contributor, Author)

@designatednerd Looks like this PR is good to merge. Let me know if there is anything else I need to take care of. Thanks.

@designatednerd (Contributor)

Waiting on last looks from @martijnwalraven - once I merge this I'll work on cutting a new version; I think it's time.
