Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Query watcher not being called when cache is updated on an element by another query/subscription/mutation #1422

Closed
benoitletondor opened this issue Sep 30, 2020 · 38 comments · Fixed by #1889
Labels
bug Generally incorrect behavior caching

Comments

@benoitletondor
Copy link

Bug report

Query watcher not being called when cache is updated on an element of a collection that is added after calling watch

Versions

Please fill in the versions you're currently using:

  • apollo-ios SDK version: 0.34.0
  • Xcode version: 12.0.1
  • Swift version: 5.3

Steps to reproduce

I've been trying a lot of things on that one and wasn't able to find a way to fix my problem. So in my app I'm having a list of conversations, so I'm creating a watch on a query that returns the list of conversations, this is something like that:

user {
    uuid
    firstName
    lastName
    conversations {
        uuid
        unreadMessagesCount
    }
}

In the app we use the uuid key to handle cache, so we make sure to always pass uuid in our queries to automatically handle cache update.

So the role of the watcher I'm talking about is both to update existing conversations and also be able to catch when a new conversation is created (it can happen and not be initiated by the user, we then trigger an event from the backend that is listen from a subscription in the app). When this event happens it returns something like that

event {
     newConversation {
          conversation {
               uuid
               unreadMessagesCount
               // This is the part that adds the new conversation to the existing ones of the users in the cache
               user {
                    uuid
                    conversations {
                          uuid
                          unreadMessagesCount
                    }
               } 
          }
     }
}

It works great, meaning that when this event happens, the watcher is being called with the newly created conversation, but the issue is that any new cache update for that specific conversation doesn't trigger the watcher again.

After investigating a bit, I realised that the cache is being updated because if I'm adding 3 new messages into the new conversation (setting the unreadMessagesCount to 3), the watcher doesn't get called but then if I add 1 new message into an old one, the watcher is being called with both the new message on the old conversation and the 3 new on the new one.

So it really seems like watch is not being called again for changes on an item that wasn't in a collection when the watch was initially made. I've also take a look at #281 and making a fetch on the same query again after the event doesn't fix the issue.

Let me know if I'm not clear on something as the whole thing is a bit complicated to explain.

@benoitletondor
Copy link
Author

To add more to this, I feel like this is not related to collections only. If you have 1 query that watches something, then another query/mutation updates this cache indirectly (not using the same query, but another query that contains the same element with the same uuid), it doesn't work.

@benoitletondor benoitletondor changed the title Query watcher not being called when cache is updated on an element of a collection that is added after calling watch Query watcher not being called when cache is updated on an element by another query/subscription/mutation Sep 30, 2020
@designatednerd
Copy link
Contributor

Do you have any thoughts on why the test that was added would be passing, but what you're working with would not be working?

@benoitletondor
Copy link
Author

I'm afraid no, but all I can tell is that it works with 0.32.0 and not with 0.34.0

@Narayane
Copy link

Narayane commented Oct 13, 2020

Hi @designatednerd,

I think I have the same problem but not on a collection. My watcher is not triggered anymore since I have updated Apollo from 0.31.0 to 0.34.1. I have a watcher in cachePolicy .returnCacheDataAndFetch on this Query.

query GetRide($rideId: ID!) {
    ride(id: $rideId) {
        ...EndRideFragment
    }
}

And I have a mutation that updates my cache and normally triggers my watcher.

mutation StopRide($input: StopActiveRideInput!) {
    stopActiveRide(input: $input) {
        ride {
            ...EndRideFragment
        }
    }
}

Since my Apollo pod upgrade, not working anymore. My watcher is not triggered when mutation payload updates my cache.

@dhritzkiv
Copy link
Contributor

dhritzkiv commented Oct 14, 2020

This is also occurring for me. 0.33.0 (non beta) worked; but not 0.34.0 or 0.34.1

@designatednerd
Copy link
Contributor

@dhritzkiv That makes sense, the network stack and how it interacts with the cache changed completely in 0.34.0.

Apologies, I've been under the weather. I'm gonna do some digging starting tomorrow to try to figure out how I broke this without breaking the tests 🙃

@designatednerd designatednerd added bug Generally incorrect behavior caching labels Oct 15, 2020
@dhritzkiv
Copy link
Contributor

No rush from me! I was able to roll back to 0.33.0 no problem

@Narayane
Copy link

Narayane commented Oct 15, 2020

I can wait 1 week to have feedback from you @designatednerd on this issue.
If you need more time to investigate or to solve it, I could roll back to 0.33.0 to avoid blocking me in production

@benoitletondor
Copy link
Author

Same here, I rollbacked to 0.32.1 so no rush from me, and thanks for taking time investigating this!

@martijnwalraven
Copy link
Contributor

martijnwalraven commented Oct 22, 2020

I'm trying to reproduce this issue to investigate what could be going on, but it's been hard because unfortunately our tests aren't set up to run in real life conditions, where concurrency may play a role. The existing watcher tests all continue to pass, and so do some new ones I've been adding.

It would be helpful to learn more about the way people who experience this are using the framework, and what behavior they are seeing:

  • How are you initializing and configuring the client, cache, store and network transport?
  • Are you calling methods on ApolloClient from the main thread or from a background thread? Do you receive results on the main queue as well, or are you passing in a custom queue?
  • Does the issue occur 100% of the time, or only sometimes?
  • Are there other operations in flight when this happens, or does it also happen with just a single watcher and a single related query or mutation?
  • Anything else unusual or noticeable?

I know this is a long shot, but it would be even more helpful if anyone was able to share a project that reliably exhibits this issue.

@martijnwalraven
Copy link
Contributor

martijnwalraven commented Oct 27, 2020

A few more thoughts after taking a closer look at this:

  1. It seems any operations using the WebSocketTransport always ignore the cache completely starting in 0.33 (I believe 444c465 is where this change was introduced). That means subscription results are not actually published to the store, and thus will not trigger watchers. (@benoitletondor I think that would at least explain your initial bug report).

  2. That doesn't yet explain why queries and mutations would also fail to trigger watchers (assuming those are not using the WebSocketTransport). There is a possibility for misconfiguration however, which would lead to inadvertently having multiple stores (see Mismatched stores for ApolloClient/LegacyInterceptorProvider #1438). Could some of you be bitten by this maybe?

  3. Using a custom NetworkTransport also currently ignores the cache completely (see Add helper to deal with parsing and caching raw data in custom NetworkTransport instances #1442 for an in progress PR to work around this). Any chance this could explain the behavior some of you are seeing?

@dhritzkiv
Copy link
Contributor

I'm not using WebSocketTransport, but 3. sound like it could be what's plaguing us! Will have some time in two weeks to poke around that

@Narayane
Copy link

Narayane commented Oct 28, 2020

@martijnwalraven I guess 3. could be an explanation in my case, I have a custom NetworkTransport implementation (http request headers additions, checks on http response)

public func send<Operation: GraphQLOperation>(operation: Operation, completionHandler: @escaping (_ result: Result<GraphQLResponse<Operation.Data>, Error>) -> Void) -> Cancellable in 0.33

becomes

public func send<Operation>(operation: Operation, cachePolicy: CachePolicy, contextIdentifier: UUID?, callbackQueue: DispatchQueue, completionHandler: @escaping (Result<GraphQLResult<Operation.Data>, Error>) -> Void) -> Cancellable where Operation : GraphQLOperation in 0.34

I need to change

do {
    let body = try self.serializationFormat.deserialize(data: data) as! JSONObject
    let response = GraphQLResponse(operation: operation, body: body)
    completionHandler(.success(response))
} catch { }

into

do {
    let body = try self.serializationFormat.deserialize(data: data) as! JSONObject
    let response = GraphQLResponse(operation: operation, body: body)
    let result = try response.parseResultFast()
    completionHandler(.success(result))
} catch { }

to try to conform my code to the new signature

@Narayane
Copy link

@designatednerd @martijnwalraven any progress?

@designatednerd
Copy link
Contributor

not yet - also, be aware that parseResultFast doesn't hit the cache, so you'd probably need to use the completion-closure based parsing.

@Narayane
Copy link

What?
A code example could save my day :)

@designatednerd
Copy link
Contributor

@Narayane @benoitletondor @dhritzkiv I think this should be resolved by 0.37.0. Could y'all give that a try and let me know if it's still an issue on that version?

@dhritzkiv
Copy link
Contributor

@designatednerd A preliminary run using 0.37.0: it looks like the cache+watchers work as expected! Thanks. :)

@Narayane
Copy link

Narayane commented Nov 25, 2020

@designatednerd Still an issue in my case, watchers are not triggered in 0.37.0 with the code below in my custom NetworkTransport.

do {
    let body = try self.serializationFormat.deserialize(data: data) as! JSONObject
    let response = GraphQLResponse(operation: operation, body: body)
    response.parseResultWithCompletion(completion: { result in
         switch result {
         case .success(let gqlResult):
              completionHandler(.success(gqlResult.0))
         case .failure(let error):
              completionHandler(.failure(LRError.clientError(error.localizedDescription)))
         }
     })
} catch { }

@designatednerd
Copy link
Contributor

@Narayane Yeah, the caching on custom network transports is basically broken at the moment, so that would make sense. The current workaround is to write a custom interceptor for your network usage similar to the default Network interceptor but using your own networking.

I'm working on a longer-term fix but this is something that was missed through the RFC and Beta processes so there's a decent amount of rethinking that needs to be done there, unfortunately 😭

@Narayane
Copy link

Narayane commented Jan 8, 2021

Hi @designatednerd,

any progress on my problem?

It begins to be disturbing to be stuck in v0.33 since v0.39 is the current...

@designatednerd
Copy link
Contributor

At this point the recommended workaround remains creating your own custom network interceptor and using the RequestChainNetworkTransport rather than creating your own network transport - this should work in almost all cases, I would love to hear any cases where it doesn't so I can fix whatever needs to be fixed to make that work.

This is definitely something that needs to be revisited but at this point i think the workaround is sufficient that it'll be better long-term to do it as part of enhancements to the chain system that will make it easier to use this stuff for all network requests (including, hopefully, web sockets).

@Narayane
Copy link

Narayane commented Jan 29, 2021

@designatednerd Ok, I don't have time for the moment but I would try.

Do you know any example I could follow to help me?

@ketenshi
Copy link
Contributor

ketenshi commented Feb 1, 2021

Hi all,

Is there a recommended way to deal with the subscriptions from the WebSocketTransport not publishing to the store?

@designatednerd
Copy link
Contributor

@Narayane I don't have an example, but if you look at how the default network intercept provider is set up, you'll see it's fairly simple.

@ketenshi That's a more complicated issue, unfortunately - we don't presently have a recommended workaround but i'm working on some thoughts around it.

@Narayane
Copy link

Narayane commented Feb 15, 2021

@designatednerd

In my current implementation of NetworkTransport:

let task = self.session.dataTask(with: request) { (data: Data?, response: URLResponse?, error: Error?) in

I could make out cases like not an httpResponse (client or server timeout) guard let httpResponse = response as? HTTPURLResponse, no data guard let data = data and all http response status codes (2xx, 4xx, 5xx, ...) to determine precisely what kind of error it was.

https://github.com/apollographql/apollo-ios/blob/main/Sources/Apollo/NetworkFetchInterceptor.swift

let task = self.client.sendRequest(urlRequest) { [weak self] result in
  switch result {
      case .failure(let error):
      case .success(let (data, httpResponse)):
  }
}

I guess I can make out anymore now, with this code above, not an httpResponse and no data cases, right? If yes, it is a problem for me because I need to retry request in some specific cases considering returned error type. Can you enlighten me?

@designatednerd
Copy link
Contributor

If you're using our URLSessionClient, it will return a specific error if no HTTPURLResponse is returned (the TaskData object does the work of checking if it's an HTTPURLResponse.

Data will always be returned since we need to instantiate the data to keep appending to it on the TaskData, but you can check if it's empty.

HTTPURLResponse has a statusCode property you can check for the exact HTTP status code returned.

Does that help?

@Narayane
Copy link

Narayane commented Feb 15, 2021

I don't know if I am currently using an URLSessionClient. In my code, I just instantiate an ApolloClient with an ApolloStore and a custom NetworkTransport.

self.transport = LRNetworkTransport(url: endpoint)
do {
    let cache = try SQLiteNormalizedCache(fileURL: cacheFilePath)
    self.store = ApolloStore(cache: cache)
    self.client = ApolloClient(networkTransport: self.transport, store: self.store!)
} catch {}

Your other answers help me, yes.

@designatednerd
Copy link
Contributor

You are not if you're providing the alternate network transport.

@Narayane
Copy link

Narayane commented Feb 16, 2021

Is it ok if I follow this to make evolve the way I build my custom Apollo client for resolving my watchers trigger problems with v0.34+?

@designatednerd
Copy link
Contributor

I'm not quite sure what the question is here - that is currently the recommended setup. Can you clarify a bit what the question is?

@Narayane
Copy link

Narayane commented Feb 17, 2021

Hi @designatednerd,

Seems to solve my watchers trigger problems with v0.41.0, I just had the same problem than him with custom InterceptorProvider.

Thanks for your support :)

@benoitletondor
Copy link
Author

@designatednerd

Taking the occasion to jump again on the conversation. Please let me know when you have news around the WebSocketTransport not publishing into the store bug, this is the last one preventing us from using the latest versions.

Many thanks!

@designatednerd
Copy link
Contributor

Planning on tackling this soon!

@tomasyany
Copy link

@designatednerd Do you have any updates on this issue?

@designatednerd
Copy link
Contributor

Thanks for the poke! Right now the main update is that we've had to roll back Starscream to an older version due to some crash issues a bunch of people were experiencing.

We're looking at how to address this hole (and others in the current request chain implementation) in a more long-term way right now, but I don't have an ETA.

Apologies for the back and forth on this. We've got another dev who's getting ramped up now, so I'm hoping we'll have a much better process and predictability around all this soon.

@tgyhlsb
Copy link
Contributor

tgyhlsb commented Aug 2, 2021

After investigation, the issue is more about the type of transport, aka WebsocketTransport rather than the type of operation. I've opened a PR with a solution.

@kroussevrb
Copy link

@designatednerd what should be the difference between our own NetworkFetchInterceptor and the one in Apollo to make this work?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Generally incorrect behavior caching
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants