Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable cache in ApolloServer when using RESTDataSource #1562

Closed
sharathm89 opened this issue Aug 22, 2018 · 33 comments · Fixed by osstotalsoft/generator-graphql-rocket#51
Closed

Comments

@sharathm89
Copy link

sharathm89 commented Aug 22, 2018

I am using RESTDataSource to fetch data from REST api's but data is being cached.

How to disable cache since data keeps changing in thequery resolver.

Below is the node.js code.

   const app = express();
    
    const server = new ApolloServer({
      schema: schema,
      context: ({ req }) => ({
        token: req.headers.authorization,
      }),
      dataSources: () => apiDataSources,
    });
    
    server.applyMiddleware({ app });
    
    app.listen({ port: 4000 }, () => {
      console.log("Server is at http://localhost:4000/graphql");
    });
@timkendall
Copy link

timkendall commented Aug 24, 2018

RESTDatasource will by default cache API responses based on their Cache-Control HTTP header. You can also manually override the set ttl like so in your RESTDatasource method:

public getUser() {
    // This overrides that behavior and caches user responses for 60 seconds
    // See https://github.com/apollographql/apollo-server/issues/1460
    this.get('/v1/user', null, { cacheOptions: { ttl: 60 } })
  }

Hope this helps!

@sharathm89
Copy link
Author

sharathm89 commented Aug 26, 2018

@timkendall thanks for your reply...

The above getUser() function is part of RESTDatasource class so during api call do we need to specify ttl

@timkendall
Copy link

It's up to you. Again ideally the REST API that you're hitting would respond with correct Cache-Control headers.

@lmx2k5
Copy link

lmx2k5 commented Oct 3, 2018

@timkendall I tried both setting ttl:0, ttl:1 and setting cache-control to "no-cache, no-store, must-revalidate" but it caches anyway indefinately (until server is restarted)

@dragonfriend0013
Copy link

dragonfriend0013 commented Oct 5, 2018

I have this same issue. I have enabled memcached for my datasource queries. Running the query a second time does not even try to query memcached. tested using nettop to see network connections. casuing major issues with application. LRU cache should be flushed after request is fulfilled, or not used if another cache is enabled.

Edit: after doing some debugging, RESTDataSource uses memoizedResults (in-memory Map caching) and prioritizes that over anything else. It does not ever delete this cache, so will never clear that memory. I am working to see if I can add code so that at the end of a request, this is destroyed.

@dragonfriend0013
Copy link

dragonfriend0013 commented Oct 5, 2018

@timkendall @martijnwalraven @lmx2k5
ok, i have solved the issue. basicially i added inside each of my datasources:

memClear() {
		return this.memoizedResults.clear();
}

and i am using the formatResponse for the server to loop through all of my datasources and calling that function. This way I know my GQL query is complete and memory is cleared before the next query is processed. if anyone has a better solution, i am all game for it.

@martijnwalraven
Copy link
Contributor

@dragonfriend0013 How are you configuring your data sources? They have been designed to be initialized per request, which is why dataSources takes a function. If they somehow stick around, the memoization cache won't be cleared.

@dragonfriend0013
Copy link

@martijnwalraven
I was creating an object with all of my sources and passing that in the server. I see now that it was reusing the same objects. I have changed it to add them individually in the server object.

@dragonfriend0013
Copy link

dragonfriend0013 commented Oct 12, 2018

@martijnwalraven
Now i have a slightly different issue. Even though i have a response in my memcached, how to i tell the get to not use the cached response and get from the data source directly. I have tried sending in the cache-control header on my GQL request, as well as on my rest datasource request (using willSendRequest) and it still hits the cache. I also tried setting the cacheOptions.tll to 0.

@anishsm
Copy link

anishsm commented Dec 13, 2018

memClear() {
		return this.memoizedResults.clear();
}

I tried this approach for the time being. but it is a hacky-wacky way to do it..
Any idea if this will be fixed or a parameter offered to control the behaviour of memoizedResults

@eberhara
Copy link
Contributor

We have a data source that should't cache the results and we solved by:

import { HTTPCache, RESTDataSource } from 'apollo-datasource-rest'
export default class NoCacheDataSource extends RESTDataSource {
  constructor() {
    super()
    this.baseURL = someHost
  }

  initialize({ context }) {
    this.context = context
    this.httpCache = new HTTPCache()
  }
}

Every call instantiates a new HTTPCache() which uses an in-memory cache by default.
So we basically have a per request in-memory cache in this case.

@DanielHoffmann
Copy link

ApolloServer cache option is not properly documented either:
https://www.apollographql.com/docs/apollo-server/api/apollo-server.html

I assumed that not setting it would not add any caching to my API

I also confirm that setting ttl: 0 is not working, also there is no actual API docs for apollo-datasource-rest. Checking the source I found out that cacheOptions can be a function receives Response and Request as parameters), for example:

this.get(url, null {
   cacheOptions: (response, request) => { // note that response is the first argument
      return { ttl: 0 }
   }
})

Unfortunately, manipulating the response here to remove expires or cache-control headers to prevent caching doesn't work either.

@DanielHoffmann
Copy link

Okay I found a way to completely disable the apollo-rest-datasource cache:

class KambiDataSource extends RESTDataSource {
    constructor() {
        super();
        this.cacheKey = 0;
    }

    cacheKeyFor(request) {
        return String(this.cacheKey++);
    }
   ...

this will create a unique cache key for each request. Ugly hack as that data will still be in memory so be careful if you use it

@Grimones
Copy link

Hi all
Can confirm that ttl: 0 isn't working
You can use ttl: -1 and it should be okay

The reason is that there are simple var checks on truthiness, for example
if (ttl)
but 0 is falsy which leads to wrong behavior here

let ttl = ttlOverride || Math.round(policy.timeToLive() / 1000);

and here
(ttlOverride && policy.age() < ttlOverride) ||

The fix should be checking ttl and ttlOverride at that lines for undefined

let ttl = ttlOverride !== undefined ? ttlOverride : Math.round(policy.timeToLive() / 1000);
(ttlOverride !== undefined && policy.age() < ttlOverride)

I'm happy to open PR if that's the real case

@DanielHoffmann
Copy link

@Grimones ttl: -1 is not working either, you can tell by defining your cache options like this:

{
        cacheOptions: (response, request) => {
            return {
                ttl: -1
            };
        }
}

put a breakpoint or console.log in the callback, whenever it triggers it means Apollo made a request instead of getting the data from the cache

The problem seems to be in RESTDataSource.ts:
https://github.com/apollographql/apollo-server/blob/master/packages/apollo-datasource-rest/src/RESTDataSource.ts#L238-L256

the Promise from this.httpCache.fetch is always memoized into this.memoizedResults Map for GET requests, disregarding the ttl. I don't understand the purpose of this.memoizedResults in that file, isn't the cache supposed to be handled in HTTPCache.ts only? RESTDataSource.ts is not supposed to care about cacheOptions or ttl, only repass it to HTTPCache.ts

@Grimones
Copy link

Grimones commented Feb 15, 2019

@DanielHoffmann but isn't it the idea of ttl: 0 to prevent from caching ignoring the headers which was in response? When ttl is set to 0 it doesn't cache the response thus it makes a request

Regarding this.memoizedResults memoization is different from caching.

Oh, I think I get your point.
If we haven't got a ttl set, and we have made a request - that request is being memoized and the caching is whatever the response from api returned.
If then we put a ttl, those ttl would be ignored because the request is used from the this.memoizedResuls

Maybe there is a point to add the ttl value to cache key?

edit
However checking ttl and ttlOverride has issues. It should be checking it against undefined

@timbakkum
Copy link

timbakkum commented Feb 17, 2019

@dragonfriend0013 How are you configuring your data sources? They have been designed to be initialized per request, which is why dataSources takes a function. If they somehow stick around, the memoization cache won't be cleared.

@DanielHoffmann I have the same issue and I suspect it comes from using the graphql-modules setup with RESTDataSource (like in the following example: https://graphql-modules.com/docs/guides/data-sources). It likely doesn't configure it properly as suggested by @martijnwalraven in the above comment, but I have to investigate further.

Even though according to my rest api response headers, my call should not be cached, the memoized response is returned in subsequent GraphQL requests and ttl cache option on the get does not affect that.

Caching and memoization could maybe be documented a bit clearer for RESTDataSource as there is (almost) no mention of how it works. If I understand correctly:

  • memoization's caching by RESTDataSource.ts should only work and last per GraphQL query (preventing the same rest request running multiple times per GraphQL query),
  • caching by HTTPCache.ts works if the rest request response headers allow it to be cached and any cacheOptions that you set in your different fetch methods of your RESTDataSource class (allowing you to cache rest requests across multiple GraphQL requests).

@DanielHoffmann
Copy link

@Grimones the memoization happening in RESTDataSource is not taking into account the ttl, that is why the responses are always cached forever. From what I can tell in the code RESTDataSource.ts is supposed to be stateless and HTTPCache is supposed to have all the state/caching

Checking the commit history it seems @martijnwalraven was the one who added the memoization to that file. Could you maybe take a look into this issue? The memoization seems to be holding old responses in memory and returning them instead of checking if they are in the cache

@martijnwalraven
Copy link
Contributor

martijnwalraven commented Feb 19, 2019

@DanielHoffmann While HTTPCache is responsible for long-term caching based on HTTP semantics and TTL, memoization makes sure we don't perform the same REST call twice within the same GraphQL request. This is similar to the deduplication DataLoader offers for example.

How are you instantiating RESTDataSource? A fresh instance of a data source is supposed to be created for every request, which happens automatically if you follow the recommended approach:

const server = new ApolloServer({
  typeDefs,
  dataSources: () => ({
    myAPI: new MyAPI(),
  })
});

@DanielHoffmann
Copy link

@martijnwalraven ugh, you are right, I had a single instance for my data source instead of one per request. Now I understand why you have memoization there (so multiple identical requests fired from the same API call are not repeated). Thanks for the tip!

This is quite an easy error to run into, maybe add some warnings about it? Compare the data sources instances to the previous request data sources instances. I suspect the other people here are running into the same error

@Grimones
Copy link

Also I think it's worth to make a note that you should use function instead of object when passing data sources
https://www.apollographql.com/docs/apollo-server/features/data-sources.html#Accessing-data-sources-from-resolvers

And again problem with ttl = 0 still persist
@DanielHoffmann now when you don't have problems with memoization you can see that 0 ttl isn't working
I'm not really sure that passing zero time to live is better than passing no-cache header
But in my stack I'm using cache control directives from graphql scheme and in resolver I'm getting the max-age value from info and use it when accessing api with RESTDataSource

I think I'll open a PR when will have free time checking ttl against undefined

@Zik42
Copy link

Zik42 commented Mar 3, 2019

What if cacheOptions.ttl, dont do anything?

@MakhBeth
Copy link

MakhBeth commented May 9, 2019

In my case, the issue was not the cache of the RESTDataSource class but a wrong initialization of the dataSources inside the new ApolloServer call:

export async function getApolloServer() {
  const schema = await buildSchema()
  const dataSources = await getDataSources()
  const apolloServer = new ApolloServer({
    schema,
    dataSources:  () => dataSources,
...

This was wrong because the dataSources variable was created once, when the ApolloServer started, so the cache was common to all the requests and not only per one request.

The solution for me was to initialize a new dataSources like written inside the doc

dataSources: () => {
    return {
      moviesAPI: new MoviesAPI(),
      personalizationAPI: new PersonalizationAPI(),
    };
  },

Hope this will help someone :)

@martijnwalraven
Copy link
Contributor

Closing this because the issue seems to be resolved.

@Alex0007
Copy link

{
        cacheOptions: (response, request) => {
            return {
                ttl: -1
            };
        }
}

async properties of response can not be awaited here, right?

@dchambers
Copy link

dchambers commented Aug 7, 2019

I needed a way to programmatically clear the default built-in cache, and was able to with:

this.httpCache.keyValueCache.store.reset()

@mrdulin
Copy link

mrdulin commented Aug 21, 2019

@MakhBeth Yes. Thanks. I have the same issue as you.

@RobertFischer
Copy link

@timkendall #1562 (comment) Is it true that RESTDataSource will cache based on the response's Cache-Control? Where does that occur in the codebase? How is the max-age directive respected? Do we check for the Expires header and respect that?

I would rather have a single RESTDataSource instance for all the requests in my server in order to reduce GC pressure, leverage caching between requests, etc. It seems to me that this caching issue is the only thing preventing me from using it that way.

@dchambers
Copy link

dchambers commented Aug 22, 2019

@RobertFischer, as you can see here it defers to the http-cache-semantics library. I wrote some unit tests around 'apollo-server' and was able to confirm that it respected both max-age and s-maxAge, though I never bothered to test Expires, but it claims that it's implemented.

What I did discover is that it doesn't fully support the Vary header out of the box (it only caches one variation at a time), so I wrote a VaryCachingRESTDataSource you can extend from if you need that.

@nharlow89
Copy link

I needed a way to programmatically clear the default built-in cache, and was able to with:

this.httpCache.keyValueCache.store.reset()

I was forced to do this as well. RESTDataSource seems to need slightly more control over the underlying cache. Several instances where one API call needs to invalidate the cache of another endpoint and there does not seem to be a clean way of doing this that I am aware of. Ultimately my solution was:

this.memoizedResults.delete(cacheKey)
this.httpCache.keyValueCache.delete(cacheKey)

@wemakeweb
Copy link

wemakeweb commented Jun 10, 2020

Generell Solution for @nharlow89 answer:

class NoCacheSource extends RESTDataSource {
  didReceiveResponse(response: Response, request: Request) {
    this.memoizedResults.delete(this.cacheKeyFor(request));
    return super.didReceiveResponse(response, request);
  }
}

But still pain in the ass.

@estevaolucas
Copy link

It's important to remove the cache for error responses as well, otherwise you will experience unexpected results.

class NoCacheSource extends RESTDataSource {
  deleteCacheForRequest(request: Request) {
    this.memoizedResults.delete(this.cacheKeyFor(request));
  }

  didReceiveResponse(response: Response, request: Request) {
    this.deleteCacheForRequest(request);
    return super.didReceiveResponse(response, request);
  }

  didEncounterError(error: Error, request: Request) {
    this.deleteCacheForRequest(request);
    return super.didEncounterError(error, request);
  }
}

@dgrichevsky
Copy link

It's important to remove the cache for error responses as well, otherwise you will experience unexpected results.

class NoCacheSource extends RESTDataSource {
  deleteCacheForRequest(request: Request) {
    this.memoizedResults.delete(this.cacheKeyFor(request));
  }

  didReceiveResponse(response: Response, request: Request) {
    this.deleteCacheForRequest(request);
    return super.didReceiveResponse(response, request);
  }

  didEncounterError(error: Error, request: Request) {
    this.deleteCacheForRequest(request);
    return super.didEncounterError(error, request);
  }
}

this worked for me!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.