blog/website: choose the new hosting #992
@fabiosantoscode please, take a look. Let's discuss it here.
I agree with you that the holy grail is to have our API be just a bunch of static data updated by a cron. It's similar to the architecture I left behind at a paper, where events would wake up a lambda function which rendered a React page and pushed the result to S3. This was called the reactive manifesto back in the day.

However, in our situation, where DX is key, I don't really want to write and maintain a bunch of scripts to support a per-PR environment and work differently in production. Ideally we don't have any code handling this. The reason is that both websites are so static that it's not really worth it to even have a load balancer; a single server with a memcached instance could certainly cut the mustard when you hide it behind Cloudflare or another caching CDN.

I talked a lot with @shcheklein about this yesterday, and one thing we touched on several times was running workers at the edge. However, I've given it more thought, and I don't think we need this capability if our API is made up of mostly static things. Instead, it can just slowly spread across the CDN and be served from the edge. It's OK even if it takes a few minutes to update; if it's not OK, we can invalidate the cache.

Heroku is pretty flexible. It doesn't have functions at the edge (that I know of), but I don't think we need them. It also has a nice DX.

My proposal: I think we can use just Heroku. We put a CDN in front of it for production; I don't think it matters which (Cloudflare or CloudFront). We stay away from edge functions or pushing to the CDN, and for simplicity allow it to fetch from our poor server. If we find that this is not enough, we can start pushing to the edge: Cloudflare allows for this with Workers + KV, and CloudFront allows it by having an S3 bucket as an origin and pushing to it.

We will need a memory cache store, because we will need to preserve the content and etags we get from the GitHub API.
The reason we store the etags as well is that if we make a request to GitHub with an If-None-Match header carrying a previously seen etag, GitHub can answer 304 Not Modified, and conditional requests that return 304 don't count against the API rate limit. There's no reason we can't serve our gatsby files from an express server.
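A minimal sketch of that etag flow, assuming an injected HTTP client and an in-memory `Map` as the store (all names here are illustrative, not code from this repo):

```javascript
// Hypothetical helper: keep GitHub responses plus their etags in a memory
// cache, and revalidate with If-None-Match on subsequent requests.
// `fetchFn` is injected (e.g. node-fetch in production) to keep this testable.
async function fetchWithEtag(url, cache, fetchFn) {
  const cached = cache.get(url);
  const headers = {};
  if (cached) headers['If-None-Match'] = cached.etag;

  const res = await fetchFn(url, { headers });
  if (res.status === 304 && cached) {
    // Nothing changed upstream: reuse the stored body, no rate-limit cost.
    return cached.body;
  }
  const body = await res.json();
  cache.set(url, { etag: res.headers.get('etag'), body });
  return body;
}
```

In production `fetchFn` would be a real HTTP client; injecting it keeps the revalidation logic independent of any network library.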
So this way, we can have a local development server. We will still have the flexibility of setting nice cache headers, which are read both by the browser and by Cloudflare (as well as your typical run-of-the-mill cache server or CDN). I guess this is not in scope, but just in case: for dynamic page content like comments, we have a few options:
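To illustrate the cache-header point: browsers honor `max-age` while shared caches (Cloudflare, Varnish, and similar) honor `s-maxage`, so one small helper can set different lifetimes for each. This is a hedged sketch; the helper name and parameters are made up, not project code:

```javascript
// Build a Cache-Control value with separate lifetimes for browsers
// (max-age) and shared caches such as a CDN (s-maxage).
function cacheControl({ browserSeconds, cdnSeconds }) {
  const parts = ['public'];
  if (browserSeconds != null) parts.push(`max-age=${browserSeconds}`);
  if (cdnSeconds != null) parts.push(`s-maxage=${cdnSeconds}`);
  return parts.join(', ');
}

// In an express handler it might be used like:
//   res.set('Cache-Control', cacheControl({ browserSeconds: 60, cdnSeconds: 900 }));
```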
@fabiosantoscode thanks, good summary and good points. What about the CD part if we stick to Heroku? We need a fast way to build and deliver Gatsby to it. What do you think about deploying to both: Heroku (APIs + cache) and Netlify (static stuff)?
Just a few comments from me for now: On redirects, Netlify seems to have this built in 🙂 https://docs.netlify.com/routing/redirects/
I think this is pretty important for the docs process (for local sanity checks). Demo stands also help, but it's faster, easier, and cheaper to run the site locally.
MD files in public/static/docs/ are the ones that change the most. Very often. (But we don't serve them directly to users atm; the web app "proxies" them.)
ESI is meant more for personalization, I think, i.e. if we had user logins. If this is probable in the future roadmap, then I think it's a good option to incorporate now. In general, I just have the impression we may be trying to use too many technologies. Could it all be plain Gatsby (whatever that means), or all microservices, or all serverless functions + CDN?
I say we take advantage of Heroku's built-in offering. They also have a cache folder where we can cache the gatsby folder and the public folder, so that things like generating thumbnails are done incrementally.

Remember that there will always be a build step. Locally (on an Intel i7) a build on a hot cache takes 19 seconds. The real bottleneck will be saving and restoring the cache, so the more images we have, the slower the build will be. We can try to tackle this in the future when it becomes a problem. It will be complicated, yet unavoidable.

Deploying to both Netlify and Heroku would kind of kill the development experience of sending a PR and getting a nice link to a temporary environment: we would have to find a way to connect the Netlify environment to the Heroku API. Additionally, Netlify is running a node process anyways :)
You make a good point here. I had another look and found this middleware from Gatsby. We can use it to embed Gatsby's logic into our express server. This is pretty close to plain Gatsby. It could get closer if there were a way to use middleware in production like it's possible in dev. They might not allow middleware in production because they mean for Gatsby sites to be statically hosted.

I would like to argue, though, that having a single node process with express do everything, with a CDN on top of it for DDoS protection, speed, and edge caching, shouldn't be too many technologies. Even if it's under the hood, this is mostly what we have now.

After having written all of this, I think we have a real option to kick the can down the road by continuing to use Netlify (depends on the price of their CDN, though). We would still write an express server to serve Gatsby and our APIs, but it wouldn't have any memory cache. It wouldn't be ideal, but I have no reason to believe it wouldn't be webscale.
@jorgeorpinel it's not the case with Gatsby - we serve pre-built static HTMLs that include processed MD in them
@jorgeorpinel I don't think that is flexible enough. I would prefer to keep the redirects logic that we have - including tests, etc.
@fabiosantoscode for some reason I had the idea that this is not possible with Netlify: running your own server with an in-memory cache that serves APIs externally. Heroku alone sounds like a good option (plus some CDN-level cache like Cloudflare). Obviously with some CD (if Heroku can do it, fine; if not, Gatsby as a business has something?). And we have all the flexibility we need, up to having databases if needed. This solution should be very simple to deploy, runs locally, has previews, and edge caching is done by Cloudflare... any real downsides to this? cc @iAdramelk @jorgeorpinel @fabiosantoscode
Sorry for the long answer, guys. I think we are overengineering it a little. In a perfect world I would prefer not to have our own Express server at all:
The only problem that we have with the static approach is hosting and updating API functions and caching their results. And, for example, Netlify allows us to solve this as well, using Netlify Functions. Here is an example of using Netlify Functions to fetch a remote API. It's not that different from our current API implementation, and it can be deployed and updated as part of our normal deploy process to Netlify. I'm not sure that going with Netlify is the best option, because I'm not sure that we can optimize our build time on Netlify, and I'm not sure from the get-go how to cache results of such serverless functions between calls, so I'd like to check other options too, like Heroku, now.sh, etc. But ideally, here is what I would like to have as the result:
For local development, we can either mock these functions or use already deployed ones. I doubt that we will be updating them this often. We can even place them in another repository and deploy them separately.
What do you mean by manually? What benefits do you see in not using express or something else in front of Gatsby?
most likely Cloudflare/Netlify do proper headers already?
I doubt the Netlify redirects config is flexible enough to handle what we need. Probably Heroku's is the same, but I haven't checked (though if we go all static, Heroku does not make much sense anyway).
I like it, but it feels like it might complicate the workflow, deployment, and local experience... would love to try it before we jump into this.
Sounds like a complicated setup to me. Would love to see something like
Agree, that was my point too 🙂 In general I also lean toward keeping things as static as possible, with built-in redirects so we don't need our custom module for that. Built-in redirects probably have much better load capacity, for example. p.s. from what I read, Netlify
What about markdown files? I'm still confused about this part, but probably when the Gatsby migration is ready and I get to see it, it'll be clearer, so no need to answer this Q. Let's just keep in mind that we change MD files very often.
We could just have the API as a separate node app. I checked pages/api/** and it seems totally stand-alone anyway. (This way it's also possible in the future to pass the API through an authentication/rate-limiting gateway, e.g. KongHQ, if ever needed.) The serverless approach also works, but maybe it's easier to maintain as a regular app, to keep the same deploy process and reduce system complexity? Agree with Ivan here.
p.s. this issue is kind of long; it would be great to summarize the options. I'd do it, but I'm not sure I understand every comment completely.
My main concern is local development. Using a server in front of the static folder in prod is not a problem at all. But if we use Express locally, we will need to run it alongside the gatsby dev server on separate ports, and we will need to proxy calls from one port to the other. There is also a problem that the port is hard-coded in the resulting HTML, so we would need to somehow update ports in the code that the gatsby server generates while it still runs on the original port. I didn't research this topic in depth, and it's possible that there is an existing plugin for that or that this is easy to configure. But if not, we will need to write and maintain a lot of our own code for that instead of just starting the default gatsby dev server with the standard command.
That's my point: we don't need a server for that. We just need to create unique names, and static hosting will do the rest. But with our own server on Heroku, we will need to do it ourselves, if I understand correctly.
Do you have examples of redirects that you think we would not be able to implement? I had a quick look at the docs, and I think that everything we have in the redirects list can be expressed there.
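For context, the custom redirects module under discussion boils down to regex matching along these lines (the entries below are invented examples, not the real redirects-list.json contents):

```javascript
// Hedged sketch of applying a regex-based redirects list in a custom server.
// [pattern, replacement, status] triples; entries are illustrative only.
const redirects = [
  [/^\/docs\/(.*)$/, '/doc/$1', 301],
  [/^\/help\/?$/, '/support', 302]
];

// Return the redirect target and status for a path, or null if none matches.
function matchRedirect(pathname) {
  for (const [re, target, status] of redirects) {
    if (re.test(pathname)) {
      return { location: pathname.replace(re, target), status };
    }
  }
  return null;
}
```

The question above is essentially whether Netlify's declarative `_redirects` syntax can express every pattern a list like this handles.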
Well, if we update them often, then yes, but I think that we'll probably just push them once and then forget about them for a year or so. This way we can just use global URLs for local development.
It's a little more complicated than I would like, yes. But I think this is a choice between this and the problems with the local server above. Not sure which is better to implement.
It's not a problem. We can automatically optimize them and update their paths with gatsby, we already are doing it in the blog.
My main concern here is running it alongside the gatsby dev server (see my answer to Ivan above).
Looks like we can't optimize the build time. I've given it a try here: https://github.com/iterative/blog/pull/115. We bust through the Netlify cache limits, even without caching image processing (which is our biggest bottleneck, I think). As for caching the results of the serverless functions: if we set cache-control and expires headers, the CDN/cache will take care of it, as well as the browser.
now.sh is a real contender, I feel. The serverless functions in now.sh, as expected, are cached on their end if you use a cache-control header (scroll down to "serverless functions"). I think we shouldn't get too hung up on server-side caching in any of these solutions: basically all of them respect the cache-control header. The header is not only meant for browsers, but for any kind of proxy as well.
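As a sketch of what such a cached serverless function could look like: the handler shape follows the Node `(req, res)` lambda convention used by these platforms, but the data and values here are placeholders, not real project code.

```javascript
// Illustrative serverless function: the platform's CDN can cache the
// response because of the Cache-Control header it returns.
const handler = async (req, res) => {
  const data = { stars: 42 }; // stand-in for a cached GitHub/Discourse call
  // s-maxage lets the shared cache keep it for 15 min;
  // max-age=0 makes browsers revalidate every time.
  res.setHeader('Cache-Control', 's-maxage=900, max-age=0');
  res.end(JSON.stringify(data));
};

module.exports = handler;
```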
I've looked into Heroku, and they limit your build cache to 500 MB. I did a small test with now.sh for the blog (changed 2 lines in package.json); here it is: https://blog-fihp2x2rk.now.sh/ Here's a function with a 60-second cache using a cache-control header: https://blog-fihp2x2rk.now.sh/api/example-function After the first build, this took 3 minutes in total to deploy, including build time. Integration with GitHub is also possible, providing us with per-PR environments. I've read through your comments @jorgeorpinel and @iAdramelk, and this seems to tick all the boxes for you. Unless anyone has any issues with this solution, when I run out of things to do I'll be sending a PR.
I was really disappointed to find out that Zeit Now no longer supports custom servers. They do support adding routes in JSON, which is working fine for /doc/* (including status codes).
@fabiosantoscode it looks like exactly what we need!
@fabiosantoscode a few more questions: how much will it cost us to build with them if we support previews? I would also love to explore a more conservative option with Heroku, in terms of price, build time (if cache is enabled), and local experience (you mentioned some middleware?). I'm still concerned about these fancy options like Zeit and Netlify, to be honest. I really don't like their aggressive pricing models, and I don't like waiting minutes to deploy a preview (to some extent that's Gatsby's problem, not the hosting's). Bottom line: can we do better?
To be precise: we pay for Heroku up to $50/month, since we do a lot of preview deployments. It's up to 30 hours with the pro plan. Will it be enough? Most likely, yes. The thing I still don't like is waiting minutes to deploy. If the blog takes 3 minutes, it'll be > 10 minutes to deploy blog + dvc.org. Is there a way around it?
Our major bottleneck is the generated images in the blog. If we can store these images on S3 using DVC, and if with DVC we can somehow generate thumbnails only for images that changed, without downloading everything (can it?), we might be able to host them from S3 directly (using the DVC remote cache URLs). This might be accomplished through a source plugin which stores the checksums of the images to see which ones changed. If we can do this, then Heroku can be very speedy (we just need to cache node_modules, .gatsby and public, which become small enough to be cached). However, I think Heroku is a bit overkill for us, and it doesn't include a CDN to cache things at the edge like Netlify and Zeit Now do.
I had a look at their pricing page, and overall I think we can go with the $20 plan. It gives us unlimited deploys and 10 hours of build time every month. That gives us around 120 builds a month (if they take 5 minutes on average). If we go over the 10 build hours, we pay $10 more, instead of being forced into the $200 plan. We're limited to 3 team members, which I suppose means users with admin access, not users deploying. I couldn't find any specifics on this, so I'll go and ask directly.
@iAdramelk keeping the conversation on this ticket
It's not in this PR, since the function is in the blog. I didn't share it either. Here it is:
(When we have gatsby, our functions will be in api, not pages/api. Zeit Now piggy-backs on the existing concept of Next.js functions, so they use the same folder.) Edit: the version2 property was to tell deploys apart and make sure they were deploying the new function correctly.
Heroku dev runs with a single command, see above. The cache limit is going to be hit inevitably.
In the blog it's mostly modules, but images are a huge chunk. If we add dvc.org to it the modules won't grow by much, I think. And we don't have too many images.
The largest modules are typescript, babel, core-js and rxjs. I've looked through gatsby-plugin-sharp a lot. I'm really looking for a way for us to store the images elsewhere during the build (like S3), to get around the cache issue while not always regenerating images. During production we could proxy image requests from the app to S3 (or, if we have a CDN, do the proxying from there). I really think this is the way to go. gatsby-plugin-sharp clearly has a way to avoid re-compressing images from the filesystem if they haven't changed. If the filesystem is just another source, why not S3?
@shcheklein I've made the edits. Except for this one:
The extra costs I added account for our own expansion of the build cache, and for the fact that Heroku doesn't feature HTTP caching at all, much less a CDN.
We will probably be using S3 even more, since the maximum deployable size on Heroku is 500 MB, and it probably includes static files.
Cloudflare handles this for free, right? And we already run everything through it. And we can utilize in-memory + CDN cache easily with Heroku for the API cache. Again, for free, and no changes are required.
So, it's not related to images. How do people deploy JS apps to it anyway, then?
I forgot about cloudflare :) Cloudfront / S3 also have a free tier, it just depends on how much of it we can use. But hey, if we're using cloudflare, we could place the API in cloudflare workers as you mentioned before, and make use of cloudflare KV as an efficient memory store for storing github etags and responses. And deploy the rest of the site using gatsby's own thing, which will give us the fast builds we want.
It's pretty related to images. Your typical JS app is smaller than 500 MB, especially when installing only production dependencies. Here's the source: https://devcenter.heroku.com/articles/slug-compiler#slug-size
I can also recommend taking a look at AWS with Amplify, based on S3 + CloudFront + other AWS services.
It's not clear whether it's easy to run them locally in this case. I would avoid this fancy stuff because of that, unless there is a simple solution.
@shcheklein there are a few solutions for local dev. One of them is this; it's a tad unmaintained, but seems promising. PR environments are where it breaks down: we'd have to use production APIs there.
@JIoJIaJIu I've used Amplify before; it's great. I saw, however, that their automated PR environment feature only works on private GitHub repositories, to avoid unsolicited PRs increasing costs. We can always roll our own PR environments. They do seem to have local dev facilities, though, and the flexibility is through the roof, since we're free to use any AWS service without going through the open internet.
Sorry for the long silence, guys. Made some tests myself. Some results:

Gatsby image build optimisation

Current image count:
To trigger a rebuild I edited the title field.

Local

|  | Total | GQL queries | Images |
| --- | --- | --- | --- |
| No cache | 3m 55s | 35s | 3m 46s |
| Cache | 24s | 13s | 0s |

Local, after updating to the latest: definitely still broken, so immediately reverted.
Local, with webp disabled

|  | Total | GQL queries | Images |
| --- | --- | --- | --- |
| No cache | 1m 42s | 27s | 1m 22s |
| Cache | 22s | 11s | 0s |
Gatsby cloud

|  | Total | GQL queries | Images |
| --- | --- | --- | --- |
| No cache | 12m 45s | 1m 29s | 10m 15s |
| Cache | 57s | 36s | 0s |
now.sh

|  | Total | GQL queries | Images |
| --- | --- | --- | --- |
| No cache | 3m 42s | 38s | 3m 14s |
| Cache | 30s | 21s | 0s |
Netlify

|  | Total | GQL queries | Images |
| --- | --- | --- | --- |
| No cache | 8m 8s | 54s | 5m 54s |
| Cache | 7m 4s | 46s | 5m 39s |
Some closing thoughts

- The longest part of the build process is by far thumbnail generation on the first build. We are now generating more than 1K images.
- Disabling webp reduces build time by approximately 1/3, but makes the end user's experience worse.
- Rebuild time with an existing cache (node_modules, .cache and public folders) is quite fast.
- If the hosting provider caches those folders, like now.sh or Gatsby Cloud do, then rebuild time can be less than 1m.
- Netlify is by far the slowest option among the ones I tried, and it definitely didn't cache the public folder. Even if we enable its caching, overall build times would still be the longest. So I'd say we can safely remove it from the candidates list.
P.S. One more thing. After I updated
Thanks @iAdramelk!
I think this shows how much a good cache can influence build times. Our best options are clearly now.sh and Gatsby Cloud. However, Gatsby Cloud doesn't come with API endpoints and is a bit pricey. Using Cloudflare for local development is not very optimal, so I propose we get rid of it for local development, and in production use a worker which takes every request to our API. In PR environments, our APIs are the production APIs. Locally, our browsers are going to resend the etag headers to the API, not increasing our limits. If this doesn't work, we can always wrap our functions in a caching wrapper.
Fastly is capable of doing if-none-match requests to our servers if we respond with etag headers. Therefore, if we use this, we can do an if-none-match request to GitHub using the same etag and respond with a 304 (or 200 if anything changed on GitHub's side). Then Fastly will remember the old response and serve it. I'm going to check whether Fastly uses a global cache, or whether a request from China can't use a cached response from Europe.
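The origin-side decision described here is small enough to sketch as a pure function (names are illustrative, not project code):

```javascript
// Decide how to answer a CDN revalidation request: if the etag the CDN
// sends back matches the current upstream etag, answer 304 with no body;
// otherwise send the fresh body and the new etag.
function revalidate(requestEtag, currentEtag, body) {
  if (requestEtag && requestEtag === currentEtag) {
    return { status: 304, body: null };
  }
  return { status: 200, body, etag: currentEtag };
}
```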
So my summary would be this:
So the best way to optimize build time is not to create the thumbnails on each build. There are 2 possible solutions to that:
With the second approach we can do the following:
This way we can have fast and consistent build times for local development too. But our infrastructure will be more complex.
With your second approach we can also have the API, and we get the flexibility of Heroku. I like that! There are more options to resize images, including using Cloudflare and Fastly. This can make our infra a tad simpler. For caching the API, there should be a bunch of ways to do it, like a Varnish container or just in-process memory. Our options are pretty limitless here. Also: Fastly doesn't replicate cached results globally, so no go there.
Ok, so it's Now vs a Heroku custom server. With Now we need to clarify the following:
With Heroku:
Any other thoughts? Am I missing anything else in this summary so far?
I like this approach of extracting the problem which is very specific and well defined to a specialized service.
Yep. Since there is still some unfamiliarity with the other platforms, and possibly no super strong reasons to move away from Heroku, I'd lean toward staying there (it also seems to have more predictable pricing). But I haven't been as involved as the others in this research, so I don't think my vote should be weighted equally. This way there are also only 3 (real) votes in this issue, so no possibility of ties 😬
You're right about now.sh redirects and cross-region API response caching. As for the trailing slash, I can confirm: you can curl this yourself and see the trailing slash disappear. I think Heroku plus an image resizer might be our best option here. In terms of flexibility, we're able to deploy pretty much anything we want and cache things properly. If the Heroku slug size limit hits us, we can choose not to deploy the images there and use S3, or simply raw.githubusercontent.com/branch/path/to/image as a target for our image resize service. It doesn't use the right content-type, but otherwise it works, and it will work for branch previews as well.
@iAdramelk Heroku it is?
@shcheklein @fabiosantoscode looks like Heroku. I don't really like the idea of running our own server, but it looks like we don't have a choice for the API and redirects.
@fabiosantoscode let's proceed with Heroku; we can start by moving the blog as an example.
I got my fork deployed on Heroku: https://dvc-blog-production.herokuapp.com/ Trailing slashes get removed; no redirects yet (but they will be simple to do, I'll get to it later).

Build time

There's no timer on Heroku builds, but using a stopwatch I got 3m05s, 20s of which is the time Heroku takes to pick up the build (might be larger today because they seem to have an ongoing incident), and there's also 60s of "pruning devDependencies", which is basically ensuring no devDependencies end up in production. I find it weird that it takes so much time. The public folder is cached, as well as the gatsby folder. And for some reason the slug size is small even though there are so many images.

Tests & Types & Lints

We have 2 choices here:
I'm going for 2, unless anyone has any objections. I won't start right now (it's late here), but tomorrow, if nobody's said anything, I'll configure Heroku to run the tests.
Can we get rid of this? Gatsby build time is ~7s, which is great. 2 sounds good.
The reason for this is explained in this issue. TL;DR: yarn rebuilds the prod dependencies from source when removing devDependencies. I tried switching to npm after seeing the issue about pruning devDependencies in Heroku when there are statically built dependencies, but it only shaved off around 10 seconds. So I tried upgrading to Yarn 2, which removes the install step and node_modules. I was super excited about it until I found out Gatsby doesn't support it yet. All we can do is try to trim some of the devDependencies (kill TypeScript, anyone? The compilation step and plugins could be replaced with JSDoc comments plus a typechecker like tern.js or TypeScript itself). So I'm going to move on.
Tests

Apparently they were already working for PRs and master, so I moved on.

Preview environments

They work, but the first build of each PR takes a very long time, because it refuses to use the cache. However, when running tests, the cache is used properly. This means that opening a new PR means waiting more than 10 minutes to see your preview. The cache issue also happens in the dvc.org repo, but since it doesn't compress a ton of images, it's not a problem there. I think the obvious solution is to compress images on demand, as mentioned above, which has the nice side effect of speeding up the build process further, since there will be no need to cache those images. But I feel like that's not part of this initial PR. So I will clean things up and issue a PR.
What do you mean by running tests, precisely?
The tests are in a separate pipeline from deploys, and there the cache is respected. Yes, additional commits on top of the same PR are built quickly (3 min).
Closing this as we moved to Heroku.
UPDATE: See summary of options in #992 (comment)
This is a ticket to discuss and compare possible solutions, based on the criteria listed below.
We plan to convert dvc.org to gatsby and merge it with the blog. Right now, dvc.org is hosted on Heroku and the blog is hosted on Netlify. We need to choose one hosting that will work for both of those services' needs.

What we want:
API endpoints
dvc.org and the blog both use some custom API endpoints to fetch and transform data from GitHub and Discourse. We also need to be able to cache the results of these requests, because they are not very fast and shouldn't be updated more often than once every 15 minutes.
Right now, this is implemented as a Node.js server on Heroku with an in-memory cache, but we won't necessarily have a server after we migrate to gatsby. Also, our current implementation has a problem: for the first user who tries to access a page before the cache is created, load time can be quite significant (~10s). Ideally, we should perform the cache update ourselves with something like cron and always send cached results to all of our users.

To solve this, we can use Netlify Functions, Cloudflare Workers, or something else.
Redirects
We have a large list of redirects that we need to support https://github.com/iterative/dvc.org/blob/master/redirects-list.json
New hosting should allow it.
Build time
Our current build time for the blog on Netlify is long (7m now) and will only get longer after we merge it with the main site. We can speed it up by preserving the .cache dir and the yarn module cache between builds. A new hosting/build option should allow us to preserve them.

Demo stands
Right now, both Netlify and Heroku allow us to automatically create preview stands from GitHub PRs. We want to keep this functionality in the future too.
--
Our new hosting solution may not be one server but a combination of a few different ones, e.g. CircleCI + Cloudflare Workers + Netlify/Now.sh, but it should be able to do all of the things listed above.