support home and search pages in S3 #2922
This is not a solution, just a comment. Since the result is computed, the correct URL in this case is the one without a trailing slash and without an extension. That would allow you to have a hypothetical search folder in the file system with, for example, the search page inside? 🤔🧂
We chatted about this. We/I typed some pseudo code that we agreed is better because it doesn't mention any other things at all.
I made changes and deployed this updated
I forgot to mention the most important thing! I realized that I needed to add the request's query string (if present) to the redirect that removes the trailing slash. Ready for review again.
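A minimal sketch of that fix, assuming the Lambda@Edge convention that the request's `querystring` field arrives without its leading `?`. The helper name `redirectLocation` is hypothetical and not taken from the PR:

```js
// Hypothetical helper: build the Location value for the redirect that
// removes a trailing slash, carrying over the query string if present.
function redirectLocation(uri, querystring) {
  // Never strip the root path "/" down to an empty string.
  const path = uri.length > 1 && uri.endsWith("/") ? uri.slice(0, -1) : uri;
  return querystring ? `${path}?${querystring}` : path;
}

console.log(redirectLocation("/en-US/search/", "q=foo")); // /en-US/search?q=foo
console.log(redirectLocation("/en-US/search/", "")); // /en-US/search
```

Without the second argument being appended, a request such as `/en-US/search/?q=foo` would land on the search page with the query lost.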
I tried to do a thorough review. I started typing some comments, and the more I wrote, the more my confidence waned. To back up my comments I typed some real JS code in a Node prompt to assert my statements. Eventually, it was easier to type it into a real temporary .js file. Eventually, it got messy to the point where I wrote down some very basic unit tests. One thing led to another and my messy testing became a big simulation. So here it is:

```js
// Stand-in so the simulation runs on its own. The PR itself imports
// VALID_LOCALES from "@yari-internal/constants"; the entries below are
// illustrative (lowercased locale -> canonical spelling).
const VALID_LOCALES = new Map([
  ["en-us", "en-US"],
  ["zh-cn", "zh-CN"],
  ["ja", "ja"],
]);

function handler(uri) {
  const url = new URL(uri, "http://example.com");
  const pathname = url.pathname;
  const split = pathname.split("/");
  const first = split[1];
  if (pathname === "/") {
    split.splice(1, 1, "en-US");
    if (split.length === 2) {
      split.push("");
    }
    return split.join("/") + url.search;
  }
  // Is it '/en-Us' or '/ZH-Cn/' or '/Ja/docs/foo'?
  if (
    VALID_LOCALES.has(first.toLowerCase()) &&
    VALID_LOCALES.get(first.toLowerCase()) !== first
  ) {
    if (split.length === 2) {
      split.push("");
    }
    split.splice(1, 1, VALID_LOCALES.get(first.toLowerCase()));
    return split.join("/") + url.search;
  }
  // Is it '/en-US' or '/zh-CN' (just the locale, but lacking trailing /)
  if (VALID_LOCALES.has(first.toLowerCase()) && split.length === 2) {
    split.push("");
    return split.join("/") + url.search;
  }
  // Anything else goes to S3, minus any trailing slash.
  if (pathname.endsWith("/")) {
    split.pop();
  }
  return `S3 LOOKUP: ${split.join("/") + url.search}`;
}

const tests = [
  ["/", "/en-US/"], // just slash
  ["/?a=b", "/en-US/?a=b"], // just slash and query string
  ["/EN-US", "/en-US/"], // wrong case and no trailing slash
  ["/EN-US?next=foo", "/en-US/?next=foo"], // wrong case and no trailing slash and query string
  ["/EN-us/", "/en-US/"], // wrong case
  ["/EN-us/?a=b", "/en-US/?a=b"], // wrong case and query string
  ["/EN-us/docs/Foo", "/en-US/docs/Foo"], // wrong case and more
  ["/en-US", "/en-US/"], // no trailing slash
  ["/en-US?a=b", "/en-US/?a=b"], // no trailing slash and query string
  ["/en-US/", "S3 LOOKUP: /en-US"], // perfect
  ["/en-US/?next=foo", "S3 LOOKUP: /en-US?next=foo"], // perfect and query string
  ["/en-US/favicon.ico", "S3 LOOKUP: /en-US/favicon.ico"], // perfect
  ["/en-US/search?q=w", "S3 LOOKUP: /en-US/search?q=w"], // perfect and query string
  ["/en-US/docs/foo/", "S3 LOOKUP: /en-US/docs/foo"], // trailing slash
  ["/en-US/docs/foo/?a=b", "S3 LOOKUP: /en-US/docs/foo?a=b"], // trailing slash and query string
];

tests.forEach(([test, expect], i) => {
  const got = handler(test);
  console.log(
    i + 1,
    "\t",
    test.padEnd(30),
    got.startsWith("S3") ? "👍🏼" : "👉",
    got
  );
  if (got !== expect) {
    throw new Error("unexpected");
  }
  console.log();
});
```

I know that if we just implement that, properly, as proper Lambda'esque code, we'll cover all bases.
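As a rough illustration of what "Lambda'esque" could look like, here is a sketch of the same decisions expressed against a CloudFront origin-request event. It is not the code from this PR: `redirectTarget` and `route` are hypothetical names, the locale map is a stand-in for the one in `@yari-internal/constants`, and the 302 status is an assumption, since the thread does not say which redirect status is used.

```js
// Stand-in locale map (assumption): lowercased locale -> canonical spelling.
const VALID_LOCALES = new Map([
  ["en-us", "en-US"],
  ["zh-cn", "zh-CN"],
  ["ja", "ja"],
]);

// Returns the path to redirect to, or null if the request should go to S3.
function redirectTarget(uri) {
  if (uri === "/") return "/en-US/";
  const [, first, ...rest] = uri.split("/");
  const locale = VALID_LOCALES.get(first.toLowerCase());
  if (!locale) return null; // not a locale-prefixed URL
  if (rest.length === 0) return `/${locale}/`; // "/en-US" or "/EN-us"
  if (locale !== first) return `/${[locale, ...rest].join("/")}`; // wrong case
  return null;
}

// Takes the CloudFront request object; returns either a redirect response
// or the (possibly modified) request to forward to the S3 origin.
function route(request) {
  const target = redirectTarget(request.uri);
  if (target !== null) {
    // CloudFront passes the query string without its leading "?".
    const qs = request.querystring ? `?${request.querystring}` : "";
    return {
      status: "302", // assumption: the real status code may differ
      statusDescription: "Found",
      headers: { location: [{ key: "Location", value: target + qs }] },
    };
  }
  if (request.uri.endsWith("/")) {
    // S3 lookup happens without the trailing slash.
    request.uri = request.uri.slice(0, -1);
  }
  return request;
}

// Lambda@Edge entry point shape: the request lives at Records[0].cf.request.
async function handler(event) {
  return route(event.Records[0].cf.request);
}
```

Keeping `route` synchronous and free of AWS specifics means the table of test cases from the simulation can be replayed against it directly.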
@peterbe That is helpful. Here are some of my thoughts:
Let me adjust this PR. It may not be until Friday with everything else going on.
* We don't need the query string (your `url.search`) for home and document requests (since they're never used), but I guess it doesn't hurt.
Be careful with that. :) We might. Who knows.
It would be nice if loading:
/en-US/docs/Web?q=a
/en-US/docs/Web?q=b
/en-US/docs/Web?q=c
means...:
- Cache MISS
- Cache HIT
- Cache HIT
...in CloudFront but that we still can get these into the rendering.
It's not inconceivable that we use these for something, for example a Google Analytics referral query string, or some other functionality that we haven't thought of yet.
And we know that /EN-us/search?q=foo definitely needs it, so it'd be something we have to code in support for anyway.
@@ -6,6 +6,16 @@ const {
  encodePath,
  slugToFolder,
} = require("@yari-internal/slug-utils");
const { VALID_LOCALES } = require("@yari-internal/constants");

const THIRTY_DAYS = 3600 * 24 * 30;
Calling it `THIRTY_DAYS` means there's no point making it a variable. It's easy enough to read inline:
...
cacheControlSeconds: 3600 * 24 * 30,
I think a more appropriate name, if it's even needed, is `LONG_CACHE_TTL` or something.
// page, not "en-us/index.html", which is what S3 would look for if
// we left the trailing slash.
request.uri = request.uri.slice(0, -1);
} else if (request.uri.endsWith("/")) {
We can't do this to requests for `/{locale}/account/`, since that endpoint (in Kuma) requires a trailing slash. If we do, we get a redirect loop, since Kuma redirects `/{locale}/account` to `/{locale}/account/`. 🤦
I'm out of time to solve this today. There are several options available, including routing account requests directly to Kuma, bypassing lambda@edge (but that requires at least 3 new CDN behaviors to handle all of the cases).
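For comparison, one in-handler alternative can be sketched as follows. It is not among the options listed above and not necessarily what the PR ended up doing: exempt the account path from trailing-slash removal. The function name and the pattern are hypothetical, and the sketch assumes that only the exact path `/{locale}/account/` needs to keep its slash.

```js
// Assumption: only the exact path /{locale}/account/ must keep its slash.
const KEEP_TRAILING_SLASH = /^\/[^/]+\/account\/$/;

// Hypothetical helper: strip a trailing slash unless the path is exempt.
function stripTrailingSlash(uri) {
  if (uri === "/" || !uri.endsWith("/") || KEEP_TRAILING_SLASH.test(uri)) {
    return uri;
  }
  return uri.slice(0, -1);
}

console.log(stripTrailingSlash("/en-US/account/")); // /en-US/account/
console.log(stripTrailingSlash("/en-US/docs/foo/")); // /en-US/docs/foo
```

Because the account URL is left untouched, Kuma never sees the slash-less form and has nothing to redirect, which avoids the loop described above.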
Co-authored-by: Peter Bengtsson <peterbe@mozilla.com>
For the record, @peterbe verbally approved me moving forward with this. I've deployed this code to
* support home and search pages in S3
* feedbacked
* ensure redirect includes query string
* more redirect cases and tests
* feedback and fixes
* Update deployer/aws-lambda/content-origin-request/index.js

Co-authored-by: Peter Bengtsson <peterbe@mozilla.com>
This provides the necessary support for both the new Yari-based home and search pages. In the main, it does two things:
* Redirects home page requests without a trailing slash (e.g., `/{locale}` --> `/{locale}/`) as well as search page requests with a trailing slash (e.g., `/{locale}/search/` --> `/{locale}/search`). Inconsistent? Yes and no. No, because that's the way it's been for years on MDN and we're remaining consistent with that, but yes in the sense that we don't handle trailing slashes the same across all URLs.
* Strips the trailing slash from home page requests before the S3 lookup, so that S3 looks for `en-us` for the English home page, not `en-us/index.html`, which is what S3 would look for if we kept the trailing slash.

Note: This is already "live" on `dev`, `stage`, and `prod`, so the review is more about looking for any adjustments, not whether it works or not.