add robots.txt to specify doc versions to appear in search engines #3291
Conversation
How will this affect URLs on circuitpython.readthedocs.io such as https://circuitpython.readthedocs.io/projects/display_text/en/latest/ -- will search engines still be permitted to index them? The way I read it, they would be forbidden.
Good point: I mistakenly thought those were under another URL. I'll add a prefix to robots.txt.
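For context, the prefix-based rule being discussed would look roughly like the sketch below. The paths and version number are illustrative rather than what this PR ships, and the Allow directive, while not in the original robots.txt standard, is honored by Google and the other major crawlers.

```
User-agent: *
Disallow: /
# Keep the latest and current-stable core docs indexable.
Allow: /en/latest/
Allow: /en/5.3.x/
# Keep library subprojects such as /projects/display_text/en/latest/ indexable too.
Allow: /projects/
```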
This needs a lot more thought. Converting to draft. https://circuitpython.readthedocs.io/projects, for instance, is not a searchable top-level point.
Following the rabbit hole from the original issue I linked, I saw a number of discussions weighing meta tags vs. robots.txt, so there may be some useful resources off that original discussion.
Where are we at on this? It's marked as blocking 6.0.0.
I would like to revive this, and have un-drafted it. @sommersoft's old comment may assuage my original concerns. Info from readthedocs: https://docs.readthedocs.io/en/latest/hosting.html#custom-robots-txt-pages
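For anyone following that link: readthedocs serves a custom robots.txt when the file ends up at the root of the built docs for the default version. With Sphinx that is usually done through html_extra_path; a minimal sketch, assuming the file sits next to conf.py (the actual CircuitPython docs config may differ):

```python
# conf.py (sketch only). html_extra_path copies the listed files verbatim into
# the root of the HTML output, so robots.txt is then served at
# https://circuitpython.readthedocs.io/robots.txt for the default version.
html_extra_path = ["robots.txt"]
```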
Thanks for circling back on this!
I hope it works! We'll have to take a look at the search results after a while.
It can take a while: https://developers.google.com/search/docs/advanced/crawling/ask-google-to-recrawl
As of today, I'm still seeing 2.x and 3.x pages as the first result on Google, but their content is now hidden. Typically the Latest docs are the second result. So... partial success? At least it's now easy to tell from the search results page which links go to Latest, even if the bad links still show up.
Fixes #3263 in a simple way, adding a static `robots.txt` file. The stable version specified in the `robots.txt` needs to be updated when it changes; it's currently `5.3.x`. The `robots.txt` could be generated automatically, though I'm not sure how off the bat.
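On the "generated automatically" point, one possibility (a sketch under assumptions, not something this PR implements) is a small script that templates the file from wherever the docs build already records the stable version, so only one value needs updating per release:

```python
# Hypothetical generator for robots.txt; the output path and the way the
# stable version is obtained would need to match the real docs build.
TEMPLATE = """\
User-agent: *
Disallow: /
Allow: /en/latest/
Allow: /en/{stable}/
Allow: /projects/
"""

def write_robots_txt(stable_version: str, path: str = "docs/robots.txt") -> None:
    """Write a robots.txt that exposes only latest and the given stable docs."""
    with open(path, "w") as f:
        f.write(TEMPLATE.format(stable=stable_version))

if __name__ == "__main__":
    # "5.3.x" mirrors the value mentioned above; a real script would read it
    # from the docs configuration instead of hard-coding it.
    write_robots_txt("5.3.x")
```

Wiring such a script into the docs build would keep the version pin from going stale between releases.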