Build fails because of the [false] anchor validation error #1707

Closed
biodranik opened this issue Dec 25, 2021 · 20 comments


@biodranik
Contributor

Bug Report

Environment

Zola version: 0.15.2 (mac os x)

Expected Behavior

zola serve should work.

Current Behavior

Error: The anchor in the link `@/_index.md#install` in organicmaps.github.io/content/_index.md does not exist.

Step to reproduce

The markdown source contains a link in the form [Install here](#install).

Neither <p id="install"> nor <a name="install"> in the same markdown file is detected by Zola. The link only validates against markdown sections that start with 2 or more # characters.

biodranik added a commit to organicmaps/organicmaps.github.io that referenced this issue Dec 25, 2021
See getzola/zola#1707

Signed-off-by: Alexander Borsuk <me@alex.bio>
@Keats
Collaborator

Keats commented Dec 26, 2021

Neither <p id="install"> nor <a name="install"> in the same markdown file are detected by Zola

We are not parsing HTML for anchors in markdown so that's expected.

@biodranik
Contributor Author

Well, that's not cool. Especially if you have a multilingual site, where the same ## Title header is translated differently (and has a different #id), and can't be universally referenced.

Also, what about links from shortcodes?

@Keats
Collaborator

Keats commented Dec 26, 2021

Especially if you have a multilingual site, where the same ## Title header is translated differently (and has a different #id), and can't be universally referenced.

You would refer to the one from the current language? Or do you mean you would have only anchors in say English for all languages?
The vast majority of people use anchors to refer to headers so it hasn't really been a problem so far.

Also, what about links from shortcodes?

MD shortcodes links would still be checked, HTML ones no.

@biodranik
Contributor Author

You would refer to the one from the current language? Or do you mean you would have only anchors in say English for all languages?
The vast majority of people use anchors to refer to headers so it hasn't really been a problem so far.

Example:

This is index.md, it generates `<h2 id="welcome">Welcome</h2>`

## Welcome

[Link to the Welcome section](#welcome)

This is index.fr.md, it generates `<h2 id="bienvenue">Bienvenue</h2>`

## Bienvenue

[This link to the Welcome section must also be modified and cannot use #welcome](#bienvenue)

MD shortcodes links would still be checked, HTML ones no.

Any chance to check the whole generated HTML for valid links/ids? It obviously looks wrong: the id is there, but the validator says "no, it's not there".

@Keats
Collaborator

Keats commented Dec 28, 2021

It obviously looks wrong: the id is there,

It isn't there in the second case though? As you write, it's a link to an anchor named bienvenue, there are no anchors named welcome in the French version so where would that link go? I'm expecting anchors to be translated but maybe that's only me? Unless I misunderstood the example.

Any chance to check the whole generated HTML for valid links/ids?

Very unlikely. That would have to be opt-in and would likely make zola several times slower.

@phillord
Contributor

I guess that the slowness would come from a fully correct parse of HTML, but a heuristic parse for id="welcome" would probably do the trick in most cases.

@biodranik
Contributor Author

True. A simple substring search should be enough.
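
For illustration, a substring heuristic of the kind discussed here could look like the sketch below. It is written in Python for brevity and is not Zola's code; the function name `has_anchor` and the sample page are made up for this example. It only recognises quoted `id` and `name` attributes and does no real HTML parsing.

```python
def has_anchor(html: str, anchor: str) -> bool:
    """Heuristic: is there a quoted id/name attribute equal to `anchor`?

    Hypothetical helper for illustration only. It does not parse HTML,
    so unquoted attributes or unusual spacing are missed.
    """
    candidates = []
    for attr in ("id", "name"):
        # The leading space avoids matching attributes such as data-id="...".
        candidates.append(f' {attr}="{anchor}"')
        candidates.append(f" {attr}='{anchor}'")
    return any(candidate in html for candidate in candidates)


page = '<p id="install">Install</p>\n<a name="faq">FAQ</a>'
print(has_anchor(page, "install"))  # True
print(has_anchor(page, "faq"))      # True
print(has_anchor(page, "missing"))  # False
```

Because the closing quote is part of the search string, `id="installer"` does not count as a match for `install`.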

Regarding translations: our use case is when users contribute to site translations. And they are not aware that changing h1-h6 headers in Markdown also changes ids.

Btw, is there a markdown way to set a fixed id to headers?

@phillord
Contributor

@biodranik Documentation suggests that

# Something manual! {#manual}

should give a manually derived ID which is stable.
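
To make the benefit for translations concrete, here is a rough Python sketch of how a heading's anchor could be resolved when an explicit `{#id}` suffix is present. The function `heading_id` is hypothetical, and the fallback slug rule is a simplification assumed for this example, not Zola's actual slugification.

```python
import re
import unicodedata

# Matches a trailing "{#some-id}" on a heading, as in "# Something manual! {#manual}".
EXPLICIT_ID = re.compile(r"^(?P<title>.*?)\s*\{#(?P<id>[^}\s]+)\}\s*$")


def heading_id(heading_text: str) -> str:
    """Return the explicit id if one is given, otherwise a simplified slug."""
    match = EXPLICIT_ID.match(heading_text)
    if match:
        return match.group("id")
    # Assumed fallback: strip accents, lowercase, collapse other characters to "-".
    ascii_text = (
        unicodedata.normalize("NFKD", heading_text)
        .encode("ascii", "ignore")
        .decode()
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


print(heading_id("Something manual! {#manual}"))
print(heading_id("Welcome"))
print(heading_id("Bienvenue {#welcome}"))
```

With an explicit id, `## Welcome {#welcome}` and `## Bienvenue {#welcome}` resolve to the same anchor, so a `#welcome` link keeps working even if a translator rewrites the heading text.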

@biodranik
Contributor Author

Good, a bit better but not ideal.

@Keats
Collaborator

Keats commented Dec 31, 2021

a heuristic parse for id="welcome" would probably do the trick in most cases.

We actually do have code in the link checker to check for anchors: https://github.com/getzola/zola/blob/master/components/link_checker/src/lib.rs#L107-L120 but it's only used for external links, so it's not that simple. And that function doesn't check for mixed cases.
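
As a sketch of what handling mixed cases might involve, assuming this refers to attribute names spelled like `Id` or `NAME`, a regex can match the attribute name case-insensitively while still comparing the anchor value exactly (HTML id values are case-sensitive). This is Python for illustration only, not the linked Rust function, and `has_anchor_ci` is a hypothetical name.

```python
import re


def has_anchor_ci(html: str, anchor: str) -> bool:
    """Match id/name in any letter case; compare the anchor value exactly."""
    pattern = re.compile(
        # (?i:...) makes only the attribute name case-insensitive.
        r"""\s(?i:id|name)\s*=\s*(["'])"""
        + re.escape(anchor)
        # \1 requires the same quote character that opened the value.
        + r"""\1"""
    )
    return pattern.search(html) is not None


print(has_anchor_ci('<p Id="install">', "install"))
print(has_anchor_ci("<a NAME = 'install'>", "install"))
print(has_anchor_ci('<p id="Install">', "install"))
```

The first two calls find the anchor; the third does not, because the value `Install` differs in case from `install`.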

@NiceneNerd

NiceneNerd commented Jan 3, 2022

I'm having issues with this too (manually added some anchor tags for footnotes on essays), and I'm confused why this wasn't a problem for me until this week. Did the behaviour change? Now I have 36 broken links and can't build my site, even though it built just a few days ago.

@Keats
Collaborator

Keats commented Jan 3, 2022

Can you bisect it? Taking the last 0.14 release that worked for you as the start commit.

@phillord
Contributor

phillord commented Jan 4, 2022

I have added a heuristic parser for HTML anchors now in the above PR. I think it is mostly complete in its own terms, just needs squashing and PRing to next.

Would you be interested in this code?

@mwcz
Contributor

mwcz commented Jan 6, 2022

I happened to run into this yesterday. The content I'm working with has a lot of hand-built ToCs which link to name and id attributes. I hadn't found this issue and PR yet, so I put together a slapdash fix which just moves the existing check_page_for_anchor to utils::site and then imports it in the two places I saw where anchor links are checked: Page::has_anchor and link_checker.

@Keats I ran a bisect and 51784aa came up. The components/rendering/src/markdown.rs changes in particular look interesting.

(Here's the test case I used for the bisect)

test case: example.md
+++
title = 'Post with anchors'
permalink = '/example'
+++

* [Link to first heading](#first-heading)
* [Link to id](#id-anchor)
* [Link to name](#name-anchor)

## First Heading

Lorem ipsum.

<p name="name-anchor">Name Anchor Target</p>
<p id="id-anchor">ID Anchor Target</p>
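
The test case above can be checked end to end with a small script. The Python sketch below is only an approximation under stated assumptions: the regexes, the `slugify` rule, and the `broken_fragments` helper are invented for this example and are not Zola's implementation. One extra `#nowhere` link is added to the sample so there is a genuinely broken link to report.

```python
import re

LINK_FRAGMENT = re.compile(r"\]\(#([^)\s]+)\)")            # [text](#fragment)
HEADING = re.compile(r"^#{1,6}\s+(.*?)\s*$", re.MULTILINE)  # markdown headings
HTML_ANCHOR = re.compile(r"""\s(?:id|name)=(["'])(.*?)\1""")  # id="x" / name='x'


def slugify(text: str) -> str:
    # Simplified slug rule assumed for this sketch.
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def broken_fragments(markdown: str, check_html: bool = True) -> list:
    """Return in-page link fragments that have no matching target."""
    targets = {slugify(title) for title in HEADING.findall(markdown)}
    if check_html:
        targets |= {m.group(2) for m in HTML_ANCHOR.finditer(markdown)}
    return [frag for frag in LINK_FRAGMENT.findall(markdown) if frag not in targets]


EXAMPLE = """\
* [Link to first heading](#first-heading)
* [Link to id](#id-anchor)
* [Link to name](#name-anchor)
* [Dangling link](#nowhere)

## First Heading

Lorem ipsum.

<p name="name-anchor">Name Anchor Target</p>
<p id="id-anchor">ID Anchor Target</p>
"""

print(broken_fragments(EXAMPLE, check_html=False))
print(broken_fragments(EXAMPLE, check_html=True))
```

With `check_html=False` only headings count as targets, so the `id` and `name` links are reported as broken, which mirrors the behaviour described in this issue. With `check_html=True` only the added `#nowhere` link remains.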

@apiraino
Contributor

apiraino commented Jan 6, 2022

I've upgraded from 0.13 to 0.15.2 and was caught by surprise by this. I sometimes write links as See you [there](#) as placeholders and populate the hyperlink later so I don't lose concentration when I am writing.

Can't do that anymore :)

@Keats
Collaborator

Keats commented Jan 6, 2022

Thanks for the bisect! This commit is recent but I completely forgot about it. I think with @phillord PR we should get back to something reasonable, maybe we can add the tests from @mwcz branch on top.

@mwcz
Contributor

mwcz commented Jan 7, 2022

@Keats just in case it wasn't clear, there aren't any new tests in my commit; I just copied the pre-existing tests along with the function when I moved it into utils.

@Absolucy

Oh, so that's what's causing this issue. It's preventing me from building my site, as I have a post with a ton of citations at the bottom:

[<sup>\[1\]</sup>](#1)

...

<a id="1">[1]</a> Ha, Anthony. "With Brave Software, JavaScript’s Creator Is Building A Browser For The Ad-Blocked Future." TechCrunch, 20 Jan. 2016, techcrunch.com/2016/01/20/with-brave-software-javascripts-inventor-is-building-a-browser-for-the-ad-blocked-future. \([archive.org](https://web.archive.org/web/20201129075416/https://techcrunch.com/2016/01/20/with-brave-software-javascripts-inventor-is-building-a-browser-for-the-ad-blocked-future/)\)

@Keats
Collaborator

Keats commented Jan 21, 2022

You can try the next branch, it should be released this weekend if I can find the time

@Keats closed this as completed Jan 23, 2022

@apiraino
Contributor

thanks @Keats :)
