I recently received a somewhat dramatically worded e-mail from Google:
Googlebot identified a significant increase in the number of URLs on https://blog.atj.me/ that return a 404 (not found) error. This can be a sign of an outage or misconfiguration, which would be a bad user experience. This will result in Google dropping those URLs from the search results. If these URLs don’t exist at all, no action is necessary.
The search for the culprit led to a surprising realization.
Googlebot apparently does not ignore markup in AMP <template> elements.
The Problem
If a page contains an AMP <template> element that renders a link using template data, Google Webmaster Tools will report 404 errors as Googlebot tries to follow the “broken” link.
Example:
<template type="amp-mustache">
<a href="{{url}}">{{text}}</a>
</template>
If the above content were served from https://example.com, Googlebot would try to crawl https://example.com/%7B%7Burl%7D%7D because it URL-encodes the literal value of “{{url}}” and treats it like a relative URL.
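To make the mechanics concrete, here is a minimal Python sketch of how an unrendered placeholder can end up as a crawlable URL. It only imitates the behavior described above using the standard library. It is not Googlebot's actual logic, and the helper name resolve_like_a_crawler is made up for illustration.

```python
from urllib.parse import quote, urljoin


def resolve_like_a_crawler(page_url: str, href: str) -> str:
    """Hypothetical helper: percent-encode a raw href and resolve it against the page URL.

    This is an assumption about what a naive crawler does, not Googlebot's real code.
    """
    # "{" and "}" are not valid in a URL path, so they become %7B and %7D.
    encoded = quote(href)
    # An href without a scheme or leading slash is resolved relative to the page.
    return urljoin(page_url, encoded)


print(resolve_like_a_crawler("https://example.com", "{{url}}"))
# https://example.com/%7B%7Burl%7D%7D
```

Since no page exists at that path, the request comes back as a 404.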
It seems a bit silly that the Google web crawler isn’t better equipped to handle a page using Google AMP, but luckily there’s a simple fix.
The Solution
Add rel="nofollow" to the <a> tag like so:
<a href="{{url}}" rel="nofollow">{{text}}</a>
This has the side effect of adding the rel attribute to the actual element that is rendered and inserted into the DOM, but this was not a deal-breaker for my purposes.
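If a site has many templates, it may be worth checking them for links that still need the fix. The following is a rough sketch using Python's built-in html.parser. It assumes that any href containing "{{" inside a <template> element is a mustache placeholder, and the names TemplateLinkChecker and find_unprotected_links are hypothetical.

```python
from html.parser import HTMLParser


class TemplateLinkChecker(HTMLParser):
    """Collects placeholder hrefs inside <template> elements that lack rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.template_depth = 0  # greater than 0 while inside a <template>
        self.unprotected = []    # hrefs a crawler might try to follow

    def handle_starttag(self, tag, attrs):
        if tag == "template":
            self.template_depth += 1
        elif tag == "a" and self.template_depth > 0:
            attr_map = dict(attrs)
            href = attr_map.get("href") or ""
            rel_tokens = (attr_map.get("rel") or "").lower().split()
            # Assumption: "{{" in an href marks an unrendered mustache variable.
            if "{{" in href and "nofollow" not in rel_tokens:
                self.unprotected.append(href)

    def handle_endtag(self, tag):
        if tag == "template" and self.template_depth > 0:
            self.template_depth -= 1


def find_unprotected_links(html: str) -> list:
    """Return the placeholder hrefs in `html` that are missing rel="nofollow"."""
    checker = TemplateLinkChecker()
    checker.feed(html)
    checker.close()
    return checker.unprotected


page = '<template type="amp-mustache"><a href="{{url}}">{{text}}</a></template>'
print(find_unprotected_links(page))
```

For the original example this reports the "{{url}}" link, and once rel="nofollow" is added the list comes back empty.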