Search engine indexing audits reference

xavi.marti · October 3, 2022, 7:17am

These audits detect errors or missing features in article HTML that directly affect indexing and ranking in search engine results pages. Each audit below includes its code identifier and a description of the issue it flags.

`canonical` rel link does not match url

code: canonicalMetaMatchesURL

A canonical URL is the URL that Google considers most representative from a set of duplicate pages on your site. When the canonical link points to a URL other than the page's own, it tells Google that the current page is not the original and that the other URL represents the primary content. The canonical URL may legitimately differ from the page URL, but double-check that it is implemented correctly.
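As a rough illustration of the comparison this audit describes, here is a minimal Python sketch using only the standard library. It is not the audit's actual implementation: the function name `canonical_matches_url`, the `example.com` URLs, and the trailing-slash normalisation are all assumptions made for the example.

```python
from html.parser import HTMLParser


class CanonicalCollector(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel:
            self.canonicals.append(attr.get("href") or "")


def canonical_matches_url(html, page_url):
    """Return True when the first canonical href equals the page URL."""
    parser = CanonicalCollector()
    parser.feed(html)
    if not parser.canonicals:
        return False  # no canonical link at all, so nothing to compare
    # Assumed normalisation: only a trailing slash is ignored.
    return parser.canonicals[0].rstrip("/") == page_url.rstrip("/")


page = '<html><head><link rel="canonical" href="https://example.com/news/story"></head></html>'
print(canonical_matches_url(page, "https://example.com/news/story"))         # True
print(canonical_matches_url(page, "https://example.com/news/story?page=2"))  # False
```

The second call shows the legitimate case mentioned above: a paginated or parameterised URL whose canonical points at the primary version.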

`canonical` rel link not found

code: canonicalMetaHasURL

A canonical tag tells search engines like Google that the specified URL represents the original copy of a page. Using the canonical tag prevents problems caused by identical or duplicate content appearing on multiple URLs.

The general recommendation is that any URL containing original content should have a canonical tag pointing to itself.

`http-equiv="Content-Type"` tag is not `"text/html; charset=utf-8"`

code: httpEquivEncodingIsValid

This HTML tag defines the page’s content type and character set. Search engines like Google use it to understand the encoding of the page. Surround the value of the content attribute with quotes; otherwise the charset parameter may be interpreted incorrectly.

We recommend using Unicode/UTF-8 where possible.

`<description>` not found

code: description_exists

`<title>` has more than 60 characters

code: title_elementContentLength

The HTML <title> tag is one of the most important factors that search engines such as Google take into consideration when ranking content in their search results.

As a general rule, avoid titles that are too long or too short.

The system has detected that some of your titles are longer than 60 characters. This is not necessarily a problem, but shorter titles tend to perform better in search results.
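A length check of this kind could be sketched as follows. This is an illustrative example, not the audit's own code: `title_length`, `title_too_long`, and the `MAX_TITLE_LENGTH` constant are hypothetical names, and the sketch assumes the limit is applied to the title text after trimming surrounding whitespace.

```python
from html.parser import HTMLParser

MAX_TITLE_LENGTH = 60  # threshold named in the audit title


class TitleCollector(HTMLParser):
    """Accumulates the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.seen = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.seen:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title" and self.in_title:
            self.in_title = False
            self.seen = True

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def title_length(html):
    """Return the number of characters in the trimmed title (0 if missing)."""
    parser = TitleCollector()
    parser.feed(html)
    parser.close()
    return len(parser.title.strip())


def title_too_long(html):
    return title_length(html) > MAX_TITLE_LENGTH


short_page = "<html><head><title>Election results</title></head></html>"
long_page = "<html><head><title>" + "x" * 61 + "</title></head></html>"
print(title_length(short_page), title_too_long(short_page))
print(title_length(long_page), title_too_long(long_page))
```

A length of 0 from the same helper would correspond to a missing or empty title.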

`<title>` not found

code: title_exists

`<title>` tag does not exist or is empty

code: title_elementExistsAndHasContent

The HTML <title> tag is one of the most important factors that search engines such as Google take into consideration when ranking content in their search results.

The system has detected that some of your pages do not have this tag. Make sure to include it on all your pages, especially those that have editorial value or are strategically targeted for search engine optimisation campaigns.

Affiliate or paid link without proper markup

code: affiliate_link_no_rel

The page contains one or more links that look like affiliate or paid links and do not carry a rel="nofollow" or rel="sponsored" attribute.

This alert covers links to known affiliate platforms, links with UTM parameters, and links with a parameter whose name contains "aff".
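The parameter-based part of this heuristic can be sketched in Python as below. The list of known affiliate platforms is not given in this reference, so the sketch leaves it out; `looks_paid`, `unmarked_paid_links`, and the `shop.example` URLs are hypothetical names chosen for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlparse

ALLOWED_REL = {"nofollow", "sponsored"}


def looks_paid(href):
    """Assumed heuristic: a UTM parameter, or a parameter name containing 'aff'."""
    query = parse_qsl(urlparse(href).query, keep_blank_values=True)
    return any(
        name.lower().startswith("utm_") or "aff" in name.lower()
        for name, _ in query
    )


class PaidLinkAuditor(HTMLParser):
    """Records paid-looking links that lack rel=nofollow or rel=sponsored."""

    def __init__(self):
        super().__init__()
        self.flagged = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href") or ""
        rel = set((attr.get("rel") or "").lower().split())
        if looks_paid(href) and not (rel & ALLOWED_REL):
            self.flagged.append(href)


def unmarked_paid_links(html):
    auditor = PaidLinkAuditor()
    auditor.feed(html)
    return auditor.flagged


sample = (
    '<a href="https://shop.example/item?aff_id=42">Buy</a>'
    '<a href="https://shop.example/item?aff_id=42" rel="sponsored">Buy</a>'
    '<a href="https://example.org/about">About</a>'
)
print(unmarked_paid_links(sample))  # ['https://shop.example/item?aff_id=42']
```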

`<charset>` tag is not `"text/html; charset=utf-8"`

code: charsetEncodingIsValid

The charset declaration, written as a charset attribute on a <meta> tag, defines the page’s character set. Search engines like Google use it to better understand the encoding of the page. Make sure that you surround the attribute value with quotes, otherwise the charset may be interpreted incorrectly.

We recommend using Unicode/UTF-8 where possible.

Different HTTP protocols found in page links

code: protocol_inconsistencies

Links with both HTTP and HTTPS protocols pointing to the page’s domain have been found in its HTML. This suggests the domain runs on HTTPS but still has links using HTTP protocol. Mixed protocols can confuse search engines and cause the site to be considered insecure.
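One way to picture this check is to collect the URL schemes of all links that point at the page's own domain and see whether more than one appears. The sketch below assumes only `<a href>` links are inspected and uses hypothetical names (`protocols_for_domain`, `example.com`); the real audit may look at other elements as well.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def protocols_for_domain(html, domain):
    """Return the set of schemes used by links pointing at `domain`."""
    collector = LinkCollector()
    collector.feed(html)
    schemes = set()
    for href in collector.hrefs:
        parts = urlparse(href)
        if parts.hostname == domain and parts.scheme in ("http", "https"):
            schemes.add(parts.scheme)
    return schemes


sample = (
    '<a href="http://example.com/old">Old</a>'
    '<a href="https://example.com/new">New</a>'
    '<a href="http://other.example/page">Elsewhere</a>'
)
schemes = protocols_for_domain(sample, "example.com")
print(len(schemes) > 1)  # True: both http and https links point at example.com
```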

Followed external link

code: external_link_no_rel

The page contains one or more followed external links. This is not a problem by itself, but double-check that those links are not paid links, affiliate links, or part of a link exchange or link scheme.

Found HTTP canonical link

code: http_canonical

Search engines use canonical links to index content optimally. On sites running on HTTPS, declaring canonical as HTTP may confuse search engines and negatively impact indexing and ranking.

Invalid elements in head

code: head_contains_valid_elements_only

Google stops reading the head of a page when an unexpected tag is found inside it. Only valid tags should exist inside the head element.

meta name=`description` tag does not exist or is empty.

code: description_metaHasContent

Google and other search engines use the <meta name="description"> tag to generate search results snippets, mainly when they consider it a more accurate description than the on-page content. A meta description tag should inform users with a short, relevant summary of what the page is about.

The system has noticed that some of your pages do not have this tag. Adding a meta description is recommended.

`<meta name="googlebot">` tag contains a wrong directive

code: googlebot_metaHasAllowedDirective

The <meta name="googlebot"> tag gives Google's crawler instructions on how to index a page and show it in search results.

The system has detected directives in this tag that could be preventing search engines from indexing the content of the specified URLs. This is not necessarily an issue, but we recommend that you double-check the directive values.

meta name=`robots` tag contains a wrong directive

code: robots_metaHasAllowedDirective

The <meta name="robots"> tag gives search engines like Google instructions on how to index a page and show it in search results.

The system has detected directives in this tag that could be preventing search engines from indexing the content of the specified URLs. This is not necessarily an issue, but we recommend that you double-check the directive values.
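The exact list of directives these two audits treat as wrong is not stated here. As an assumption for illustration, the sketch below flags only `noindex` and `none` (which stands for `noindex, nofollow`), since those are the directives that keep a page out of the index. The names `blocking_directives` and `BLOCKING` are hypothetical.

```python
from html.parser import HTMLParser

BLOCKING = {"noindex", "none"}  # assumed set of indexing-blocking directives


class RobotsCollector(HTMLParser):
    """Collects directives from robots and googlebot meta tags."""

    def __init__(self):
        super().__init__()
        self.directives = {}  # meta name -> set of directives

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("robots", "googlebot"):
            content = attr.get("content") or ""
            values = {d.strip().lower() for d in content.split(",")}
            self.directives.setdefault(name, set()).update(v for v in values if v)


def blocking_directives(html):
    """Map each meta name to the blocking directives found in it."""
    collector = RobotsCollector()
    collector.feed(html)
    found = {}
    for name, values in collector.directives.items():
        blocked = sorted(values & BLOCKING)
        if blocked:
            found[name] = blocked
    return found


sample = (
    '<meta name="robots" content="noindex, follow">'
    '<meta name="googlebot" content="max-snippet:50">'
)
print(blocking_directives(sample))  # {'robots': ['noindex']}
```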

Missing `<link>` tag with `rel="alternate"` element for mobile version

code: linkRelAlternativeHasMobileUrl

An alternate version for mobile was found, but the desktop version does not declare it using a <link> tag with a rel="alternate" attribute.

Missing language attribute

code: lang_exists

For content metrics to be extracted properly, the language of the text must be declared in the lang attribute of the <html> tag.
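A minimal sketch of reading that attribute, assuming only the first `<html>` tag matters; `page_language` is a hypothetical helper name, not part of the audit system.

```python
from html.parser import HTMLParser


class LangCollector(HTMLParser):
    """Reads the lang attribute of the <html> tag, if any."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.seen_html = False

    def handle_starttag(self, tag, attrs):
        if tag == "html" and not self.seen_html:
            self.seen_html = True
            self.lang = (dict(attrs).get("lang") or "").strip() or None


def page_language(html):
    """Return the declared language, or None when it is missing or empty."""
    collector = LangCollector()
    collector.feed(html)
    return collector.lang


print(page_language('<html lang="en"><head></head></html>'))  # en
print(page_language("<html><head></head></html>"))            # None
```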

Multiple `rel="amphtml"` links found

code: amphtml_isUnique

The <link rel="amphtml"> tag specifies the URL of the AMP version of a page. This tag must be unique per page, with only one amphtml link specified.

The system has found that there are multiple <link rel="amphtml"> tags within some of your URLs.

Multiple `rel="canonical"` links found

code: canonical_isUnique

A canonical tag tells search engines like Google that a specific URL represents the original copy of a page. Using the canonical tag prevents problems caused by identical or duplicate content appearing on multiple URLs.

The system has found that there are multiple <link rel="canonical"> tags within some of your URLs. Only one canonical tag should exist per page.
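Both uniqueness audits (for `rel="amphtml"` and `rel="canonical"`) come down to counting `<link>` tags per `rel` value. The following is a sketch under that assumption, with hypothetical names (`duplicated_rels`, `RelCounter`) and placeholder URLs.

```python
from collections import Counter
from html.parser import HTMLParser


class RelCounter(HTMLParser):
    """Counts <link> tags by each rel token they carry."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        if tag == "link":
            for rel in (dict(attrs).get("rel") or "").lower().split():
                self.counts[rel] += 1


def duplicated_rels(html, rels=("canonical", "amphtml")):
    """Return the rel values from `rels` that appear more than once."""
    counter = RelCounter()
    counter.feed(html)
    return [rel for rel in rels if counter.counts[rel] > 1]


sample = (
    '<link rel="canonical" href="https://example.com/a">'
    '<link rel="canonical" href="https://example.com/b">'
    '<link rel="amphtml" href="https://example.com/a/amp">'
)
print(duplicated_rels(sample))  # ['canonical']
```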

No `charset` or `http-equiv` tag found

code: encodingTagExists

Although not required, this tag provides additional information about the encoding of an article.

The system has not detected it. Specifying a charset or http-equiv tag is recommended.
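To illustrate what counts as an encoding declaration, the sketch below looks for either form: a `<meta>` tag with a `charset` attribute, or a `<meta http-equiv="Content-Type">` tag whose `content` value includes a `charset=` parameter. The helper name `declared_encodings` is hypothetical, and the parsing of the `content` value is deliberately simple.

```python
from html.parser import HTMLParser


class EncodingCollector(HTMLParser):
    """Collects encodings declared through <meta> tags."""

    def __init__(self):
        super().__init__()
        self.declared = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if attr.get("charset"):
            self.declared.append(attr["charset"].strip().lower())
        elif (attr.get("http-equiv") or "").lower() == "content-type":
            content = (attr.get("content") or "").lower()
            if "charset=" in content:
                # Keep only what follows "charset=", trimmed of spaces and semicolons.
                self.declared.append(content.split("charset=", 1)[1].strip(" ;"))


def declared_encodings(html):
    collector = EncodingCollector()
    collector.feed(html)
    return collector.declared


print(declared_encodings('<meta charset="UTF-8">'))
print(declared_encodings('<meta http-equiv="Content-Type" content="text/html; charset=utf-8">'))
print(declared_encodings("<head></head>"))
```

An empty list corresponds to the situation this audit reports; any value other than `utf-8` would be relevant to the two encoding audits described earlier.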

No image in `image` with width 1200px or higher found

code: image_contains_big_image

Google recommends that the image property of the Article structured data type reference an image that is at least 1200px wide. Meeting this requirement increases the CTR of articles within Google Discover by 3%.

The system has detected articles where this criterion is not met.

Restricted referrer

code: referrer_isNotRestricted

This audit flags pages where the presence of <meta name="referrer" content="origin"/> restricts access to the previously visited page, which the front-end code requires for accurate navigation tracking.

This meta tag forces the browser to send only the site's origin as the referrer, so every visit appears to come from the home page. To fix this, change the referrer policy in the meta tag from “origin” to “strict-origin-when-cross-origin”. With that policy the browser sends the full URL of the previous page for navigation within the same site, which allows user navigation to be tracked completely.
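A check for the restricted policy could look roughly like this. It is a sketch that assumes only the exact value `origin` is treated as restricted, as described above; `referrer_is_restricted` is a hypothetical name.

```python
from html.parser import HTMLParser


class ReferrerCollector(HTMLParser):
    """Collects the content of every <meta name="referrer"> tag."""

    def __init__(self):
        super().__init__()
        self.policies = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "referrer":
            self.policies.append((attr.get("content") or "").strip().lower())


def referrer_is_restricted(html):
    """Return True when a referrer meta tag sets the policy to 'origin'."""
    collector = ReferrerCollector()
    collector.feed(html)
    return "origin" in collector.policies


flagged = '<head><meta name="referrer" content="origin"/></head>'
fixed = '<head><meta name="referrer" content="strict-origin-when-cross-origin"/></head>'
print(referrer_is_restricted(flagged))  # True
print(referrer_is_restricted(fixed))    # False
```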

What do indexing audits check for?

Indexing audits check for errors or missing features in article HTML that may impact search engine indexing and ranking. This includes canonical tag issues, missing or oversized title tags, empty meta descriptions, incorrect robots directives, charset encoding problems, and improper link attributes.

Why does the canonical rel link need to match the page URL?

A canonical URL tells Google which page is the most representative from a set of duplicate pages. If the canonical tag does not match the actual URL, Google may index the wrong version or ignore the page entirely, reducing search visibility.

What happens when the robots or googlebot meta tag contains a wrong directive?

Incorrect directives in the robots or googlebot meta tags can prevent search engines from indexing page content. This means affected URLs may not appear in search results at all, directly reducing organic traffic.
