Editorial Crawling Troubleshooting

Marfeel creates a visual representation of how Googlebot sees your site, based on its structured data. When a user visits a page and sends an event to Marfeel, the Editorial Crawler crawls the page, then detects, extracts, and audits the structured data and extra metadata of the canonical URL, such as the title, author, or section it belongs to.

Some extracted articles may lack editorial information such as the title or the author. Instead, they appear as a plain URL, such as https://domain.com/path/to/article.

Several situations can cause the Marfeel Editorial Crawler to fail:

  1. Web Application Firewall (WAF). The Marfeel Editorial Crawler follows good-citizen practices and throttles the number of concurrent requests per site, but a WAF might still block the Marfeel crawlers. Follow these steps to whitelist them.
  2. URL with a missing canonical, or without a title or an H1. Marfeel crawls all the information from the declared canonical URL. If that crawl fails, the editorial information won’t be reported correctly.
  3. Yoast in combination with the WP Rocket cache plugin in WordPress. Read more about some known issues with this setup.
  4. Detection of external sites. If you see domains that you don’t own, review your canonical strategy.
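
To check a page for causes 2 and 4 yourself, a small script can inspect its HTML for a canonical link, a title, and an H1. The following is a minimal sketch using only the Python standard library. It is not Marfeel's actual crawler logic: the names `EditorialCheck` and `audit_page` are hypothetical, and the exact-hostname comparison is a simplifying assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class EditorialCheck(HTMLParser):
    """Collects the canonical URL, the <title>, and the first <h1> of a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.title = ""
        self.h1 = ""
        self._current = None  # tag whose text is currently being captured

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel:
            self.canonical = attrs.get("href")
        elif tag == "title" and not self.title:
            self._current = "title"
        elif tag == "h1" and not self.h1:
            self._current = "h1"

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "h1":
            self.h1 += data


def audit_page(html, own_domain):
    """Return a list of problems found in the page (hypothetical helper)."""
    parser = EditorialCheck()
    parser.feed(html)
    parser.close()

    problems = []
    if not parser.canonical:
        problems.append("missing canonical")
    else:
        # Assumption: an exact hostname match means "your own site".
        # Relative canonicals have no hostname and are not flagged here.
        host = urlparse(parser.canonical).hostname
        if host and host != own_domain:
            problems.append("canonical points to an external domain")
    if not parser.title.strip():
        problems.append("missing title")
    if not parser.h1.strip():
        problems.append("missing h1")
    return problems
```

Calling `audit_page(html, "domain.com")` on the HTML your server returns gives an empty list when none of these problems are present. If your site serves content from several subdomains, loosen the hostname comparison accordingly.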