How does Marfeel detect the publication and last update dates of an article?

Marfeel classifies articles into new, recent, or evergreen content based on their publication date. Pages that don’t specify a publication date, like home or section pages, are classified as not editorial content.

The Marfeel editorial crawler uses an article’s last update date to decide whether a page should be recrawled and its metadata updated.

Marfeel extracts both the publication date and the last update date by trying the following strategies in order until one succeeds:

  1. JSON-LD (for more details, see the schema.org datePublished and dateModified properties)

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "NewsArticle",
      "datePublished": "2021-08-01T04:30:00Z",
      "dateModified": "2021-08-01T05:30:00Z"
    }
    </script>
  2. Meta item property type

    <meta itemprop="datePublished" content="2021-08-01T04:30:00Z">
    <meta itemprop="dateModified" content="2021-08-01T05:30:00Z">
  3. Time item property type as datetime

    <time itemprop="datePublished" datetime="2021-08-01T04:30:00Z"></time>
    <time itemprop="dateModified" datetime="2021-08-01T05:30:00Z"></time>
  4. Time item property type as content

    <time itemprop="datePublished" content="2021-08-01T04:30:00Z"></time>
    <time itemprop="dateModified" content="2021-08-01T05:30:00Z"></time>
  5. Time item property type as node value

    <time itemprop="datePublished">2021-08-01T09:00Z</time>
    <time itemprop="dateModified">2021-08-01T09:00Z</time>
  6. Meta article type article:published_time and article:modified_time

    <meta property="article:published_time" content="2021-08-01T17:41:45+00:00" />
    <meta property="article:modified_time" content="2021-08-01T17:41:45+00:00" />
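
The fallback chain above can be sketched as follows. This is only an illustration under stated assumptions, not Marfeel's actual crawler code: the names `DateCollector` and `extract_dates` are hypothetical, the chain is assumed to be resolved independently for each of the two dates, and only a top-level JSON-LD object is inspected.

```python
import json
from html.parser import HTMLParser

# Hypothetical names for illustration; not Marfeel's implementation.
FIELDS = ("datePublished", "dateModified")
OG_PROPERTIES = {
    "article:published_time": "datePublished",
    "article:modified_time": "dateModified",
}


class DateCollector(HTMLParser):
    """Collects candidate dates, one dict per strategy, in the listed order."""

    def __init__(self):
        super().__init__()
        self.found = [dict() for _ in range(6)]  # index 0 = strategy 1, etc.
        self._in_json_ld = False
        self._open_time_field = None  # itemprop of the <time> being read

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        prop = a.get("itemprop")
        if tag == "script" and a.get("type") == "application/ld+json":
            self._in_json_ld = True
        elif tag == "meta" and prop in FIELDS and a.get("content"):
            self.found[1].setdefault(prop, a["content"])
        elif tag == "meta" and a.get("property") in OG_PROPERTIES and a.get("content"):
            self.found[5].setdefault(OG_PROPERTIES[a["property"]], a["content"])
        elif tag == "time" and prop in FIELDS:
            if a.get("datetime"):
                self.found[2].setdefault(prop, a["datetime"])
            if a.get("content"):
                self.found[3].setdefault(prop, a["content"])
            self._open_time_field = prop

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_json_ld = False
        elif tag == "time":
            self._open_time_field = None

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self._in_json_ld:
            try:
                doc = json.loads(text)
            except json.JSONDecodeError:
                return  # malformed JSON-LD: fall through to later strategies
            if isinstance(doc, dict):
                for field in FIELDS:
                    if isinstance(doc.get(field), str):
                        self.found[0].setdefault(field, doc[field])
        elif self._open_time_field:
            self.found[4].setdefault(self._open_time_field, text)


def extract_dates(html):
    """Return the first value found for each date, or None if no strategy matched."""
    collector = DateCollector()
    collector.feed(html)
    collector.close()
    return {
        field: next((s[field] for s in collector.found if field in s), None)
        for field in FIELDS
    }
```

Calling `extract_dates(page_html)` returns a dictionary with the keys `datePublished` and `dateModified`. A value of `None` means that none of the six strategies produced a date for that field.
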

More info: Failing to update the last update date can hurt positioning on Google News and the News Carousel, among other placements. Marfeel tries to reproduce how Googlebot most likely sees your site, and therefore does not update the title when the last update date is unchanged.

Content Type

Marfeel automatically computes the Content Type attribute of a URL from the publication date detected by the chain of strategies above.

  1. Evergreen: the article was published more than 7 days ago.
  2. New: the article was published within the last 48 hours.
  3. Recent: the article was published between 2 and 7 days ago.
  4. Not Editorial: the page doesn’t specify a publication date.

Based on the rules above, any home or section page that specifies a publication date will incorrectly be classified as editorial content instead of not editorial content.
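
The Content Type rules can be sketched as a small function. This is a minimal sketch under assumptions, not Marfeel's implementation: `parse_timestamp` and `content_type` are hypothetical names, timestamps without an offset are assumed to be UTC, and the exact boundaries (48 hours and 7 days counted as inclusive) are a guess.

```python
from datetime import datetime, timedelta, timezone


def parse_timestamp(value):
    """Parse an ISO 8601 timestamp such as 2021-08-01T04:30:00Z."""
    # Older Python versions reject a trailing "Z" in fromisoformat.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return parsed


def content_type(published, now):
    """Classify a page from its detected publication date (or None)."""
    if published is None:
        return "Not Editorial"
    age = now - parse_timestamp(published)
    if age <= timedelta(hours=48):
        return "New"
    if age <= timedelta(days=7):
        return "Recent"
    return "Evergreen"
```

Because the only input is the publication date, a home or section page that exposes one is classified like an article, which is the misclassification described above.
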