Marfeel classifies articles into new, recent, or evergreen content based on their publication date. Pages that don’t specify a publication date, like home or section pages, are classified as not editorial content.
The Marfeel editorial crawler uses an article’s last update date to decide whether a page should be recrawled and its metadata updated.
Marfeel extracts both the publication date and the last update date by trying the following strategies in order until one succeeds:
- JSON-LD (for more details, see datePublished - Schema.org Property and dateModified - Schema.org Property)
  <script type="application/ld+json"> { "@context": "https://schema.org", "@type": "NewsArticle", "datePublished": "2021-08-01T04:30:00Z", "dateModified": "2021-08-01T05:30:00Z" } </script>
- Meta item property type
  <meta itemprop="datePublished" content="2021-08-01T04:30:00Z" id="date"> <meta itemprop="dateModified" content="2021-08-01T05:30:00Z" id="date">
- Time item property type as datetime
  <time itemprop="datePublished" datetime="2021-08-01T09:00Z"></time> <time itemprop="dateModified" datetime="2021-08-01T05:30:00Z"></time>
- Time item property type as content
  <time itemprop="datePublished" content="2021-08-01T09:00Z"></time> <time itemprop="dateModified" content="2021-08-01T05:30:00Z"></time>
- Time item property type as node value
  <time itemprop="datePublished">2021-08-01T09:00Z</time> <time itemprop="dateModified">2021-08-01T09:00Z</time>
- Meta article type article:published_time and article:modified_time
  <meta property="article:published_time" content="2021-08-01T17:41:45+00:00" /> <meta property="article:modified_time" content="2021-08-01T17:41:45+00:00" />
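To check how a page would fare against this chain before publishing, the fallback order can be approximated locally. The following is a minimal sketch using only the Python standard library, not Marfeel's actual implementation. The names `DateCollector`, `extract_dates`, and the strategy labels are hypothetical. It assumes each date is resolved independently by the first strategy that provides it, and it only reads JSON-LD whose top level is a single object.

```python
import json
from html.parser import HTMLParser

# Hypothetical strategy labels, in the priority order listed above.
STRATEGIES = ["json_ld", "meta_itemprop", "time_datetime",
              "time_content", "time_text", "meta_article"]
FIELDS = ("datePublished", "dateModified")
ARTICLE_PROPS = {"article:published_time": "datePublished",
                 "article:modified_time": "dateModified"}


class DateCollector(HTMLParser):
    """Collects candidate dates, grouped by the strategy that found them."""

    def __init__(self):
        super().__init__()
        self.found = {name: {} for name in STRATEGIES}
        self._in_json_ld = False
        self._json_buf = []
        self._time_prop = None  # itemprop of the <time> element being read

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        prop = a.get("itemprop")
        if tag == "script" and a.get("type") == "application/ld+json":
            self._in_json_ld, self._json_buf = True, []
        elif tag == "meta" and prop in FIELDS and a.get("content"):
            self.found["meta_itemprop"].setdefault(prop, a["content"])
        elif tag == "meta" and a.get("property") in ARTICLE_PROPS and a.get("content"):
            field = ARTICLE_PROPS[a["property"]]
            self.found["meta_article"].setdefault(field, a["content"])
        elif tag == "time" and prop in FIELDS:
            if a.get("datetime"):
                self.found["time_datetime"].setdefault(prop, a["datetime"])
            elif a.get("content"):
                self.found["time_content"].setdefault(prop, a["content"])
            else:
                self._time_prop = prop  # fall back to the node's text

    def handle_data(self, data):
        if self._in_json_ld:
            self._json_buf.append(data)
        elif self._time_prop and data.strip():
            self.found["time_text"].setdefault(self._time_prop, data.strip())

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                doc = json.loads("".join(self._json_buf))
            except ValueError:
                return  # malformed JSON-LD: let later strategies answer
            if isinstance(doc, dict):
                for field in FIELDS:
                    if isinstance(doc.get(field), str):
                        self.found["json_ld"].setdefault(field, doc[field])
        elif tag == "time":
            self._time_prop = None


def extract_dates(html):
    """Return (published, modified); each is None if no strategy matched."""
    collector = DateCollector()
    collector.feed(html)
    collector.close()
    return tuple(
        next((collector.found[s][f] for s in STRATEGIES if f in collector.found[s]), None)
        for f in FIELDS
    )
```

For example, a page that carries both a `<time datetime>` element and an `article:published_time` meta tag resolves to the `<time>` value under this sketch, because that strategy appears earlier in the chain.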
Content Type
Marfeel automatically computes the Content Type attribute of a URL based on the publication date detected by the chain above.
- Evergreen: an article published more than 7 days ago.
- New: an article published within the last 48 hours.
- Recent: an article published between 2 and 7 days ago.
- Not Editorial: any page that doesn’t specify a publication date.
Based on the rules above, any home or section page that specifies a publication date will be incorrectly classified as editorial content rather than as not editorial.
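The thresholds can be expressed as a small function. This is an illustrative sketch rather than Marfeel's code. The function name `content_type` and its `now` parameter are hypothetical, dates without a time zone are assumed to be UTC, and the handling of exact boundaries (an article exactly 48 hours or exactly 7 days old) is a guess, since the rules above don't specify it.

```python
from datetime import datetime, timedelta, timezone


def content_type(published, now=None):
    """Classify a page by the age of its publication date (ISO 8601 string or None)."""
    if not published:
        return "Not Editorial"
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11.
    when = datetime.fromisoformat(published.replace("Z", "+00:00"))
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)  # assumption: naive dates are UTC
    now = now or datetime.now(timezone.utc)
    age = now - when
    if age <= timedelta(hours=48):
        return "New"
    if age <= timedelta(days=7):
        return "Recent"
    return "Evergreen"
```

Because the function only looks at the date, a home or section page that exposes a publication date is classified like any article, which is the misclassification described above.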