Editorial Metadata API endpoint

When Marfeel’s crawlers cannot fetch metadata because of paywalls, registration requirements, or sensitivity concerns, publishers can use the Editorial Metadata API instead.

The Editorial Metadata API lets you send article details directly to Marfeel. Invoke the endpoint with fresh metadata whenever an article is published or updated.

When possible, it may be simpler to whitelist the Marfeel crawlers so that they can crawl private content.

To use this API, contact your account manager to obtain the proper credentials.

Using the Editorial Metadata API is discouraged for public content; use on-page metadata there instead.

Request Parameters

The curl examples below send these parameters as fields of a JSON request body.

apikey Your API key, sent next to secret in the examples below.
secret The authentication secret provided by your account manager.
metadata An object containing the data to add or update for your article. Always include the canonical URL of the article you want to update.

The metadata object

The metadata parameter should be a JSON object with the following fields:

canonical_url Canonical URL of the article.
urls Additional URLs to group under the main canonical URL.
page_type Set to post.
title Page title.
image_url URL of the article’s main image.
pub_date_tmsp Publication date, formatted as an ISO 8601 string with a UTC offset, for example 2024-05-30T07:58:00+02:00.
section Section of the page. To set multiple sections, pass an array.
authors Author of the page. To set multiple authors, pass an array.
tags The list of tags associated with the article. Use the group:tag notation for tag groups, for example TagGroup1:theTag.
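As a minimal sketch, the fields listed above could be assembled in Python as follows. The function name build_metadata and its arguments are hypothetical, chosen for illustration. The sketch assumes that a publication date converted to UTC is accepted, and it passes section through unchanged so that it can be either a string or an array.

```python
from datetime import datetime, timezone


def build_metadata(canonical_url, title, published_at, section=None,
                   authors=(), tags=(), alternate_urls=(), image_url=None):
    """Assemble a metadata object using the documented field names.

    Hypothetical helper: only the dictionary keys come from the schema.
    """
    if not canonical_url.startswith(("http://", "https://")):
        raise ValueError("canonical_url must be an absolute URL")
    if published_at.tzinfo is None:
        raise ValueError("published_at must be timezone-aware")

    metadata = {
        "canonical_url": canonical_url,
        "urls": list(alternate_urls),
        "page_type": "post",
        "title": title,
        # Convert to UTC and serialize as ISO 8601 (assumption: UTC is accepted).
        "pub_date_tmsp": published_at.astimezone(timezone.utc).isoformat(),
        "authors": list(authors),
        "tags": list(tags),
    }
    # Optional fields are only included when a value was supplied.
    if section is not None:
        metadata["section"] = section
    if image_url is not None:
        metadata["image_url"] = image_url
    return metadata
```

Rejecting naive datetimes up front avoids silently publishing a date in the wrong timezone.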

CURL Example 1

Submit the metadata for the article https://www.yourdomain.com/your-canonical-url/:

curl --location 'https://metadata.newsroom.bi/metadata/posts' \
--header 'Content-Type: application/json' \
--data '{
    "apikey": "YourAPIKey",
    "secret": "YourSecret",
    "metadata": {
        "pub_date_tmsp": "2024-05-30T07:58:00+02:00",
        "title": "The title that you want",
        "tags": [
            "TagGroup1:theTag"
        ],
        "authors": [
            "Author one"
        ],
        "canonical_url": "https://www.yourdomain.com/your-canonical-url/",
        "urls": [],
        "page_type": "post",
        "section": "The section"
    }
}'
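The same request can be sketched in Python with only the standard library. The endpoint URL, the Content-Type header, and the apikey, secret, and metadata body fields are taken from the curl example; the function names build_request and submit are hypothetical. The response format is not described here, so this sketch only returns the HTTP status code.

```python
import json
import urllib.request

# Endpoint taken from the curl example.
ENDPOINT = "https://metadata.newsroom.bi/metadata/posts"


def build_request(api_key, secret, metadata):
    """Build a POST request carrying the same JSON body as the curl example."""
    body = json.dumps(
        {"apikey": api_key, "secret": secret, "metadata": metadata}
    ).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(request, timeout=10):
    """Send the request and return the HTTP status code.

    urllib raises urllib.error.HTTPError for 4xx and 5xx responses.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping request construction separate from sending makes it possible to inspect or log the payload before any network call is made.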

CURL Example 2

When your native SDKs cannot track the canonical URL, you can submit the alternate URL so that the article is canonicalized correctly.

Suppose you want to track a page view on distribution.platform.com/path/article, whose canonical URL is publisher.com/a/different/path/to/article.

When distribution.platform.com/path/article is published, invoke the Editorial Metadata endpoint with both the canonical URL and the platform-specific alternate URL. After this, non-canonical URLs tracked from native SDKs are canonicalized immediately.

curl --location 'https://metadata.newsroom.bi/metadata/posts' \
--header 'Content-Type: application/json' \
--data '{
    "apikey": "YourAPIKey",
    "secret": "YourSecret",
    "metadata": {
        "canonical_url": "https://www.nacion.com/el-pais/salud/empresa-que-opera-con-dos-empleados-y-desde-una/LKASIXEVIZBVRBBP6CX4QZFAQU/story/",
        "urls": [
            "https://lanaciondecostarica.pressreader.com/article/6880039408856498"
        ],
        "page_type": "post"
    }
}'
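A minimal sketch of building that grouping payload in Python is shown below. The function name grouping_metadata is hypothetical. Dropping duplicates and the canonical URL itself from urls is a tidy-up choice of this sketch, not a documented requirement of the API.

```python
def grouping_metadata(canonical_url, alternate_urls):
    """Build a minimal metadata object grouping alternate URLs under a canonical."""
    seen = set()
    urls = []
    for url in alternate_urls:
        # Skip duplicates and the canonical itself, keeping the original order.
        if url != canonical_url and url not in seen:
            seen.add(url)
            urls.append(url)
    return {"canonical_url": canonical_url, "urls": urls, "page_type": "post"}
```

The returned dictionary has the same shape as the metadata object in the second curl example and can be placed under the metadata key of the request body.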

Technical details

Keep the following in mind:

  1. Metadata submitted through the API takes precedence over crawled metadata: API-supplied fields override the corresponding crawled fields.
  2. If Marfeel crawls the page again after the endpoint has been invoked, the API-supplied fields still override the newly crawled metadata.
  3. When a metadata field has been supplied through the metadata API, the crawler inspector shows it.
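The precedence rule in points 1 and 2 can be pictured as a dictionary merge. This is a conceptual model only, not Marfeel's implementation, and the function name effective_metadata is hypothetical. How empty or null API values are treated is not stated here, so the sketch simply lets any supplied key win.

```python
def effective_metadata(crawled, api_supplied):
    """Merge two metadata dicts, letting API-supplied fields win.

    Illustrative model of the documented precedence rule.
    """
    merged = dict(crawled)       # start from what the crawler found
    merged.update(api_supplied)  # API fields override matching crawled fields
    return merged


# A later crawl changes the title, but the API-supplied title still wins.
api_fields = {"title": "The title that you want"}
first_crawl = {"title": "Crawled title", "section": "News"}
second_crawl = {"title": "Recrawled title", "section": "News"}

after_first = effective_metadata(first_crawl, api_fields)
after_second = effective_metadata(second_crawl, api_fields)
```

In this model, fields that were never sent through the API, such as section here, continue to come from the crawler.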