When Marfeel’s crawlers cannot fetch metadata because of paywalls, registration requirements, or sensitivity concerns, publishers can use the Editorial Metadata API.
The Editorial Metadata API lets you submit article details directly. Whenever an article is published or updated, invoke the endpoint with fresh metadata.
When possible, it is usually easier to whitelist the Marfeel crawlers so that they can crawl private content.
Using the Editorial Metadata API is discouraged for public content, where on-page metadata should be used instead.
Query Parameters
secret | The authentication secret provided by your account manager. |
metadata | A JSON object containing the data to add or update for your article. Be sure to include the canonical URL of the article you wish to alter. |
The metadata object
The metadata parameter should be a JSON object with the following schema:
canonical_url | Canonical URL of the article. |
urls | Additional URLs to group under the main canonical URL, if any. |
page_type | post |
title | Page title. |
image_url | URL of the article’s main image. |
pub_date_tmsp | Publication date, formatted as an ISO 8601 UTC timezone string. |
section | Section of the page. To add multiple sections, pass an array. |
authors | Author of the page. To add multiple authors, pass an array. |
tags | The list of tags associated with the article. Use the group:tag notation for tag groups, for example TagGroup1:theTag. |
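The schema above can also be assembled programmatically. The following is a minimal sketch in Python, assuming a hypothetical helper named `build_metadata` (it is not part of any Marfeel SDK). It uses only the field names listed in the table, requires a canonical URL, and normalizes the publication date to a UTC ISO 8601 string.

```python
from datetime import datetime, timezone


def build_metadata(canonical_url, title=None, pub_date=None, section=None,
                   authors=None, tags=None, urls=None, image_url=None):
    """Assemble a metadata object using the field names from the schema.

    Hypothetical helper for illustration only; not part of a Marfeel SDK.
    """
    if not canonical_url:
        raise ValueError("canonical_url is required")

    metadata = {"canonical_url": canonical_url, "page_type": "post"}
    if urls:
        metadata["urls"] = list(urls)
    if title:
        metadata["title"] = title
    if image_url:
        metadata["image_url"] = image_url
    if pub_date is not None:
        if pub_date.tzinfo is None:
            raise ValueError("pub_date must be timezone-aware")
        # Convert to UTC, then emit an ISO 8601 string.
        metadata["pub_date_tmsp"] = pub_date.astimezone(timezone.utc).isoformat()
    if section:
        metadata["section"] = section  # a single string or a list of strings
    if authors:
        metadata["authors"] = list(authors)
    if tags:
        metadata["tags"] = list(tags)  # e.g. "TagGroup1:theTag"
    return metadata
```

Optional fields are left out when they are not provided, so a call with only `canonical_url` and `urls` yields a small object that carries just the URLs and the page type.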
CURL Example 1
Submit the metadata for the article https://www.yourdomain.com/your-canonical-url/
curl --location 'https://metadata.newsroom.bi/metadata/posts' \
--header 'Content-Type: application/json' \
--data '{
  "apikey": "YourAPIKey",
  "secret": "YourSecret",
  "metadata": {
    "pub_date_tmsp": "2024-05-30T07:58:00+02:00",
    "title": "The title that you want",
    "tags": [
      "TagGroup1:theTag"
    ],
    "authors": [
      "Author one"
    ],
    "canonical_url": "https://www.yourdomain.com/your-canonical-url/",
    "urls": [],
    "page_type": "post",
    "section": "The section"
  }
}'
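The same request can be issued from application code, for example from a publish or update hook in a CMS. The sketch below mirrors the curl call using only the Python standard library. The endpoint URL, the `Content-Type` header, and the `apikey`, `secret`, and `metadata` body keys are taken from the example above; the function names `build_request` and `submit` are hypothetical, and the response format is not described here, so the sketch only returns the HTTP status code.

```python
import json
import urllib.request

# Endpoint taken from the curl example.
ENDPOINT = "https://metadata.newsroom.bi/metadata/posts"


def build_request(api_key, secret, metadata):
    """Build the POST request without sending it (hypothetical helper)."""
    body = json.dumps(
        {"apikey": api_key, "secret": secret, "metadata": metadata}
    ).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(request, timeout=10):
    """Send the request and return the HTTP status code.

    urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

Keeping request construction separate from sending makes it possible to inspect or log the payload before anything goes over the network.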
CURL Example 2
When your Native SDKs cannot track the canonical URL, you can submit the alternate URL to guarantee proper canonicalization of the article.
Let’s assume you want to track a page view hit on distribution.platform.com/path/article, whose canonical is publisher.com/a/different/path/to/article.
When distribution.platform.com/path/article is published, invoke the Editorial Metadata endpoint with both the canonical URL and the platform-specific alternate URL. After this, non-canonical URLs tracked from Native SDKs are canonicalized immediately.
curl --location 'https://metadata.newsroom.bi/metadata/posts' \
--header 'Content-Type: application/json' \
--data '{
  "apikey": "YourAPIKey",
  "secret": "YourSecret",
  "metadata": {
    "canonical_url": "https://www.nacion.com/el-pais/salud/empresa-que-opera-con-dos-empleados-y-desde-una/LKASIXEVIZBVRBBP6CX4QZFAQU/story/",
    "urls": [
      "https://lanaciondecostarica.pressreader.com/article/6880039408856498"
    ],
    "page_type": "post"
  }
}'
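Conceptually, submitting `urls` alongside `canonical_url` registers a lookup from each alternate URL to its canonical. The sketch below models that idea with a plain dictionary and exact string matching. It illustrates the behaviour described in this example and is not Marfeel's implementation; the names `register_alternates` and `canonicalize` are hypothetical.

```python
def register_alternates(lookup, metadata):
    """Record every alternate URL in the metadata under its canonical URL."""
    canonical = metadata["canonical_url"]
    for alternate in metadata.get("urls", []):
        lookup[alternate] = canonical
    return lookup


def canonicalize(lookup, tracked_url):
    """Return the canonical URL for a tracked URL, or the URL itself if unknown."""
    return lookup.get(tracked_url, tracked_url)


lookup = register_alternates({}, {
    "canonical_url": "https://www.nacion.com/el-pais/salud/empresa-que-opera-con-dos-empleados-y-desde-una/LKASIXEVIZBVRBBP6CX4QZFAQU/story/",
    "urls": ["https://lanaciondecostarica.pressreader.com/article/6880039408856498"],
    "page_type": "post",
})
```

In this model, a page view tracked on the PressReader URL resolves to the nacion.com canonical, while a URL that was never submitted is returned unchanged.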
Technical details
Here’s a list of aspects to bear in mind:
- Metadata supplied via the API takes precedence over crawled metadata: API-supplied fields override crawled fields.
- If the Marfeel crawler crawls the page again after the endpoint has been invoked, fields supplied via the API still override the newly crawled metadata.
- When a metadata field has been supplied via the Metadata API, the crawler inspector will show it.
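The precedence rule in the list above can be pictured as a field-by-field merge in which API-supplied values always win. The following is a minimal conceptual sketch, assuming the override applies per field and that crawled values are kept for fields the API did not supply; `effective_metadata` is a hypothetical name, not a Marfeel function.

```python
def effective_metadata(crawled, api_supplied):
    """Merge two metadata dicts; API-supplied fields override crawled ones."""
    merged = dict(crawled)       # start from whatever the crawler found
    merged.update(api_supplied)  # API fields replace crawled fields of the same name
    return merged


api_fields = {"title": "Title sent via the API"}

first_crawl = {"title": "Crawled title", "section": "News"}
after_first_crawl = effective_metadata(first_crawl, api_fields)

# A later crawl does not displace the API-supplied title.
second_crawl = {"title": "Re-crawled title", "section": "Politics"}
after_second_crawl = effective_metadata(second_crawl, api_fields)
```

In both cases the title comes from the API, while the section follows whatever the most recent crawl found.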