For every article, Marfeel computes the metrics listed below and makes them available across the data warehouse. You can use these metrics as dimensions for qualitative analysis of content production in the Explore, Optimize or Content Opportunities reports.
Some of the content metrics are inspired by The State of Content Marketing report by Semrush, which presents data-driven findings for building winning content strategies.
Given a URL, Marfeel automatically extracts its content using Readability, which applies a set of heuristics to distinguish the content from UI elements.
Depending on the HTML markup, Readability might incorrectly keep or remove text from an article, causing inaccuracies in how the metrics are calculated. You can fine-tune Readability from Organization Settings > Editorial Crawler by whitelisting HTML elements via CSS selectors. Whitelisted elements won’t be removed when processing articles.
Use Editorial Crawler to fine-tune which elements Marfeel detects as part of the article body.
The H1 length in number of words.
Presence or absence of H2, H3 and H4 headlines. The possible values are:
- No H2
- H2 only
- H2 + H3
- H2 + H3 + H4
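The four values above can be derived from the heading tags found in the article body. The following is a minimal sketch, not Marfeel's implementation: `headingStructure` is a hypothetical helper, and it assumes a heading level only counts when every level above it is also present (so an article with H3 but no H2 is reported as "No H2").

```php
<?php
// Hypothetical helper: maps an article's HTML to one of the four values above.
// Assumption: a level only counts when every level above it is also present.
function headingStructure(string $html): string
{
    libxml_use_internal_errors(true); // tolerate tags libxml does not know
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    // True when at least one element with the given tag name exists.
    $has = fn (string $tag): bool => $doc->getElementsByTagName($tag)->length > 0;

    if (!$has('h2')) {
        return 'No H2';
    }
    if (!$has('h3')) {
        return 'H2 only';
    }
    return $has('h4') ? 'H2 + H3 + H4' : 'H2 + H3';
}
```

For example, `headingStructure('<h2>Intro</h2><h3>Details</h3>')` returns `H2 + H3`.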
Number of images in the article.
Number of list items (<li> elements) for every 500 words of plain text.
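As a rough illustration of a "per 500 words" metric, the sketch below normalizes a raw count by the article's word count. The function name and the handling of empty articles are assumptions, not Marfeel's actual code.

```php
<?php
// Hypothetical helper: number of <li> elements per 500 words of plain text.
function listItemsPer500Words(int $listItems, int $wordCount): float
{
    if ($wordCount === 0) {
        return 0.0; // assumption: an empty article has no list density
    }
    return $listItems * 500 / $wordCount;
}
```

With this normalization, an article of 1,500 words containing 6 list items scores 2, the same as a 500-word article containing 2.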
Number of videos in the article.
Language of the article as specified in the lang attribute of the <html> tag, for example <html lang="en">. The language of the article is used to perform language-specific calculations, such as the number of syllables, and can also be used as a dimension to analyze sites with a multi-language strategy.
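Reading the declared language can be sketched with PHP's DOM extension. `articleLanguage` is a hypothetical helper, and reducing a value such as "en-US" to its primary subtag "en" is an assumption about how the value would be normalized.

```php
<?php
// Hypothetical helper: reads the lang attribute of the <html> element.
// Assumption: only the primary subtag matters ("en-US" becomes "en").
function articleLanguage(string $html): ?string
{
    libxml_use_internal_errors(true); // tolerate tags libxml does not know
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();

    $root = $doc->documentElement;
    $lang = $root !== null ? trim($root->getAttribute('lang')) : '';
    if ($lang === '') {
        return null; // no language declared
    }
    return strtolower(explode('-', $lang)[0]);
}
```

A `null` result corresponds to the case where no language is declared in the markup.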
Number of words
Word count of the article. Internally we use phpSyllable, which essentially splits the text on spaces and counts the parts.
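The splitting approach can be sketched in a few lines. This is an illustration only, not phpSyllable's code: `countWords` is a hypothetical helper, and splitting on any whitespace (not just spaces) is an assumption.

```php
<?php
// Minimal sketch of whitespace-based word counting; not phpSyllable's actual code.
function countWords(string $text): int
{
    // PREG_SPLIT_NO_EMPTY drops the empty parts produced by leading,
    // trailing or repeated whitespace.
    $parts = preg_split('/\s+/u', $text, -1, PREG_SPLIT_NO_EMPTY);
    return $parts === false ? 0 : count($parts);
}
```

Note that with this approach a hyphenated term such as "well-known" counts as a single word, and punctuation attached to a word does not change the count.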
Number of sentences
Number of sentences in the article.
Number of syllables
Number of syllables in the article. We use TeX hyphenation patterns for most languages.
In 1946, lawyer, author and writing consultant Rudolph Flesch published a readability formula in his dissertation, “Marks of a Readable Style.” That formula, the Flesch Reading Ease index, was the original Flesch test. The formula is:
206.835 – (1.015 x words per sentence) – (84.6 x syllables per word) = reading ease
Flesch’s work with the Associated Press helped bring the reading level of front-page newspaper stories down by five grade levels. Publishers increased readership by 40% to 60% with the formula.
Scores range from 0 to 100. The higher the score, the easier your message is to read.
Flesch Reading Ease
| Score | Level | Words/sentence | Syllables/word | Estimated school grade | % of adults who can read at this level |
|---|---|---|---|---|---|
| 90-100 | Very easy | 8 or fewer | 1.23 or fewer | 4th | 93 |
| 60-70 | Standard | 17 | 1.47 | 7th or 8th | 83 |
| 50-60 | Fairly hard | 21 | 1.55 | Some high school | 54 |
| 30-50 | Hard | 25 | 1.67 | High school or some college | 33 |
| 0-30 | Very hard | 29 or more | 1.92 or more | College | 4.5 |
Aim for 60 or higher. To increase your score, reduce the length of your sentences and words.
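As a quick check of the formula, take the figures from the "Standard" row of the table: 17 words per sentence and 1.47 syllables per word.

```latex
\begin{aligned}
\text{reading ease} &= 206.835 - (1.015 \times 17) - (84.6 \times 1.47) \\
                    &= 206.835 - 17.255 - 124.362 \\
                    &= 65.218
\end{aligned}
```

The result falls inside the 60-70 band, as expected. Shortening the average sentence lowers the second term and using shorter words lowers the third, which is why both changes raise the score.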
The Wikipedia definition is also worth reading.
The Flesch Index formula combines several content metrics with multipliers. These multipliers change depending on the language. If Marfeel can’t detect the language, it falls back to the English multipliers.
const FLESCH_SCORE_MULTIPLIERS = array(
    // language => [constant, words-per-sentence multiplier, syllables-per-word multiplier]
    'de' => [180, 1, 58.5],
    'en' => [206.84, 1.015, 84.6],
    'es' => [206.835, 1.015, 60],
    'ca' => [206.835, 1.015, 60],
    'fr' => [207, 1.015, 73.6],
    'it' => [217, 1.3, 60],
    'nl' => [206.84, 0.77, 93],
    'pt' => [248.835, 1.015, 84.8],
    'ru' => [206.835, 1.3, 60.1],
);

[$a, $b, $c] = FleschScoreMultipliers::get($language);
$fleschIndex = $a - $b * $wordsPerSentence - $c * $syllablesPerWord;
There are several factors that could affect the Flesch Index value compared to other platforms:
- Other platforms might use different language multipliers in the formula. They are not standardized.
- The text treated as the article body might differ. Make sure Marfeel properly detects the body of the articles.
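To see how much the multipliers alone can move the score, the sketch below applies the English and Spanish multipliers listed above to the same text statistics. `fleschIndex` is a hypothetical helper written for this illustration. Only the multiplier values and the English fallback come from the description above.

```php
<?php
// Multipliers copied from the list above:
// [constant, words-per-sentence multiplier, syllables-per-word multiplier].
const FLESCH_SCORE_MULTIPLIERS = [
    'en' => [206.84, 1.015, 84.6],
    'es' => [206.835, 1.015, 60],
];

// Hypothetical helper; unknown languages fall back to the English multipliers.
function fleschIndex(string $language, float $wordsPerSentence, float $syllablesPerWord): float
{
    [$a, $b, $c] = FLESCH_SCORE_MULTIPLIERS[$language] ?? FLESCH_SCORE_MULTIPLIERS['en'];
    return $a - $b * $wordsPerSentence - $c * $syllablesPerWord;
}

// Same text statistics, different language multipliers.
$english = fleschIndex('en', 17, 1.47);
$spanish = fleschIndex('es', 17, 1.47);
printf("en: %.2f, es: %.2f\n", $english, $spanish); // prints "en: 65.22, es: 101.38"
```

With identical sentence and word lengths, the two sets of multipliers differ by about 36 points, so a tool that applies English multipliers to Spanish content will report a noticeably lower score.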
If the body of an article contains an affiliation link, Marfeel sets hasAffiliation=true based on these link detection rules.