
This article illustrates the traditional readability formulas available for estimating a readability score. Natural language processing sometimes requires analyzing words and sentences to determine how complex a text is. Readability scores are, as a rule, grade levels on specific scales that rate a text according to its complexity. They help an author revise a text so that a wider audience can understand it, which makes the content more engaging.

Methods available for determining a readability score:

1) Dale-Chall formula
2) Gunning fog formula
3) McLaughlin's SMOG formula
4) FORECAST formula
5) Flesch scores

The implementation of the readability formulas is shown below.
Dale-Chall Formula

To apply the formula:

Select several 100-word samples throughout the text.
Calculate the average sentence length in words (divide the number of words by the number of sentences).
Calculate the percentage of words NOT on the Dale-Chall list of 3,000 easy words.
Compute the following equation:

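`Raw score = 0.1579 * (PDW) + 0.0496 * (ASL)`

If PDW is greater than 5%, the adjusted score is the raw score plus 3.6365; otherwise the adjusted score equals the raw score.

Here,

PDW = percentage of difficult words, that is, words not on the Dale-Chall word list.

ASL = average sentence length.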
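The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function name `dale_chall_score`, the regex-based tokenizer, and the tiny `EASY_WORDS` set are all assumptions made for the example (a real run would load the full Dale-Chall list of about 3,000 words). The sketch follows the common convention, also used in this article's implementation, of adding the 3.6365 adjustment only when more than 5% of the words are difficult.

```python
import re

# Hypothetical stand-in for the full Dale-Chall easy-word list (assumption).
EASY_WORDS = {"the", "cat", "sat", "on", "mat", "it", "was", "a", "day"}


def dale_chall_score(text, easy_words=EASY_WORDS):
    """Estimate a Dale-Chall score using naive regex tokenization."""
    # Naive sentence split on terminal punctuation (assumption).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text.lower())
    if not sentences or not words:
        return 0.0

    asl = len(words) / len(sentences)  # average sentence length
    # Percentage of words that are not on the easy-word list.
    pdw = 100.0 * sum(w not in easy_words for w in words) / len(words)

    score = 0.1579 * pdw + 0.0496 * asl
    if pdw > 5:
        score += 3.6365  # adjustment for texts with many difficult words
    return round(score, 2)
```

Passing a larger `easy_words` set changes only the PDW term; the sentence-length term is unaffected.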

Gunning Fog Formula

`Grade level = 0.4 * ((average sentence length) + (percentage of hard words))`

Here, hard words = words with more than two syllables.
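As a quick check of the arithmetic, the formula can be written as a small function that works from counts gathered elsewhere. The function name `gunning_fog_grade` and its parameters are hypothetical, chosen only for this sketch; how words, sentences, and hard words are counted is left to the caller.

```python
def gunning_fog_grade(total_words, total_sentences, hard_words):
    """Gunning fog grade from pre-computed counts (illustrative sketch)."""
    if total_words == 0 or total_sentences == 0:
        return 0.0
    average_sentence_length = total_words / total_sentences
    percent_hard_words = 100.0 * hard_words / total_words
    return 0.4 * (average_sentence_length + percent_hard_words)
```

For example, a 100-word passage with 5 sentences and 10 hard words has an average sentence length of 20 and 10% hard words, giving a grade of about 12.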

SMOG Formula

`SMOG grading = 3 + √(polysyllable count)`

Here, polysyllable count = number of words of more than two syllables in a sample of 30 sentences.
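The sketch below shows the simplified formula next to the scaled variant that is often used when the sample is not exactly 30 sentences long (the polysyllable count is normalized to 30 sentences, with constants 1.043 and 3.1291). Both function names are hypothetical and exist only for this illustration.

```python
import math


def smog_simplified(polysyllables_in_30_sentences):
    """Simplified SMOG grade for a 30-sentence sample."""
    return 3 + math.sqrt(polysyllables_in_30_sentences)


def smog_scaled(polysyllables, sentences):
    """SMOG grade with the polysyllable count scaled to 30 sentences."""
    if sentences == 0:
        return 0.0
    return 1.043 * math.sqrt(polysyllables * 30 / sentences) + 3.1291
```

For a 30-sentence sample with 25 polysyllabic words, the two variants give grades that differ by only a fraction of a grade level.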

Flesch Formula

`Reading Ease score = 206.835 - (1.015 × ASL) - (84.6 × ASW)`

Here,

ASL = average sentence length (number of words divided by number of sentences).

ASW = average word length in syllables (number of syllables divided by number of words).
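To make the two averages concrete, here is a self-contained sketch that needs no external packages. It is an approximation under stated assumptions: `estimate_syllables` is a crude heuristic invented for this example (one syllable per run of vowels), and sentences are split naively on terminal punctuation, so the result will differ from tools that use a proper syllable dictionary.

```python
import re


def estimate_syllables(word):
    """Crude syllable estimate: one syllable per run of vowels (assumption)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_reading_ease_sketch(text):
    """Flesch Reading Ease using naive tokenization and syllable counts."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0

    asl = len(words) / len(sentences)  # average sentence length
    asw = sum(estimate_syllables(w) for w in words) / len(words)  # syllables per word
    return 206.835 - 1.015 * asl - 84.6 * asw
```

Higher scores indicate easier text, so short sentences of one-syllable words score near the top of the scale.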

Advantages of readability formulas

1. Readability formulas estimate the grade level a reader needs in order to read a given text, which gives the author the information needed to reach the target audience.

2. They let you know in advance whether the target audience can understand your content.

3. They are easy to use.

4. Readable text attracts a larger audience.

Disadvantages of readability formulas

1. Because there are many readability formulas, the same text can receive widely varying results.

2. They apply mathematics to literature, which is not always a good idea.

3. They cannot measure the complexity of individual words or phrases, so they cannot pinpoint exactly where a text needs fixing.

```python
import spacy
# textstatistics, easy_word_set and legacy_round are module-level names in
# older textstat releases; newer releases may not export all of them.
from textstat.textstat import textstatistics, easy_word_set, legacy_round

# Load the English model once (spaCy 3 requires the full model name).
nlp = spacy.load('en_core_web_sm')


# Splits the text into sentences using spaCy's sentence segmentation,
# described at https://spacy.io/usage/spacy-101
def break_sentences(text):
    doc = nlp(text)
    return list(doc.sents)


# Returns the number of words (spaCy tokens) in the text
def word_count(text):
    sentences = break_sentences(text)
    words = 0
    for sentence in sentences:
        words += len([token for token in sentence])
    return words


# Returns the number of sentences in the text
def sentence_count(text):
    sentences = break_sentences(text)
    return len(sentences)


# Returns the average sentence length in words
def avg_sentence_length(text):
    words = word_count(text)
    sentences = sentence_count(text)
    average_sentence_length = float(words / sentences)
    return average_sentence_length


# Textstat is a Python package for calculating statistics from
# text to determine the readability, complexity and grade level
# of a particular corpus.
# The package can be found at https://pypi.python.org/pypi/textstat
def syllables_count(word):
    return textstatistics().syllable_count(word)


# Returns the average number of syllables per word in the text
def avg_syllables_per_word(text):
    syllable = syllables_count(text)
    words = word_count(text)
    ASPW = float(syllable) / float(words)
    return legacy_round(ASPW, 1)


# Returns the total number of distinct difficult words in the text
def difficult_words(text):
    # Find all words in the text
    words = []
    sentences = break_sentences(text)
    for sentence in sentences:
        words += [str(token) for token in sentence]

    # Difficult words are those with two or more syllables that are
    # not in easy_word_set, the list of common words provided by Textstat
    diff_words_set = set()

    for word in words:
        syllable_count = syllables_count(word)
        if word not in easy_word_set and syllable_count >= 2:
            diff_words_set.add(word)

    return len(diff_words_set)


# A word is polysyllabic if it has three or more syllables.
# This function returns the count of all such words in the text.
def poly_syllable_count(text):
    count = 0
    words = []
    sentences = break_sentences(text)
    for sentence in sentences:
        # Convert tokens to strings before counting syllables
        words += [str(token) for token in sentence]

    for word in words:
        syllable_count = syllables_count(word)
        if syllable_count >= 3:
            count += 1
    return count


def flesch_reading_ease(text):
    """
    Implements the Flesch formula:
    Reading Ease score = 206.835 - (1.015 × ASL) - (84.6 × ASW)
    Here,
      ASL = average sentence length (number of words
            divided by number of sentences)
      ASW = average word length in syllables (number of syllables
            divided by number of words)
    """
    FRE = 206.835 - float(1.015 * avg_sentence_length(text)) - \
        float(84.6 * avg_syllables_per_word(text))
    return legacy_round(FRE, 2)


def gunning_fog(text):
    # Percentage of difficult words, used here as a stand-in
    # for the percentage of hard words in the formula
    per_diff_words = difficult_words(text) / word_count(text) * 100
    grade = 0.4 * (avg_sentence_length(text) + per_diff_words)
    return grade


def smog_index(text):
    """
    Implements the SMOG formula / grading:
    SMOG grading = 3 + √(polysyllable count)
    Here,
      polysyllable count = number of words of more than
      two syllables in a sample of 30 sentences.
    The code uses the scaled form, which normalizes the
    polysyllable count to 30 sentences.
    """
    if sentence_count(text) >= 3:
        poly_syllab = poly_syllable_count(text)
        SMOG = (1.043 * (30 * (poly_syllab / sentence_count(text))) ** 0.5) \
            + 3.1291
        return legacy_round(SMOG, 1)
    else:
        return 0


def dale_chall_readability_score(text):
    """
    Implements the Dale-Chall formula:
    Raw score = 0.1579 * (PDW) + 0.0496 * (ASL)
    Adjusted score = Raw score + 3.6365 when PDW is greater than 5%
    Here,
      PDW = percentage of difficult words.
      ASL = average sentence length
    """
    words = word_count(text)
    if words == 0:
        return 0.0

    # Number of words not termed difficult
    count = words - difficult_words(text)

    # Percentage of words not on the difficult word list
    per = float(count) / float(words) * 100

    # diff_words stores the percentage of difficult words
    diff_words = 100 - per

    raw_score = (0.1579 * diff_words) + \
        (0.0496 * avg_sentence_length(text))

    # If the percentage of difficult words is greater than 5%, then
    # adjusted score = raw score + 3.6365,
    # otherwise adjusted score = raw score
    if diff_words > 5:
        raw_score += 3.6365

    return legacy_round(raw_score, 2)
```
