Content provided by Kyle Polich. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Kyle Polich or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.

Flesch Kincaid Readability Tests

20:25
 

Given a document in English, how can you estimate how easy it will be for someone to read? Does it require college-level reading comprehension, or is it something a much younger student could read and understand?

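While these questions are useful to ask, they don't admit a simple answer. One option is to use one of the two Flesch Kincaid Readability Tests. The two are closely related: both are computed from the average sentence length and the average number of syllables per word, and differ only in how those quantities are weighted. These simple calculations give you a rough estimate of reading ease.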
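
As a rough illustration of how simple these calculations are, here is a minimal Python sketch of both scores using the standard published coefficients. The sentence splitter and the vowel-group syllable counter are crude assumptions made for this example (they are not taken from the episode), and the function names are hypothetical. Real syllable counts would need a pronunciation dictionary, so treat the output as approximate.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (a heuristic assumption)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Discount a trailing silent "e" ("make"), but keep "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def text_stats(text: str) -> tuple[int, int, int]:
    """Return (sentences, words, syllables) using naive regex tokenization."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return len(sentences), len(words), syllables


def flesch_reading_ease(text: str) -> float:
    """Flesch reading ease: higher scores mean easier text."""
    sentences, words, syllables = text_stats(text)
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch Kincaid grade level: roughly a U.S. school grade."""
    sentences, words, syllables = text_stats(text)
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


if __name__ == "__main__":
    sample = "The cat sat on the mat."
    print(round(flesch_reading_ease(sample), 3))
    print(round(flesch_kincaid_grade(sample), 3))
```

Both functions consume the same three counts, which is why the two tests tend to rank documents in nearly the same order: longer sentences and longer words lower the reading ease score and raise the grade level.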

In this episode, Kyle shares his thoughts on this tool and on when it could be appropriate to use it as part of a feature engineering pipeline for a machine learning objective.

For empirical validation of these metrics, the plot below compares English-language Wikipedia pages with "Simple English" Wikipedia pages. The analysis Kyle describes in this episode yields the intuitively pleasing histogram below. It summarizes the distribution of Flesch reading ease scores for 1,000 pages examined from both Wikipedias.


529 episodes


Data Skeptic

 
