python3-readability

Pulls the main body out of an HTML document and cleans it up
  https://github.com/buriy/python-readability



Given an HTML document, this Python module extracts the main body text and cleans it up, separating the content that pertains to the subject from cruft such as navigation links, ads, footnotes, and sidebar content. It is based on the Ruby port of arc90's Readability project.
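
To make the idea of separating content from cruft concrete, here is a minimal sketch that uses only the Python standard library. It is not the module's actual algorithm: it is a toy heuristic that assumes the main content is the block with the most text that is not link text. The names `BlockScorer`, `extract_main_text`, `BLOCK_TAGS`, and `SAMPLE` are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser

# Tags treated as candidate content blocks (an assumption of this sketch).
BLOCK_TAGS = {"article", "aside", "div", "footer", "header", "nav", "section", "td"}
# Tags whose text is never shown to the reader.
SKIP_TAGS = {"script", "style"}


class BlockScorer(HTMLParser):
    """Collects text per block element and counts how much of it is link text."""

    def __init__(self):
        super().__init__()
        self.blocks = []      # every block seen so far
        self.stack = []       # blocks that are currently open
        self.link_depth = 0   # > 0 while inside an <a> element
        self.skip_depth = 0   # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            block = {"tag": tag, "text": [], "link_chars": 0}
            self.blocks.append(block)
            self.stack.append(block)
        elif tag == "a":
            self.link_depth += 1
        elif tag in SKIP_TAGS:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        # Assumes reasonably well-formed HTML: a closing block tag closes
        # the innermost open block.
        if tag in BLOCK_TAGS and self.stack:
            self.stack.pop()
        elif tag == "a" and self.link_depth:
            self.link_depth -= 1
        elif tag in SKIP_TAGS and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse whitespace
        if not text or self.skip_depth or not self.stack:
            return
        # Text is credited to the nearest enclosing block only.
        block = self.stack[-1]
        block["text"].append(text)
        if self.link_depth:
            block["link_chars"] += len(text)


def extract_main_text(html):
    """Return the text of the block with the most non-link text, or ''."""
    scorer = BlockScorer()
    scorer.feed(html)
    scorer.close()

    def score(block):
        total = sum(len(t) for t in block["text"])
        if total == 0:
            return 0.0
        link_density = block["link_chars"] / total
        return total * (1.0 - link_density)

    best = max(scorer.blocks, key=score, default=None)
    return " ".join(best["text"]) if best else ""


SAMPLE = """
<html><body>
  <div id="nav"><a href="/">Home</a> <a href="/about">About</a></div>
  <div id="post">
    <p>Readability tools keep the article text and drop the page chrome.</p>
    <p>They usually favour long runs of prose with few links.</p>
  </div>
  <div id="footer"><a href="/terms">Terms</a></div>
</body></html>
"""

if __name__ == "__main__":
    print(extract_main_text(SAMPLE))
```

In the sample, the navigation and footer blocks consist entirely of link text, so they score zero and the post block is selected. With the package installed, the equivalent step is typically `Document(html).summary()` after `from readability import Document`, which returns cleaned HTML rather than plain text.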