This repository has been archived by the owner on Aug 14, 2021. It is now read-only.
v0.3.0
Happy November, everyone. It took me longer than I expected, but we are finally up to date with Readability.js, at least at the time of writing.
Here is the changelist for this release:
- Merged PR #24. Fixes a notice when trying to extract `og:image`
- Up to date with commit eb221c5 (2017-10-16), which includes the following changes:
  - New tags added to the unlikelyCandidates regex
  - Detection and removal of hierarchical separators in titles
  - Added more tags to clean after parsing the article (`button`, `textarea`, `select`, etc.)
  - New way to detect empty nodes (including an edge case where a node with a was detected as a node with content)
  - Better approach to finding a top candidate (especially when a top candidate is the only child of a parent node, which allows a more accurate joining of sibling elements)
  - Detect text direction (`ltr` or `rtl`)
  - Detect and mark data tables to avoid removing them during final clean up
  - Major fixes when scanning and deleting nodes (no need to traverse backwards anymore)
  - Node cleaning via regex matches
  - Clean table attributes during final clean up.
- Added license
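
To give a feel for the "hierarchical separators in titles" item, here is a rough sketch of the idea, written in Python for brevity. It is not the code from this library or from Readability.js: the function name `strip_title_separators`, the separator set, and the "keep the longest segment" rule are all assumptions made for illustration.

```python
import re

# Separators that often divide a page title from a site name or breadcrumb
# trail. This set is an assumption for the example, not the library's list.
SEPARATOR = re.compile(r"\s[|\-\\/>»]\s")


def strip_title_separators(title: str) -> str:
    """Return the segment of the title with the most words.

    A separator only counts when it has whitespace on both sides, so
    hyphenated words such as "well-known" are left alone.
    """
    parts = [part.strip() for part in SEPARATOR.split(title.strip())]
    # max() returns the first segment on ties, which favours the leading part.
    return max(parts, key=lambda part: len(part.split()))


print(strip_title_separators("How to bake bread | Example Site"))
print(strip_title_separators("News » World » Election results announced today"))
```

The real heuristic is more careful about short titles and fallbacks; this only shows why a title like "Article | Site" ends up as just the article name.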
Hopefully you'll find this release useful. Next release will be 1.0.
Don't forget to like, comment, subscribe, follow my Patreon, hit the gym, and call your mom. Make sure you tell your significant other you love him/her/it, and if you are alone right now, install Tinder, because that's something I would really like to do, but I was already in a relationship when that app was released.
Enjoy!