Anyscrape

Scrape any public data with ease

Purpose

The first step in writing a typical web scraper is identifying which elements to scrape. As a result, a great deal of effort goes into studying the structure of each target page. Worse, some websites change the information on their HTML elements dynamically, or use structures that are unfriendly to web scrapers.

The library is intended to work around anti-scraping structures and to simplify web scraping while leaving plenty of room for customization.

To simplify configuring the web scraper, anyscrape-reader is a GUI application that provides intuitive element selection and filter generation.

Features

Anyscrape

  • Tag name filtering
  • Attribute filtering
  • Location-based filtering with integrated linear expressions
  • Delay customization
  • Cookie support
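
To make the filtering features above more concrete, here is a minimal Python sketch of the three filtering ideas using only the standard library. It does not use Anyscrape's own API, which this README does not show: every name below (`collect`, `by_tag`, `by_attr`, `by_linear_position`, `SAMPLE`) is hypothetical. It also assumes that "linear expressions" means selecting positions of the form `a*n + b`, which is one plausible reading and may differ from what the library actually does.

```python
from html.parser import HTMLParser


class ElementCollector(HTMLParser):
    """Record every start tag, in document order, with its attributes."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        # attrs arrives as a list of (name, value) pairs.
        self.elements.append({"tag": tag, "attrs": dict(attrs)})


def collect(html):
    """Parse an HTML string and return the recorded elements."""
    parser = ElementCollector()
    parser.feed(html)
    parser.close()
    return parser.elements


def by_tag(elements, name):
    """Tag name filtering: keep elements with the given tag."""
    return [e for e in elements if e["tag"] == name]


def by_attr(elements, key, value):
    """Attribute filtering: keep elements whose attribute equals value."""
    return [e for e in elements if e["attrs"].get(key) == value]


def by_linear_position(elements, a, b):
    """Location filtering (assumed meaning): keep positions i = a*n + b, n >= 0."""
    if a < 0 or b < 0:
        raise ValueError("a and b must be non-negative in this sketch")
    if a == 0:
        return [e for i, e in enumerate(elements) if i == b]
    return [e for i, e in enumerate(elements) if i >= b and (i - b) % a == 0]


# Hypothetical sample page: a list with one unwanted entry mixed in.
SAMPLE = """
<ul>
  <li class="item" id="a">first</li>
  <li class="ad" id="b">sponsored</li>
  <li class="item" id="c">second</li>
  <li class="item" id="d">third</li>
  <li class="item" id="e">fourth</li>
</ul>
"""

items = by_attr(by_tag(collect(SAMPLE), "li"), "class", "item")
print([e["attrs"]["id"] for e in items])  # ['a', 'c', 'd', 'e']

# Every second remaining item, starting from position 1 (i = 2n + 1).
odd_items = by_linear_position(items, 2, 1)
print([e["attrs"]["id"] for e in odd_items])  # ['c', 'e']
```

Chaining the filters in this order means the position filter counts only elements that survived the tag and attribute filters, so the unwanted entry does not shift the positions.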

Reader

  • Interactive HTML element selection
  • Automatic attribute detection and filtering
  • Element location detection
  • Configuration export

For example usage of the library, see examples.