A series of scripts that scrape data from websites in order to build datasets for ML/AI

kyaiooiayk/Website-Scrapers

Web scraping


Python libraries

  • Scrapy is a powerful library for building bots that follow links, retrieve content, and store the parsed results in a structured way. Combined with the headless browser Splash, it can also render JavaScript, which makes it an efficient alternative to Selenium.
  • Selenium is one of the oldest and perhaps the most widely known tools. Its development began as early as 2004. It started as a tool for functional testing, and its potential for web scraping was soon realised. The biggest reason for Selenium's popularity is that it supports scripts written in multiple programming languages, including Python. This means you can write Python code that mimics human behavior: the script opens the browser, visits web pages, enters text, clicks buttons, and copies text.
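  • beautifulsoup (Beautiful Soup) parses HTML and XML documents so that you can navigate and search the resulting tree.
  • requests is an HTTP library used to download the pages that are then parsed.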
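
As a minimal sketch of how the last two libraries are typically combined (not code taken from this repository), the snippet below downloads a page with requests and extracts its links with Beautiful Soup. The function names `fetch_html` and `extract_links` and the `SAMPLE_HTML` markup are hypothetical, chosen only for illustration.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its HTML, raising on HTTP errors."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.text


def extract_links(html: str) -> list[dict[str, str]]:
    """Return the text and href of every anchor tag that has an href."""
    soup = BeautifulSoup(html, "html.parser")
    return [
        {"text": a.get_text(strip=True), "href": a["href"]}
        for a in soup.find_all("a", href=True)
    ]


# Hypothetical sample markup, so the sketch runs without network access.
SAMPLE_HTML = """
<ul>
  <li><a href="/listing/1">Two-bed flat</a></li>
  <li><a href="/listing/2">Three-bed house</a></li>
  <li><a>No link here</a></li>
</ul>
"""

if __name__ == "__main__":
    for link in extract_links(SAMPLE_HTML):
        print(link["href"], link["text"])
```

To scrape a live page, pass the result of `fetch_html(url)` to `extract_links`. Check the site's terms of use and robots.txt before doing so.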

Available tutorials/projects

  • RightMove is a website providing house sale and letting listings
  • Others contains scrapers for Google and BOOKZ
