
Scraping data about programming books from different websites using X-ray


Hamid-abdellaoui/Web_scrapping_with_NodeJs


Web scraping with Node.js, using X-ray

About this repository

Scrapes data about programming books and stores it in separate JSON files, ordered by category and subcategory.

More info

The scripts and files in this repo:

main_script.js : scrapes the data and stores it in sub-directories of the Assets directory

process_scraped_data.js : processes the scraped data by reformatting it and adding new insights

AllCategoriesLinks.json : contains all the URLs from which the data is scraped

About the data

Each JSON file contains a list of books, and each book has 5 keys:

"title" : the title of the book
"picture" : a small square image of the cover
"source" : the source page (where the book's data was scraped)
"link" : a direct link to download the book as a PDF
"large_pic" : the cover image of the book (good quality)
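
As a rough illustration of the record shape described above, the sketch below checks that a book object carries all five keys. The helper name `missingKeys`, the assumption that every value is a non-empty string, and the sample values (placeholder `example.com` URLs) are hypothetical and not taken from the repository's scripts.

```javascript
// The five keys listed above. Assumption: every value is a non-empty string.
const BOOK_KEYS = ["title", "picture", "source", "link", "large_pic"];

// Hypothetical helper: returns the keys that are missing or not non-empty strings.
function missingKeys(book) {
  return BOOK_KEYS.filter(
    (key) => typeof book[key] !== "string" || book[key].length === 0
  );
}

// Hypothetical record: placeholder values, not real scraped data.
const sampleBook = {
  title: "Example Book",
  picture: "https://example.com/covers/small.jpg",
  source: "https://example.com/books/example-book",
  link: "https://example.com/downloads/example-book.pdf",
  large_pic: "https://example.com/covers/large.jpg",
};

console.log(missingKeys(sampleBook)); // no keys missing, so an empty array
console.log(missingKeys({ title: "Only a title" })); // the four other keys
```

A check like this could be run over each generated JSON file to spot incomplete records before further processing.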

Want to run it on your computer?

  • First make sure you have Node.js installed, then run these commands:
npm install
node main_scrap.js
node process_scraped_data.js
  • Alternatively just run this command:
npm run start
