Web Scraper Tool

This Web Scraper Tool automates the extraction of product data (titles, prices, and images) from an e-commerce website. The scraped data is stored locally as JSON, and product images are downloaded to a specified directory. The tool is written in Python, parses pages with BeautifulSoup, and is driven through a FastAPI endpoint whose scraping options (page limit, proxy) are set per request.

Features

- Scrapes product titles, prices, and images from the target website.
- Downloads product images to a local directory.
- Handles dynamic or infinite pagination with configurable page limits.
- Provides robust error handling for missing data or connectivity issues.
- Outputs the scraped data in JSON format for easy reuse.
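
The pagination and error-handling features above can be illustrated with a minimal sketch. It is not the code in scrapper.py: the function name `scrape_pages`, the `fetch_page` callback, and the `max_retries` parameter are hypothetical, and the sketch assumes each product is a dict with `title` and `price` keys.

```python
from typing import Callable, Dict, List

Product = Dict[str, str]


def scrape_pages(
    fetch_page: Callable[[int], List[Product]],  # hypothetical callback: page number -> products
    max_pages: int,
    max_retries: int = 2,  # hypothetical retry budget per page
) -> List[Product]:
    """Collect products page by page, stopping at max_pages or the first empty page."""
    products: List[Product] = []
    for page in range(1, max_pages + 1):
        items: List[Product] = []
        for attempt in range(max_retries + 1):
            try:
                items = fetch_page(page)
                break
            except ConnectionError:
                # Connectivity problem: try again, and give up on this page
                # once the retry budget is spent.
                if attempt == max_retries:
                    items = []
        if not items:
            # An empty page, or one that kept failing, ends the crawl.
            break
        for item in items:
            # Skip records with missing data instead of failing the whole run.
            if item.get("title") and item.get("price"):
                products.append(item)
    return products
```

Passing the fetcher in as a callback keeps the page limit and retry logic separate from the HTTP and parsing code, so the loop can be exercised without network access.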

Requirements

Python 3.7 or higher
Required Python libraries:
- beautifulsoup4
- requests
- fastapi
- redis
- python-dotenv
- uvicorn

To install the dependencies, run:

pip install -r requirements.txt

Running the Application

Start Redis Server

Ensure that Redis is running on your machine. You can start it using:

redis-server

Run the FastAPI Application

Start the server with the command below. Once it is running, the API is available at http://127.0.0.1:8000.

uvicorn main:app --reload

Usage

All requests to the /scrape endpoint must include an `api_key_header` HTTP header carrying the correct token.

Example Request Using `curl`:

curl -X POST "http://127.0.0.1:8000/scrape" \
     -H "api_key_header: your_static_token_here" \
     -H "Content-Type: application/json" \
     -d '{"max_pages": 5, "proxy": "http://yourproxy:port"}'
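
The same request can be issued from Python. The sketch below uses only the standard library; the helper names `build_scrape_request` and `send` are hypothetical, while the URL, header name, and body fields are taken from the `curl` example.

```python
import json
import urllib.request


def build_scrape_request(token, max_pages, proxy=None, base_url="http://127.0.0.1:8000"):
    """Build (but do not send) a POST request for the /scrape endpoint."""
    payload = {"max_pages": max_pages}
    if proxy:
        payload["proxy"] = proxy
    return urllib.request.Request(
        url=f"{base_url}/scrape",
        data=json.dumps(payload).encode("utf-8"),
        headers={"api_key_header": token, "Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires the FastAPI server to be running locally.
    req = build_scrape_request("your_static_token_here", max_pages=5, proxy="http://yourproxy:port")
    print(send(req))
```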

Expected Response:

{
  "scraped_count": 20,
  "updated_count": 5
}

This response indicates that 20 new products were scraped and 5 existing products were updated in the JSON storage.
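
One way such counts could be produced is sketched below. This is an assumption about the behaviour, not the logic in storage.py: the function `merge_products` is hypothetical, products are assumed to be keyed by `title`, `scraped_count` is taken to mean products not yet in storage, and `updated_count` to mean stored products whose data changed.

```python
import json
from pathlib import Path
from typing import Dict, List


def merge_products(storage_path: Path, scraped: List[Dict[str, str]]) -> Dict[str, int]:
    """Merge scraped products into a JSON file and report new and updated counts."""
    existing: Dict[str, Dict[str, str]] = {}
    if storage_path.exists():
        stored = json.loads(storage_path.read_text(encoding="utf-8"))
        # Assumption: the title uniquely identifies a product.
        existing = {product["title"]: product for product in stored}

    new_count = 0
    updated_count = 0
    for product in scraped:
        previous = existing.get(product["title"])
        if previous is None:
            new_count += 1
        elif previous != product:
            updated_count += 1
        existing[product["title"]] = product

    storage_path.write_text(json.dumps(list(existing.values()), indent=2), encoding="utf-8")
    return {"scraped_count": new_count, "updated_count": updated_count}
```

Under these assumptions, a product that is scraped again with unchanged data affects neither counter.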
