
This repository showcases my data science expertise: extracting Excel data from URLs, cleaning with pandas, storing in PostgreSQL using Python (requests, pandas, psycopg2, SQLAlchemy). Ideal for demonstrating professional data handling skills.


kuberkumar07/Data_Extraction


Project Title: Data Extraction, Cleaning, and Storage in PostgreSQL

Description: This repository showcases a Python-based solution for extracting data from Excel files hosted online, cleaning it, and storing it in a PostgreSQL database. It uses requests to download the files, pandas to clean the data, and psycopg2 and SQLAlchemy for database operations. It is aimed at data scientists and analysts who want to automate pipelines that move Excel data from the web into a structured database.

Key Features:

Data Extraction: Retrieve Excel data from any accessible URL.

Data Cleaning: Remove duplicates, handle missing values, and format numeric data.

Database Integration: Seamlessly store cleaned data into PostgreSQL databases.

Scalability: Supports multiple Excel sheets, customizable table names, and dynamic data handling.

Technologies Used: Python, pandas, requests, psycopg2, SQLAlchemy, PostgreSQL.
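
The cleaning step listed under Key Features could look roughly like the sketch below. It is an illustration only, not the code in this repository: the function name clean_dataframe is hypothetical, and the specific policies (dropping fully empty rows and columns, stripping thousands separators before numeric conversion) are assumptions, since the README does not say how missing values or numeric formats are handled.

```python
import pandas as pd


def _strip_separators(value):
    """Remove thousands separators and whitespace from string cells only."""
    if isinstance(value, str):
        return value.replace(",", "").strip()
    return value


def clean_dataframe(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical cleaning routine: dedupe, drop empty rows/columns, fix numbers."""
    # Remove exact duplicate rows, then rows and columns with no data at all.
    # (Assumed policy: the README only says "handle missing values".)
    cleaned = df.drop_duplicates().dropna(how="all").dropna(axis=1, how="all").copy()

    for col in cleaned.columns:
        series = cleaned[col]
        # Columns that are already numeric or datetime need no conversion.
        if pd.api.types.is_numeric_dtype(series) or pd.api.types.is_datetime64_any_dtype(series):
            continue
        # Try to read text such as "1,200" as a number.
        numeric = pd.to_numeric(series.map(_strip_separators), errors="coerce")
        # Adopt the conversion only if every non-missing value converted cleanly,
        # so genuine text columns are left untouched.
        if series.notna().any() and numeric.notna().sum() == series.notna().sum():
            cleaned[col] = numeric

    return cleaned.reset_index(drop=True)
```

Converting a column only when every non-missing value parses as a number is a conservative choice: it avoids silently turning a text column into mostly missing values.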

Usage: Clone the repository, provide your Excel URL, and execute main.py to automatically clean and store data.
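
For readers who want a feel for the extract-and-store flow before running main.py, here is a minimal sketch under stated assumptions. It is not the contents of main.py: every function name (download_excel, table_name_for, postgres_url, store_sheets, run_pipeline) is hypothetical, the connection details are placeholders, and reading .xlsx files with pandas additionally assumes an Excel engine such as openpyxl is installed. A cleaning step would normally sit between download and storage.

```python
import io
import re

import pandas as pd
import requests
from sqlalchemy import create_engine
from sqlalchemy.engine import URL


def download_excel(url: str, timeout: float = 30.0) -> dict:
    """Download a workbook and return {sheet name: DataFrame} for every sheet."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail early on HTTP errors
    # sheet_name=None asks pandas to load all sheets into a dict.
    return pd.read_excel(io.BytesIO(response.content), sheet_name=None)


def table_name_for(sheet_name: str) -> str:
    """Derive a simple lowercase table name from a sheet name (assumed convention)."""
    name = re.sub(r"\W+", "_", sheet_name.strip().lower()).strip("_")
    return name or "sheet"


def postgres_url(user: str, password: str, host: str, database: str, port: int = 5432) -> URL:
    """Build a SQLAlchemy URL for PostgreSQL using the psycopg2 driver."""
    return URL.create(
        "postgresql+psycopg2",
        username=user,
        password=password,
        host=host,
        port=port,
        database=database,
    )


def store_sheets(sheets: dict, engine, if_exists: str = "replace") -> None:
    """Write each sheet to its own table; 'replace' overwrites existing tables."""
    for sheet_name, frame in sheets.items():
        frame.to_sql(table_name_for(sheet_name), engine, if_exists=if_exists, index=False)


def run_pipeline(excel_url: str, db_url: URL) -> None:
    """Download the workbook and store every sheet (cleaning omitted here)."""
    engine = create_engine(db_url)
    store_sheets(download_excel(excel_url), engine)
```

A call would look like run_pipeline(excel_url, postgres_url(user, password, host, database)), with your own URL and credentials supplied. Building the URL with URL.create rather than string formatting keeps passwords containing special characters from breaking the connection string.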
