Mohd Nauman Noman654

👋 Hi, I'm Mohd Nauman

🌟 Data Engineer | Python Developer | Problem Solver | Technical Trainer
🔍 Passionate about crafting scalable solutions, optimizing data pipelines, and tackling real-world challenges with data.

🎓 Pursuing a BSc in Programming and Data Science from IIT Madras
📍 Based in Bengaluru, India

👨‍💻 About Me

I am a Data Engineer and Python Developer who designs scalable systems for processing and managing large-scale data efficiently. I specialize in optimizing backend operations, reducing costs, and improving performance.

🌟 What I Bring:

Data Engineering Expertise: Skilled in designing and optimizing ETL pipelines with tools like PySpark, Kafka, and cloud platforms such as AWS and Azure.
Backend Development: Experienced in creating robust web applications using Flask and FastAPI.
Efficiency & Optimization: Delivered solutions that cut storage costs by 60% and improved processing speed by 90%.

💡 Fun Fact:

I once reduced a Spark pipeline’s processing time from 2 days to just 30 minutes, saving a lot in compute costs! 🚀

🛠️ Technologies I Love:

Languages:
Frameworks:
Databases:
Cloud:

📂 Highlighted Projects

Building a Scalable Pipeline for Indic Image Dataset Extraction

Engineered a robust pipeline using Hugging Face OBELICS to fetch images and text from over 230 million websites via Common Crawl, focusing on Indic content.
Designed a scalable, distributed architecture to handle high-volume requests and bypass rate limits effectively.
Successfully curated a dataset of 50–100 million images paired with text, enabling advanced applications.

🔗 One-Click Data Download Tool

Developed a tool for seamless data downloads from sources like Hugging Face and Archive.
Achieved 30–50% faster downloads through parallel processing and reduced I/O overhead.
Capable of downloading terabytes of data in under an hour.
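
To illustrate the parallel-processing idea behind a tool like this, here is a minimal sketch. It is not the tool's actual code: the function name `download_all`, the injected `fetch` callable, and the `max_workers` default are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def download_all(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    max_workers: int = 8,  # assumed default; tune for the network and host
) -> Dict[str, bytes]:
    """Fetch every URL concurrently and return a mapping of URL to payload."""
    url_list = list(urls)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so payloads line up with url_list.
        payloads = list(pool.map(fetch, url_list))
    return dict(zip(url_list, payloads))


if __name__ == "__main__":
    # Stand-in fetcher so the sketch runs offline; a real one might wrap
    # urllib.request.urlopen(url).read() or a Hugging Face client call.
    def fake_fetch(url: str) -> bytes:
        return url.encode("utf-8")

    files = download_all(["file-a", "file-b", "file-c"], fake_fetch, max_workers=2)
    print(sorted(files))
```

Threads suit this workload because downloads spend most of their time waiting on the network. Passing `fetch` in as a parameter keeps the concurrency logic independent of any particular data source.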

🔗 Malaysia Airlines Forecasting System

Designed and implemented a forecasting pipeline for airline revenue and passenger trends.
Reduced storage costs by 60% and improved processing speed by 90%.

📈 GitHub Stats

📜 Certifications

📬 Let's Connect

💌 Email: mohdnauman330@gmail.com
💼 LinkedIn: linkedin.com/in/nauman330
📂 GitHub: github.com/Noman654
