Noman654/dataengineer_prep

The Data Engineer Prep Guide

Welcome to The Data Engineer Prep Guide, a repository dedicated to helping data engineers prepare for interviews and hone their skills. My goal is to make preparation simple, structured, and collaborative.

Must-Check Resources for Data Engineers 🚀

📘 Top Repositories

💼 Top Resume Example for Data Engineering

🌟 Influential Thought Leaders to Follow

  1. Sumit Mittal – Founder of BigDataBySumit.
  2. Joe Reis – Co-author of Fundamentals of Data Engineering.
  3. Zach Wilson – Data Engineering Specialist.
  4. Shashank Mishra – Data Engineer and Educator.
  5. Gowtham SB – Big Data and Cloud Expert.

Current Focus: Spark

We’re starting with Spark, one of the most essential tools in a data engineer’s toolkit. The repository currently includes practical examples and commonly asked syntax questions to help you revise effectively.

Repository Structure

📂 Spark/
   ├── syntax_practical/
   │   └── common_asked_syntax.ipynb
   └── topics_to_focus.md
📄 README.md

Contents

  • Spark Syntax Practical:
    • A notebook (common_asked_syntax.ipynb) covering frequently used Spark commands and operations.
    • Designed for quick revision and hands-on practice.

Planned Expansion

This repository is a work in progress! Future sections will include:

  • Kafka: Real-time data streaming concepts and hands-on examples.
  • DBT: Data transformations in modern pipelines.
  • SQL: Practice queries and optimization tips.
  • Data Lake: Best practices for data storage and retrieval.

Getting Started

  1. Clone this repository:
    git clone https://github.com/Noman654/dataengineer_prep.git
  2. Navigate to the Spark folder to start with the provided notebook.
  3. Open the notebook with Jupyter or any compatible tool to explore the syntax examples, or run it directly in Google Colab.
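
Steps 2 and 3 can be sketched as the commands below. This is a minimal sketch, assuming you run it from the root of the cloned repository and that Jupyter is installed (for example via `pip install notebook`). The `NOTEBOOK` variable and the `open_notebook` helper are illustrative names, not part of the repository; the path follows the Repository Structure section above.

```shell
#!/usr/bin/env bash
# Assumption: the current directory is the root of the cloned repository.
# NOTEBOOK and open_notebook are hypothetical names used for illustration.
NOTEBOOK="Spark/syntax_practical/common_asked_syntax.ipynb"

open_notebook() {
  # Fail early with a clear message if the notebook is not where we expect it.
  if [ ! -f "$NOTEBOOK" ]; then
    echo "Notebook not found: $NOTEBOOK (run this from the repository root)" >&2
    return 1
  fi
  # Assumption: Jupyter is installed and available on PATH.
  jupyter notebook "$NOTEBOOK"
}
```

Call `open_notebook` once the clone has finished. The notebook cells will most likely also need a working PySpark installation, which this sketch does not set up.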

Contributing

We welcome contributions to make this guide comprehensive and beginner-friendly. Here’s how you can help:

  1. Fork the repository.
  2. Create a branch for your updates.
  3. Submit a pull request with your contributions.
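
The three steps above might look like the following on the command line. This is a sketch under stated assumptions, not a required workflow: `GITHUB_USER`, `BRANCH`, and the two helper functions are placeholders introduced for illustration, and `REPO` uses the repository name shown in the page header, so adjust it if your fork is named differently. Forking and opening the pull request happen on GitHub itself and are only noted in comments.

```shell
#!/usr/bin/env bash
# Placeholders: replace these with your own GitHub username and branch name.
GITHUB_USER="${GITHUB_USER:-your-username}"
BRANCH="${BRANCH:-add-spark-examples}"
# Assumption: the fork keeps the repository name shown in the page header.
REPO="dataengineer_prep"
FORK_URL="https://github.com/${GITHUB_USER}/${REPO}.git"

start_contribution() {
  # Step 1: fork the repository on GitHub first, then clone your fork.
  git clone "$FORK_URL" || return 1
  cd "$REPO" || return 1
  # Step 2: create a branch for your updates.
  git checkout -b "$BRANCH"
}

submit_contribution() {
  # Stage and commit your edits; the commit message is passed as an argument.
  git add -A || return 1
  git commit -m "$1" || return 1
  # Step 3: push the branch, then open a pull request from it on GitHub.
  git push -u origin "$BRANCH"
}
```

A typical session would run `start_contribution`, edit or add files, and then run `submit_contribution "Add window function examples"`. Splitting the flow into two functions keeps the manual editing step between them explicit.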

What You Can Contribute

  • Add syntax examples or commonly asked questions for Spark.
  • Improve the existing content for clarity or accuracy.
  • Share practical examples for upcoming topics (Kafka, DBT, SQL, etc.).

Let’s Build Together

This repository is for the community, by the community. Whether you’re preparing for interviews or sharing your expertise, let’s collaborate to make data engineering preparation accessible for everyone.
