(A Traditional Chinese version of this document is also available: README-CH.md.)
A Spark-based data pipeline transfers data from MySQL to MySQL, using raw SQL syntax for data extraction, transformation, and loading.
- Language: Python
- Data Processing Framework: Spark v3.5.1
Edit the connection information and SQL query in mysql_raw_query.yaml:
source:
  host: "localhost"
  port: 9030
  database: "database"
  user: "user"
  password: "password"
  query: "SELECT * FROM Table1 LEFT JOIN Table2 ON Table1.table2_id = Table2.id"
sink:
  host: "localhost"
  port: 9030
  database: "database"
  user: "user"
  password: "password"
  table: "Table3"
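
To illustrate how a configuration like the one above might map onto Spark's JDBC data source, here is a minimal sketch. It is not taken from this repository: the function names (`jdbc_url`, `reader_options`, `writer_options`, `run`), the `EXAMPLE_CONFIG` dictionary, the `append` write mode, and the MySQL Connector/J driver class are all assumptions made for illustration. The dictionary mirrors what a YAML loader would be expected to return for mysql_raw_query.yaml.

```python
from typing import Any, Dict

# Hypothetical in-memory copy of mysql_raw_query.yaml (placeholder values only).
EXAMPLE_CONFIG: Dict[str, Dict[str, Any]] = {
    "source": {
        "host": "localhost", "port": 9030, "database": "database",
        "user": "user", "password": "password",
        "query": "SELECT * FROM Table1 LEFT JOIN Table2 "
                 "ON Table1.table2_id = Table2.id",
    },
    "sink": {
        "host": "localhost", "port": 9030, "database": "database",
        "user": "user", "password": "password", "table": "Table3",
    },
}

# Assumption: MySQL Connector/J 8.x is on the Spark classpath.
MYSQL_DRIVER = "com.mysql.cj.jdbc.Driver"


def jdbc_url(conn: Dict[str, Any]) -> str:
    """Build a MySQL JDBC URL from a `source` or `sink` mapping."""
    return f"jdbc:mysql://{conn['host']}:{conn['port']}/{conn['database']}"


def reader_options(source: Dict[str, Any]) -> Dict[str, str]:
    """Options for spark.read.format("jdbc"); `query` carries the raw SQL."""
    return {
        "url": jdbc_url(source),
        "query": source["query"],
        "user": source["user"],
        "password": source["password"],
        "driver": MYSQL_DRIVER,
    }


def writer_options(sink: Dict[str, Any]) -> Dict[str, str]:
    """Options for df.write.format("jdbc"); `dbtable` is the target table."""
    return {
        "url": jdbc_url(sink),
        "dbtable": sink["table"],
        "user": sink["user"],
        "password": sink["password"],
        "driver": MYSQL_DRIVER,
    }


def run(config: Dict[str, Dict[str, Any]]) -> None:
    """Read the query result from the source and append it to the sink table."""
    # Imported lazily so the helpers above work without PySpark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("mysql_raw_query").getOrCreate()
    try:
        df = spark.read.format("jdbc").options(**reader_options(config["source"])).load()
        # Assumption: rows are appended; the actual pipeline may overwrite instead.
        df.write.format("jdbc").options(**writer_options(config["sink"])).mode("append").save()
    finally:
        spark.stop()
```

Calling `run(EXAMPLE_CONFIG)` would require a reachable database and the JDBC driver. Note that Spark wraps the `query` value as a subquery, so a `SELECT *` over a join may fail if both tables share a column name; listing columns explicitly with aliases avoids that.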
docker compose up -d