Data-Engineering-project2
Project details

🚀 I’ve just completed a new Data Engineering task as part of my training journey!

The scenario was working as a Data Engineer in one of the biggest markets in Egypt, where the data sources were:
🛒 Products stored in a JSON file (id, name, price)
📊 Transactions stored in multiple CSV files (1,000 transactions per file → starting with 12,000 transactions = 3 days of data)
👥 Customers stored in a SQL Server database table (2,000 customers)

📌 My task was to build and manage an Orders table: (transaction_id, product_name, amount, customer_full_name, transaction_date, branch)

Key steps I implemented:
1️⃣ Initial Load → Inserted the first 12,000 transactions into the Orders table.
2️⃣ Delta Load → Inserted the new day’s transactions (incremental insert).
3️⃣ Product Price Update → Applied changes to reflect the updated product price in the Orders table using an UPDATE statement.
4️⃣ Change Data Capture (CDC) → On day 3, transaction files included both modified old transactions and new ones. I used a MERGE statement to update existing records and insert new ones.

Skills gained & applied:
✅ ETL processes using Pandas & SQL Server
✅ Data integration from JSON, CSV, and SQL DB
✅ Handling incremental & CDC scenarios in real-world datasets
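
The four steps above can be sketched end to end in a few lines. This is a minimal illustration under stated assumptions, not the project's actual code: it uses Python's built-in `sqlite3`, `csv`, and `json` modules instead of Pandas and SQL Server so that it runs anywhere, and it swaps the SQL Server MERGE statement for SQLite's `INSERT ... ON CONFLICT DO UPDATE`, which gives the same update-or-insert behaviour. The sample data, the transaction columns (`product_id`, `customer_id`, `quantity`), the customer name fields, the rule `amount = price * quantity`, and all function names are hypothetical.

```python
import csv
import io
import json
import sqlite3

# Hypothetical sample data; every column name below is an assumption.
PRODUCTS_JSON = '[{"id": 1, "name": "Rice", "price": 30.0}, {"id": 2, "name": "Tea", "price": 50.0}]'
CUSTOMERS = [(10, "Mona", "Ali"), (11, "Omar", "Said")]  # id, first_name, last_name
DAY1_CSV = """transaction_id,product_id,customer_id,quantity,transaction_date,branch
1,1,10,2,2024-01-01,Cairo
2,2,11,1,2024-01-01,Giza
"""
DAY2_CSV = """transaction_id,product_id,customer_id,quantity,transaction_date,branch
3,1,11,1,2024-01-02,Cairo
"""
# The CDC file mixes a changed old transaction (id 2) with a new one (id 4).
DAY3_CSV = """transaction_id,product_id,customer_id,quantity,transaction_date,branch
2,2,11,3,2024-01-01,Giza
4,2,10,1,2024-01-03,Cairo
"""


def build_order_rows(csv_text, products, customers):
    """Join transactions with product and customer lookups into Orders rows."""
    rows = []
    for t in csv.DictReader(io.StringIO(csv_text)):
        product = products[int(t["product_id"])]
        first, last = customers[int(t["customer_id"])]
        rows.append((
            int(t["transaction_id"]),
            product["name"],
            product["price"] * int(t["quantity"]),  # assumed: amount = price * quantity
            f"{first} {last}",
            t["transaction_date"],
            t["branch"],
        ))
    return rows


def insert_orders(conn, rows):
    """Plain insert, used for the initial and delta loads."""
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)", rows)


def upsert_orders(conn, rows):
    """Update existing transaction_ids and insert new ones (stand-in for MERGE)."""
    conn.executemany(
        """INSERT INTO orders VALUES (?, ?, ?, ?, ?, ?)
           ON CONFLICT(transaction_id) DO UPDATE SET
               product_name = excluded.product_name,
               amount = excluded.amount,
               customer_full_name = excluded.customer_full_name,
               transaction_date = excluded.transaction_date,
               branch = excluded.branch""",
        rows,
    )


def update_price(conn, products, product_id, new_price):
    """Rescale stored amounts for one product after a price change."""
    product = products[product_id]
    conn.execute(
        "UPDATE orders SET amount = amount * ? / ? WHERE product_name = ?",
        (new_price, product["price"], product["name"]),
    )
    product["price"] = new_price  # later loads use the new price


def run_pipeline():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE orders (
               transaction_id INTEGER PRIMARY KEY, product_name TEXT, amount REAL,
               customer_full_name TEXT, transaction_date TEXT, branch TEXT)"""
    )
    products = {p["id"]: p for p in json.loads(PRODUCTS_JSON)}
    customers = {cid: (first, last) for cid, first, last in CUSTOMERS}

    insert_orders(conn, build_order_rows(DAY1_CSV, products, customers))  # 1. initial load
    insert_orders(conn, build_order_rows(DAY2_CSV, products, customers))  # 2. delta load
    update_price(conn, products, 1, 33.0)                                 # 3. price update
    upsert_orders(conn, build_order_rows(DAY3_CSV, products, customers))  # 4. CDC
    conn.commit()
    return conn.execute(
        "SELECT transaction_id, product_name, amount FROM orders ORDER BY transaction_id"
    ).fetchall()
```

Calling `run_pipeline()` returns the final Orders rows. The upsert relies on `transaction_id` being the primary key, which is also what a MERGE would match on in SQL Server. Because the Orders table stores `product_name` and `amount` but no quantity, the price update in this sketch rescales existing amounts by the ratio of new to old price; a real pipeline might instead recompute them from the source transactions.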

Published: a month ago