Web Scraping & Data Cleaning Using Scrapy-Splash and Pandas
Project Details
This project extracted educational course data from a dynamic website whose content is loaded by JavaScript and served through an API. To handle the dynamic rendering, I used Scrapy-Splash, which lets Scrapy render JavaScript-heavy pages through Splash before parsing them.

Key Steps:
Data Extraction: scrape the data with Scrapy-Splash, sending fake user agents to reduce the chance of being blocked.
Data Cleaning: use pandas to clean the data, handle duplicate values, and tidy the remaining columns.
Data Storage: save the cleaned data to a CSV file.
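The cleaning and storage steps could look roughly like the sketch below. It is only an illustration, not the project's actual code: the column names (title, instructor, price), the sample rows, and the clean_courses helper are all assumptions made up for the example.

```python
import pandas as pd


def clean_courses(df: pd.DataFrame) -> pd.DataFrame:
    """Clean scraped course rows (hypothetical columns: title, instructor, price)."""
    cleaned = df.copy()

    # Strip stray whitespace left over from the scraped HTML.
    for col in ["title", "instructor"]:
        cleaned[col] = cleaned[col].str.strip()

    # Turn price text such as "$49.99" into a float; anything without digits becomes NaN.
    price_text = cleaned["price"].astype(str).str.replace(r"[^0-9.]", "", regex=True)
    cleaned["price"] = pd.to_numeric(price_text, errors="coerce")

    # Drop rows with no title, then drop repeated courses, keeping the first one seen.
    cleaned = cleaned.dropna(subset=["title"])
    cleaned = cleaned.drop_duplicates(subset=["title", "instructor"], keep="first")
    return cleaned.reset_index(drop=True)


# Made-up sample standing in for the scraped output.
raw = pd.DataFrame(
    {
        "title": ["  Intro to Python ", "Intro to Python", "Data Analysis", None],
        "instructor": ["A. Smith", "A. Smith ", "B. Jones", "C. Lee"],
        "price": ["$49.99", "$49.99", "Free", "$10"],
    }
)

cleaned = clean_courses(raw)

# With no path, to_csv returns the CSV as a string; pass a file name to write to disk.
csv_text = cleaned.to_csv(index=False)
```

Whitespace is stripped before deduplicating so that rows differing only by padding count as the same course. Non-numeric prices such as "Free" are left as missing values here rather than guessed.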
Skills Used