In this project, I performed an advanced customer segmentation analysis using unsupervised machine learning techniques.
The objective was to discover meaningful patterns in customer booking behavior and group customers with similar behavior into segments.
Project Workflow:
• Data Cleaning
- Handling missing values
- Removing duplicates
- Outlier detection using IQR
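A minimal sketch of the three cleaning steps, using a small made-up booking table. The column names (`lead_time`, `nights`, `channel`) and the imputation strategy (median for numeric columns, mode for categorical ones) are assumptions for illustration. Dropping rows outside the usual 1.5 × IQR fences is also an assumption, since the summary says only that outliers were detected.

```python
import numpy as np
import pandas as pd

# Hypothetical booking data; the columns are illustrative, not the project's.
raw = pd.DataFrame({
    "lead_time": [10, 12, 11, 13, 400, 12, np.nan, 10],
    "nights": [2, 3, 2, 4, 3, 3, 2, 2],
    "channel": ["web", "agent", "web", None, "web", "agent", "web", "web"],
})
df = raw.copy()

# 1. Missing values: median for numeric columns, mode for categorical ones
#    (an assumed strategy; the right choice depends on the data).
num_cols = df.select_dtypes(include="number").columns
cat_cols = df.columns.difference(num_cols)
df[num_cols] = df[num_cols].fillna(df[num_cols].median())
for col in cat_cols:
    df[col] = df[col].fillna(df[col].mode().iloc[0])

# 2. Duplicates: keep the first occurrence of each identical row.
df = df.drop_duplicates().reset_index(drop=True)

# 3. Outliers: flag values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] per column.
q1 = df[num_cols].quantile(0.25)
q3 = df[num_cols].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
within_fences = ((df[num_cols] >= lower) & (df[num_cols] <= upper)).all(axis=1)

# Here flagged rows are dropped; capping them at the fences is an alternative.
cleaned = df[within_fences].reset_index(drop=True)
print(cleaned)
```

Flagging rather than dropping may be preferable when DBSCAN is used later, since it can label such points as noise on its own.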
• Data Preparation
- One-Hot Encoding for categorical variables
- Feature scaling using StandardScaler
• Dimensionality Reduction
- PCA was used to reduce data dimensions and visualize clusters in 2D space.
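As a rough illustration of this step, the sketch below projects standardized features onto two principal components. Synthetic data from `make_blobs` stands in for the booking features, so the sample count, feature count, and variable names are all assumptions.

```python
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the prepared booking features (6 columns).
X, _ = make_blobs(n_samples=300, n_features=6, centers=3, random_state=0)
X_scaled = StandardScaler().fit_transform(X)

# Keep the two directions of greatest variance for 2D plotting.
pca = PCA(n_components=2, random_state=0)
X_2d = pca.fit_transform(X_scaled)

# Share of total variance that survives the projection.
explained = pca.explained_variance_ratio_.sum()
print(f"Variance retained by 2 components: {explained:.1%}")
```

The two columns of `X_2d` can then be used as the x and y coordinates of a scatter plot colored by cluster label. It is worth checking the retained variance first, since a low value means the 2D picture can distort how well the clusters are separated.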
• Clustering Algorithms
Two clustering algorithms were implemented and compared:
- K-Means
- DBSCAN
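A minimal sketch of fitting both algorithms on the same scaled data. The data is synthetic, and the hyperparameters (`n_clusters=3`, `eps=0.3`, `min_samples=5`) are placeholder assumptions, not the values used in the project.

```python
import numpy as np
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the prepared customer features.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=42)
X = StandardScaler().fit_transform(X)

# K-Means: the number of clusters is fixed in advance and every point
# is assigned to one of them.
kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)

# DBSCAN: clusters are dense regions; points in sparse regions get label -1.
dbscan_labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)

n_dbscan_clusters = len(set(dbscan_labels) - {-1})
n_noise = int(np.sum(dbscan_labels == -1))
print(f"K-Means clusters: {len(set(kmeans_labels))}")
print(f"DBSCAN clusters: {n_dbscan_clusters}, noise points: {n_noise}")
```

DBSCAN is sensitive to `eps`, so in practice it is usually tuned, for example by inspecting a sorted k-nearest-neighbor distance plot, rather than set by hand as it is here.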
• Model Evaluation
The models were evaluated using:
- Silhouette Score
- Davies–Bouldin Index
Results showed that DBSCAN significantly outperformed K-Means at identifying meaningful clusters. Unlike K-Means, which assigns every point to a cluster, DBSCAN also labels points in sparse regions as noise.
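The two metrics can be computed with scikit-learn as sketched below. The data is synthetic, the hyperparameters are placeholders, and `evaluate` is a hypothetical helper. Excluding DBSCAN noise points before scoring is an assumption about the evaluation, not a detail taken from the project.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import davies_bouldin_score, silhouette_score
from sklearn.preprocessing import StandardScaler


def evaluate(X, labels):
    """Return (silhouette, davies_bouldin), ignoring noise points (label -1).

    Returns None when fewer than two clusters remain, because neither
    metric is defined in that case.
    """
    mask = labels != -1
    n_clusters = len(set(labels[mask]))
    if n_clusters < 2 or n_clusters >= mask.sum():
        return None
    return (
        silhouette_score(X[mask], labels[mask]),
        davies_bouldin_score(X[mask], labels[mask]),
    )


# Synthetic stand-in for the prepared customer features.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=42)
X = StandardScaler().fit_transform(X)

kmeans_labels = KMeans(n_clusters=3, n_init=10, random_state=42).fit_predict(X)
dbscan_labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(X)

kmeans_scores = evaluate(X, kmeans_labels)
dbscan_scores = evaluate(X, dbscan_labels)

# Silhouette: higher is better (range -1 to 1).
# Davies-Bouldin: lower is better (minimum 0).
print("K-Means (silhouette, Davies-Bouldin):", kmeans_scores)
print("DBSCAN  (silhouette, Davies-Bouldin):", dbscan_scores)
```

Scoring DBSCAN only on its non-noise points tends to flatter it, because the hardest points are removed first. Reporting the share of points labeled as noise alongside the scores makes the comparison with K-Means fairer.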
This project demonstrates practical experience in:
- Data preprocessing
- Unsupervised learning
- Clustering evaluation
- Customer behavior analysis
- Data visualization using Python