⚠ The Challenge
The goal was to demonstrate the performance gap between building a CNN from scratch and using transfer learning, and to show exactly why modern AI practitioners almost always choose transfer learning for image classification.
💡 The Approach
Built both models side by side: a custom CNN trained from scratch with 3 convolutional blocks, and MobileNetV2 with its last 30 layers fine-tuned. Both were trained under identical conditions for a fair comparison.
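A minimal Keras sketch of the two models may make the comparison concrete. It uses the 32-64-128 filter counts and the 30 unfrozen layers from this write-up. Everything else is an assumption chosen for illustration and not taken from the project: the 160x160 input size, the dropout rates, the global-average-pooling head, the sigmoid output, and the helper names `build_custom_cnn` and `build_mobilenet`.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

IMG_SHAPE = (160, 160, 3)  # assumed input size, not stated in the write-up


def build_custom_cnn(input_shape=IMG_SHAPE, dropout_rate=0.5):
    """Small CNN trained from scratch: 3 conv blocks with 32, 64, 128 filters."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Rescaling(1.0 / 255),              # scale pixels to [0, 1]
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(),
        layers.GlobalAveragePooling2D(),
        layers.Dropout(dropout_rate),             # assumed rate
        layers.Dense(1, activation="sigmoid"),    # binary output: cat vs dog
    ])


def build_mobilenet(input_shape=IMG_SHAPE, weights="imagenet", unfreeze_last=30):
    """MobileNetV2 backbone with only its last `unfreeze_last` layers trainable."""
    base = tf.keras.applications.MobileNetV2(
        input_shape=input_shape, include_top=False, weights=weights
    )
    base.trainable = True
    for layer in base.layers[:-unfreeze_last]:
        layer.trainable = False                   # freeze everything but the tail

    inputs = layers.Input(shape=input_shape)
    x = layers.Rescaling(1.0 / 127.5, offset=-1.0)(inputs)  # MobileNetV2 expects [-1, 1]
    x = base(x, training=False)                   # assumed: batch norm kept in inference mode
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.2)(x)                    # assumed rate
    outputs = layers.Dense(1, activation="sigmoid")(x)
    return models.Model(inputs, outputs), base
```

Passing `weights=None` builds the same architecture without downloading the ImageNet weights, which is handy for checking shapes but removes the transfer-learning benefit being measured here.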
🔄 Step-by-Step Process
Loaded 10,000 images (5,000 cats, 5,000 dogs) from a Kaggle dataset
Applied data augmentation (flips, rotations, zoom, and brightness shifts) to reduce overfitting
Custom CNN: 3 conv blocks (32-64-128 filters), max pooling, dropout regularization
MobileNetV2: loaded ImageNet weights, added custom head, fine-tuned last 30 layers
Trained both models for the same number of epochs with identical optimizers and batch sizes
Compared the two models using ROC curves, confusion matrices, precision-recall analysis, and training history plots
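The comparison metrics in the last step can be sketched in plain Python. This is only an illustration of what a confusion matrix, precision, recall, and ROC AUC measure, not the project's evaluation code. The label convention (dog = 1, cat = 0), the 0.5 threshold, the function names, and the six toy predictions are all hypothetical.

```python
def confusion_counts(y_true, y_prob, threshold=0.5):
    """Return (tp, fp, tn, fn) for binary labels and predicted probabilities."""
    tp = fp = tn = fn = 0
    for label, prob in zip(y_true, y_prob):
        pred = 1 if prob >= threshold else 0
        if pred == 1 and label == 1:
            tp += 1
        elif pred == 1 and label == 0:
            fp += 1
        elif pred == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def precision_recall_accuracy(tp, fp, tn, fn):
    """Derive the three summary metrics from the confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return precision, recall, accuracy


def roc_auc(y_true, y_prob):
    """AUC as the chance a random positive outscores a random negative (ties count half)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values): 1 = dog, 0 = cat
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]

counts = confusion_counts(y_true, y_prob)          # (2, 1, 2, 1)
precision, recall, accuracy = precision_recall_accuracy(*counts)
auc = roc_auc(y_true, y_prob)                      # 8 of 9 positive/negative pairs ranked correctly
```

Accuracy depends on the chosen threshold, while ROC AUC summarizes ranking quality across all thresholds, which is one reason to report both when comparing two models.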
✓ Final Result
MobileNetV2: 96-98% test accuracy. Custom CNN: ~85%. Transfer learning reuses features learned from roughly 1.4 million ImageNet images, a massive advantage over training from scratch on 10,000.
📚 Key Lesson
Transfer learning is not a shortcut; it is the correct approach. Starting from ImageNet features means the model can already detect edges, textures, and shapes before seeing a single training image.
