⚠ The Challenge
Only 4,999 images were available from a dataset that normally contains about 112,000. The labels cover 14 diseases with extreme class imbalance: rare diseases such as Hernia had very few positive samples. The first training attempt achieved a mean AUC of only 0.52, barely above the 0.5 expected from random guessing.
💡 The Approach
Weighted binary cross-entropy loss gives rare diseases proportionally more weight during training. DenseNet121 pretrained on ImageNet transfers well to medical imaging. Training ran in two phases: first with the base frozen, then fine-tuning the last 60 layers.
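The two-phase schedule can be sketched without committing to a framework. This is a minimal illustration rather than the project's actual training code: `set_trainable_tail` is a hypothetical helper name, and the stand-in layer objects only mimic the `trainable` flag that Keras layers expose.

```python
from types import SimpleNamespace


def set_trainable_tail(layers, n_trainable):
    """Freeze all layers except the last `n_trainable`; return the trainable count."""
    cutoff = max(len(layers) - n_trainable, 0)
    for index, layer in enumerate(layers):
        layer.trainable = index >= cutoff
    return sum(1 for layer in layers if layer.trainable)


# Stand-in for a pretrained base network (the layer count here is arbitrary).
base_layers = [SimpleNamespace(name=f"layer_{i}", trainable=True) for i in range(100)]

# Phase 1: freeze the whole base so only the new classification head learns.
phase1_trainable = set_trainable_tail(base_layers, 0)

# Phase 2: unfreeze the last 60 base layers for fine-tuning.
phase2_trainable = set_trainable_tail(base_layers, 60)

print(phase1_trainable, phase2_trainable)
```

With Keras, the same helper could presumably be applied to `base_model.layers`. The model generally needs to be recompiled after `trainable` changes, and a lower learning rate is a common choice for the second phase.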
🔄 Step-by-Step Process
Matched 4,999 X-ray images with multi-label disease annotations from Data_Entry_2017.csv
Analyzed class distribution: some diseases had a positive rate below 1%
Implemented weighted BCE loss: weight = total_samples / (2 × disease_positive_count)
Phase 1: Trained DenseNet121 classification head only — mean AUC improved to ~0.65
Phase 2: Unfroze last 60 DenseNet121 layers for fine-tuning — mean AUC reached 0.72
Deployed Grad-CAM per disease class, so radiologists can see which image region contributed most to each disease flag
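The weighting formula from the steps above can be sketched in plain Python. The function names and the toy labels are assumptions made for illustration. The sketch also assumes the weight scales only the positive term of the loss while negatives keep weight 1, which the write-up does not specify.

```python
import math


def positive_class_weights(labels):
    """Per-disease weight = total_samples / (2 * disease_positive_count)."""
    total = len(labels)
    n_classes = len(labels[0])
    weights = []
    for c in range(n_classes):
        positives = sum(row[c] for row in labels)
        # Guard against division by zero if a disease has no positive samples.
        weights.append(total / (2 * max(positives, 1)))
    return weights


def weighted_bce(y_true, y_prob, pos_weights, eps=1e-7):
    """Mean binary cross-entropy with the positive term scaled per disease."""
    total_loss = 0.0
    count = 0
    for true_row, prob_row in zip(y_true, y_prob):
        for t, p, w in zip(true_row, prob_row, pos_weights):
            p = min(max(p, eps), 1 - eps)  # keep log() finite
            total_loss += -(w * t * math.log(p) + (1 - t) * math.log(1 - p))
            count += 1
    return total_loss / count


# Toy labels: column 0 is a common disease (4 of 8), column 1 a rare one (1 of 8).
toy_labels = [[1, 0], [1, 0], [1, 0], [1, 1], [0, 0], [0, 0], [0, 0], [0, 0]]
weights = positive_class_weights(toy_labels)  # [1.0, 4.0] for this toy data

# The same confident miss (p = 0.1) costs more on the rare disease.
missed_common = weighted_bce([[1, 0]], [[0.1, 0.1]], weights)
missed_rare = weighted_bce([[0, 1]], [[0.1, 0.1]], weights)
print(weights, missed_common < missed_rare)
```

Dividing by twice the positive count mirrors the common "balanced" class-weight heuristic for a two-class problem, so a disease with half as many positives gets twice the weight.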
✓ Final Result
The model reached a mean AUC of 0.72 across 14 diseases using only 4,999 images. This approaches results reported in published research that used 50,000+ images.
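For reference, a mean AUC across diseases is typically the unweighted average of one ROC AUC per label column. The sketch below assumes that definition and uses toy scores. The function names are hypothetical, and the pairwise AUC computation is written out only to keep the example dependency-free.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        return None  # AUC is undefined when only one class is present
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_auc(y_true, y_score):
    """Average the per-disease AUCs, skipping diseases where AUC is undefined."""
    n_classes = len(y_true[0])
    aucs = []
    for c in range(n_classes):
        auc = roc_auc([row[c] for row in y_true], [row[c] for row in y_score])
        if auc is not None:
            aucs.append(auc)
    return sum(aucs) / len(aucs)


# Toy example: disease 0 is ranked perfectly, disease 1 gets a constant score.
toy_true = [[1, 0], [0, 0], [1, 1], [0, 0]]
toy_score = [[0.9, 0.5], [0.1, 0.5], [0.8, 0.5], [0.2, 0.5]]
print(mean_auc(toy_true, toy_score))  # 0.75
```

A constant score gives an AUC of 0.5, which is why a model that never separates a rare disease drags the mean toward chance level.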
📚 Key Lesson
Weighted loss functions are essential for imbalanced medical datasets. Without them, a model can ignore rare diseases entirely and still look accurate on paper, because predicting "negative" for every image is correct almost all of the time.
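A small worked example shows how this happens. The numbers are hypothetical (a 1% positive rate over 1,000 images) and are not taken from the project data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true positives that the model actually flags."""
    positives = sum(y_true)
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / positives if positives else 0.0


# Hypothetical rare disease: 10 positive images out of 1,000.
y_true = [1] * 10 + [0] * 990

# A model that has learned to ignore the disease predicts "negative" every time.
y_pred = [0] * 1000

print(accuracy(y_true, y_pred))  # 0.99
print(recall(y_true, y_pred))    # 0.0
```

Accuracy rewards the majority class, while recall and AUC expose that no positive case is ever detected.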
