DAL Shemagh Detection
Object Detection & Spatial Classification
Public LB
Private LB
Classes Detected
Challenge Overview
An object detection and classification challenge: detect heads and shemaghs in images with bounding boxes, and determine whether the shemagh is worn correctly on the head. The evaluation combines detection accuracy (mAP) with classification F1-score.
Task
Detect heads and shemaghs with bounding boxes, then classify correct shemagh placement
Evaluation
Final Score = 0.5 × mAP@[0.5:0.95] + 0.5 × F1-Score
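As a quick illustration of how the two halves of the metric trade off, here is a minimal sketch of the scoring formula. The function name and the example numbers are made up for illustration; they are not competition results.

```python
def final_score(map_50_95: float, f1: float) -> float:
    """Equal-weight blend of detection mAP@[0.5:0.95] and classification F1."""
    return 0.5 * map_50_95 + 0.5 * f1


# Hypothetical inputs, not actual leaderboard values.
print(final_score(0.80, 0.90))

# A perfect detector that gets every placement label wrong is capped at 0.5.
print(final_score(1.0, 0.0))
```

Because both terms carry equal weight, a weak classifier cannot be compensated for by detection quality alone.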
Dataset
651 train images, 842 test images
Key Challenge
Extreme class imbalance: only 5 of the 651 training images had right_place=TRUE (0.8%).
Pipeline Architecture
Test Image
Input image passed into the detection pipeline.
RF-DETR Large
Fine-tuned RF-DETR Large as the main detector to find heads and shemaghs.
Head Boxes + Shemagh Boxes
Per-class confidence thresholds applied: head at 0.30, shemagh at 0.15 (lower because shemagh is the minority class).
TTA - Horizontal Flip
Flip the image horizontally, run detection again, mirror the boxes back into the original coordinates, and merge both sets of predictions with NMS. The second pass can recover objects that were missed in the original orientation.
Spatial Rule-Based Classifier
Nine spatial features are computed for each (head, shemagh) pair. Their weighted sum is compared against a tuned threshold to classify correct placement.
right_place: True / False
Final binary classification output for each detected shemagh-head pair.
The Spatial Classifier
The most interesting technical challenge: classifying whether a shemagh is correctly worn on the head using only bounding box geometry.
Why not just IoU?
IoU alone was not enough. A correctly worn shemagh can cover a much larger area than the head, which can still produce a low IoU even when the placement is correct. To handle this, I used nine spatial features that describe the geometric relationship between the two boxes.
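A small worked example makes the point concrete. The box coordinates below are invented for illustration, in (x1, y1, x2, y2) pixel format: the head sits entirely inside a much larger shemagh box, yet the IoU stays low.

```python
def area(box):
    """Area of an (x1, y1, x2, y2) box; zero if the box is degenerate."""
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def intersection(a, b):
    """Area of the overlap between two (x1, y1, x2, y2) boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, w) * max(0, h)


# Hypothetical boxes: the shemagh drapes well past the head on all sides.
head = (40, 20, 80, 70)        # 40 x 50  = 2000 px^2
shemagh = (20, 10, 100, 130)   # 80 x 120 = 9600 px^2

inter = intersection(head, shemagh)
iou_value = inter / (area(head) + area(shemagh) - inter)
coverage = inter / area(head)

print(round(iou_value, 3), coverage)  # 0.208 1.0
```

The head is fully covered (coverage 1.0), but IoU is only about 0.21 because the union is dominated by the shemagh box.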
Computed Spatial Features
Head Coverage
How much of the head box is covered by the shemagh box.
Upper Head Overlap
Measures how much of the upper half of the head is covered, which turned out to be one of the strongest signals.
Shemagh Top Above Head
Checks whether the top of the shemagh box is above or aligned with the top of the head box.
Head Center in Shemagh
Checks whether the head center falls inside the shemagh box.
Horizontal Difference
Normalized distance between the horizontal centers of the two boxes.
Shemagh Below Head
Measures how much of the shemagh extends below the head, which can indicate that it is around the neck rather than worn correctly.
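The six features described above could be computed along these lines. This is a sketch under stated assumptions, not the exact competition code: boxes are (x1, y1, x2, y2) with y growing downward, the head box is non-degenerate, and the normalizations (by head width and head height) and the dictionary keys are my own choices for illustration. The remaining three of the nine features are not described here, so they are left out.

```python
def box_area(box):
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def overlap_area(a, b):
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def spatial_features(head, shemagh):
    """Geometric features for one (head, shemagh) pair of boxes."""
    hx1, hy1, hx2, hy2 = head
    sx1, sy1, sx2, sy2 = shemagh
    head_w, head_h = hx2 - hx1, hy2 - hy1
    upper_head = (hx1, hy1, hx2, hy1 + head_h / 2)  # top half of the head box
    head_cx, head_cy = (hx1 + hx2) / 2, (hy1 + hy2) / 2
    shemagh_cx = (sx1 + sx2) / 2
    return {
        # Fraction of the head box covered by the shemagh box.
        "head_coverage": overlap_area(head, shemagh) / box_area(head),
        # Fraction of the upper half of the head covered by the shemagh.
        "upper_head_overlap": overlap_area(upper_head, shemagh) / box_area(upper_head),
        # Shemagh top edge is above or level with the head top edge.
        "shemagh_top_above_head": sy1 <= hy1,
        # Head center lies inside the shemagh box.
        "head_center_in_shemagh": sx1 <= head_cx <= sx2 and sy1 <= head_cy <= sy2,
        # Horizontal center distance, normalized by head width (assumption).
        "horizontal_diff": abs(head_cx - shemagh_cx) / head_w,
        # Extent of the shemagh below the head, in head heights (assumption).
        "shemagh_below_head": max(0.0, sy2 - hy2) / head_h,
    }


# Hypothetical boxes: one worn on the head, one draped around the neck.
worn = spatial_features((40, 20, 80, 70), (20, 10, 100, 130))
neck = spatial_features((40, 20, 80, 70), (30, 60, 90, 120))
print(worn["upper_head_overlap"], neck["upper_head_overlap"])  # 1.0 0.0
```

In the neck case the boxes still overlap at the chin (head coverage 0.2), but the upper-head overlap drops to zero, which is why that feature separates the two cases better than raw overlap.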
Positive Signals
- Upper-head overlap ≥ 50%
- Upper-head overlap ≥ 25%
- Shemagh top above head top
- Head center inside shemagh
- Head coverage ≥ 30%
- Good horizontal alignment
Negative Signals (Penalties)
- Shemagh center below head
- Overlap at chin only
- No upper head overlap
- Minimal head coverage
- Poor horizontal alignment
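The signals above can be combined into a single score roughly as follows. Only the 50%, 25%, and 30% cut-offs come from the lists above; every weight, the other cut-offs, the feature keys, the tiered treatment of the two upper-head thresholds, and the default decision threshold are placeholder assumptions for illustration, not the tuned values.

```python
def placement_score(f):
    """Weighted sum of positive signals minus penalties (illustrative weights)."""
    score = 0.0

    # Positive signals.
    if f["upper_head_overlap"] >= 0.50:
        score += 0.30
    elif f["upper_head_overlap"] >= 0.25:
        score += 0.15
    if f["shemagh_top_above_head"]:
        score += 0.20
    if f["head_center_in_shemagh"]:
        score += 0.20
    if f["head_coverage"] >= 0.30:
        score += 0.15
    if f["horizontal_diff"] <= 0.25:      # "good alignment" cut-off is assumed
        score += 0.15

    # Penalties.
    if f["shemagh_center_below_head"]:
        score -= 0.30
    if f["head_coverage"] > 0.0 and f["upper_head_overlap"] == 0.0:
        score -= 0.25                      # overlap at the chin only
    if f["upper_head_overlap"] == 0.0:
        score -= 0.20
    if f["head_coverage"] < 0.10:          # "minimal coverage" cut-off is assumed
        score -= 0.15
    if f["horizontal_diff"] > 0.50:        # "poor alignment" cut-off is assumed
        score -= 0.15
    return score


def is_right_place(f, score_threshold=0.45):
    """score_threshold is a placeholder inside the swept 0.20 to 0.70 range."""
    return placement_score(f) >= score_threshold


# Hypothetical feature values for a worn shemagh and a neck scarf.
worn = {"upper_head_overlap": 1.0, "shemagh_top_above_head": True,
        "head_center_in_shemagh": True, "head_coverage": 1.0,
        "horizontal_diff": 0.0, "shemagh_center_below_head": False}
neck = {"upper_head_overlap": 0.0, "shemagh_top_above_head": False,
        "head_center_in_shemagh": False, "head_coverage": 0.2,
        "horizontal_diff": 0.0, "shemagh_center_below_head": True}
print(is_right_place(worn), is_right_place(neck))  # True False
```

Keeping the rules explicit like this makes each decision easy to inspect, which matters when there are too few positive examples to train a learned classifier.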
Key Techniques
RF-DETR Large
Transformer-based detector that outperforms YOLO on small datasets with complex spatial reasoning. Fine-tuned on the competition dataset with class-specific confidence thresholds.
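Class-specific thresholds amount to a per-label filter on the raw detections. The sketch below uses the values from the pipeline (head 0.30, shemagh 0.15); the `Detection` type and function name are hypothetical and are not part of the RF-DETR API.

```python
from typing import NamedTuple, Tuple


class Detection(NamedTuple):
    label: str                               # "head" or "shemagh"
    score: float                             # detector confidence
    box: Tuple[float, float, float, float]   # (x1, y1, x2, y2)


CLASS_THRESHOLDS = {"head": 0.30, "shemagh": 0.15}


def filter_detections(detections, thresholds=CLASS_THRESHOLDS):
    """Keep detections whose score meets their own class threshold.

    Detections with an unknown label are dropped.
    """
    return [
        d for d in detections
        if d.label in thresholds and d.score >= thresholds[d.label]
    ]


# Hypothetical raw detector output.
raw = [
    Detection("head", 0.25, (10, 10, 50, 50)),     # below 0.30, dropped
    Detection("head", 0.40, (60, 10, 100, 50)),    # kept
    Detection("shemagh", 0.20, (55, 0, 105, 40)),  # kept, above 0.15
    Detection("shemagh", 0.10, (5, 0, 55, 40)),    # dropped
]
kept = filter_detections(raw)
print([(d.label, d.score) for d in kept])  # [('head', 0.4), ('shemagh', 0.2)]
```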
TTA with Horizontal Flip
Shemagh is the minority class (27%). Running detection on a horizontally flipped copy gives the detector a second pass that can catch shemaghs missed in the original orientation; overlapping predictions from the two passes are merged with NMS.
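The merge step of flip-based TTA can be sketched as below. It assumes detections are (label, score, box) tuples with boxes in (x1, y1, x2, y2) pixel coordinates, and it uses a simple greedy per-class NMS with an IoU threshold of 0.5; both the helper names and that threshold are assumptions, since the actual merge settings are not given here.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    inter = max(0.0, iw) * max(0.0, ih)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def unflip_box(box, image_width):
    """Map a box predicted on the flipped image back to original coordinates."""
    x1, y1, x2, y2 = box
    return (image_width - x2, y1, image_width - x1, y2)


def nms(detections, iou_threshold=0.5):
    """Greedy NMS: keep the highest-scoring box, drop same-class overlaps."""
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        label, _, box = det
        if all(label != k[0] or iou(box, k[2]) < iou_threshold for k in kept):
            kept.append(det)
    return kept


def merge_tta(original_dets, flipped_dets, image_width, iou_threshold=0.5):
    mirrored = [(l, s, unflip_box(b, image_width)) for l, s, b in flipped_dets]
    return nms(original_dets + mirrored, iou_threshold)


# Hypothetical outputs for a 100 px wide image: the flipped pass re-finds the
# head and also picks up a shemagh that the original pass missed.
original = [("head", 0.9, (10, 10, 50, 50))]
flipped = [("head", 0.8, (50, 10, 90, 50)), ("shemagh", 0.4, (20, 0, 60, 30))]
merged = merge_tta(original, flipped, image_width=100)
print(merged)
```

After mirroring, the flipped head box lands exactly on the original one and is suppressed, while the extra shemagh detection survives.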
Threshold Tuning
Swept score_threshold from 0.20 to 0.70 to maximize F1 on the validation set. With extreme class imbalance (0.8% positives), threshold tuning was critical.
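A threshold sweep of this kind can be written in a few lines. The sketch assumes a step of 0.05 (the step size is not stated above) and takes per-pair classifier scores plus ground-truth labels from the validation set; the toy data is invented.

```python
def f1_score(y_true, y_pred):
    """Binary F1; returns 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def sweep_threshold(scores, labels, lo=0.20, hi=0.70, step=0.05):
    """Return (best_threshold, best_f1) over an inclusive grid from lo to hi."""
    best_thr, best_f1 = lo, -1.0
    n_steps = int(round((hi - lo) / step))
    for i in range(n_steps + 1):
        thr = round(lo + i * step, 4)   # avoid floating-point drift on the grid
        preds = [s >= thr for s in scores]
        f1 = f1_score(labels, preds)
        if f1 > best_f1:                # ties keep the lowest threshold
            best_thr, best_f1 = thr, f1
    return best_thr, best_f1


# Toy validation set with a single positive, mirroring the imbalance.
val_scores = [0.9, 0.5, 0.4, 0.1]
val_labels = [True, False, False, False]
print(sweep_threshold(val_scores, val_labels))  # (0.55, 1.0)
```

With so few positives, F1 moves in large jumps as the threshold crosses individual samples, so the selected value should be read as a rough estimate rather than a precise optimum.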
Dataset Insights
Train Images
Test Images
Head Class
Shemagh Class
Extreme Classification Imbalance
right_place=TRUE appeared in only 5 of the 651 training images (0.8%). With so few positives, it was very hard to build a classifier for this sub-task that generalizes.
Duplicate Detection
Found train-test duplicate pairs using CNN-based similarity, which helped increase confidence on some known samples.
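One common way to do this kind of matching is to compare image embeddings with cosine similarity. The sketch below assumes the embeddings have already been extracted by some CNN backbone (which backbone was used is not stated here) and uses a placeholder similarity cut-off of 0.95; the image IDs and vectors are invented.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def find_duplicates(train_embeddings, test_embeddings, threshold=0.95):
    """Return (train_id, test_id, similarity) triples above the threshold.

    Both arguments map an image ID to its embedding vector.
    """
    pairs = []
    for train_id, train_vec in train_embeddings.items():
        for test_id, test_vec in test_embeddings.items():
            sim = cosine_similarity(train_vec, test_vec)
            if sim >= threshold:
                pairs.append((train_id, test_id, sim))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Tiny made-up embeddings; real CNN features would have hundreds of dimensions.
train = {"train_001": [1.0, 0.0, 0.0], "train_002": [0.0, 1.0, 0.0]}
test = {"test_101": [0.99, 0.01, 0.0], "test_102": [0.0, 0.0, 1.0]}
dupes = find_duplicates(train, test)
print([(a, b) for a, b, _ in dupes])  # [('train_001', 'test_101')]
```

The brute-force double loop is fine at this scale (651 × 842 pairs); larger datasets would call for a vectorized or approximate nearest-neighbor search.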
Stratified Validation
Used a stratified split to ensure at least one positive example in the validation set, which is critical for meaningful threshold tuning.
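A split with that guarantee can be sketched as follows. The 80/20 ratio, the seed, and the function name are assumptions; the only property taken from the text is that the validation set must contain at least one image with right_place=TRUE.

```python
import random


def stratified_split(image_ids, has_positive, val_fraction=0.2, seed=0):
    """Split image IDs so positives and negatives are sampled separately.

    has_positive maps each image ID to True if the image contains at least
    one right_place=TRUE annotation. At least one positive image goes to
    validation whenever any positive exists.
    """
    rng = random.Random(seed)
    positives = [i for i in image_ids if has_positive[i]]
    negatives = [i for i in image_ids if not has_positive[i]]
    rng.shuffle(positives)
    rng.shuffle(negatives)

    n_val_pos = max(1, round(len(positives) * val_fraction)) if positives else 0
    n_val_neg = round(len(negatives) * val_fraction)

    val = positives[:n_val_pos] + negatives[:n_val_neg]
    train = positives[n_val_pos:] + negatives[n_val_neg:]
    return train, val


# Same proportions as the training set: 651 images, 5 of them positive.
ids = list(range(651))
labels = {i: i < 5 for i in ids}
train_ids, val_ids = stratified_split(ids, labels)
print(len(train_ids), len(val_ids), sum(labels[i] for i in val_ids))  # 521 130 1
```

With only five positives, a plain random 80/20 split leaves the validation set with no positives at all roughly a third of the time, in which case F1 cannot be tuned.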
Results & Reflection
89.6%
Public Leaderboard
85%
Private Leaderboard
5th Place
“The rule-based spatial classifier worked surprisingly well, but the extreme lack of positive examples made generalization difficult. With more labeled data, this approach could likely be pushed further.”