5th Place

DAL Shemagh Detection

Object Detection & Spatial Classification

89.6%

Public LB

85%

Private LB

Classes Detected

RF-DETR · PyTorch · Computer Vision · Spatial Reasoning

Challenge Overview

An object detection and classification challenge: detect heads and shemaghs in images with bounding boxes, and determine whether the shemagh is worn correctly on the head. The evaluation combines detection accuracy (mAP) with classification F1-score.

Task

Detect heads and shemaghs with bounding boxes, then classify correct shemagh placement

Evaluation

Final Score = 0.5 × mAP@[0.5:0.95] + 0.5 × F1-Score
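As a small worked example of the formula above, the snippet below blends the two components with equal weight. The function name and the input values are illustrative assumptions, not the competition's actual component scores.

```python
def final_score(map_50_95: float, f1: float) -> float:
    """Equal-weight blend of detection mAP@[0.5:0.95] and classification F1."""
    return 0.5 * map_50_95 + 0.5 * f1


# Illustrative inputs only (assumed values, not real leaderboard components).
example = final_score(map_50_95=0.80, f1=0.60)
print(round(example, 4))
```

Because both terms carry the same weight, a weak F1 on the rare positive class costs as much as an equally weak mAP.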

Dataset

651 train images, 842 test images

Key Challenge

Extreme class imbalance: only 5 of 651 images had right_place=TRUE (0.8%).

Pipeline Architecture

🖼️

Test Image

Input image passed into the detection pipeline.

🔍

RF-DETR Large

Fine-tuned RF-DETR Large as the main detector to find heads and shemaghs.

📦

Head Boxes + Shemagh Boxes

Per-class confidence thresholds applied: head at 0.30, shemagh at 0.15 (lower because shemagh is the minority class).

🔄

TTA - Horizontal Flip

Flip the image horizontally, run detection again, mirror the boxes back, and merge both sets with NMS. This gives the detector a second chance at objects in unusual orientations.

📐

Spatial Rule-Based Classifier

9 spatial features computed per (head, shemagh) pair. Weighted sum compared against a tuned threshold to classify correct placement.

✅

right_place: True / False

Final binary classification output for each detected shemagh-head pair.

Deep Dive

The Spatial Classifier

The most interesting technical challenge: classifying whether a shemagh is correctly worn on the head using only bounding box geometry.

Why not just IoU?

IoU alone was not enough. A correctly worn shemagh can cover a much larger area than the head, which can still produce a low IoU even when the placement is correct. To handle this, I used nine spatial features that describe the geometric relationship between the two boxes.
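To make the point concrete, here is a minimal sketch with made-up boxes in (x1, y1, x2, y2) pixel format: the head lies entirely inside a much larger shemagh box, so head coverage is complete while IoU stays low. The helper names and coordinates are assumptions for illustration.

```python
def area(box):
    """Area of an (x1, y1, x2, y2) box; zero if the box is degenerate."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def intersection(a, b):
    """Area of the overlap between two boxes."""
    return area((max(a[0], b[0]), max(a[1], b[1]),
                 min(a[2], b[2]), min(a[3], b[3])))


def iou(a, b):
    inter = intersection(a, b)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def head_coverage(head, shemagh):
    """Fraction of the head box that the shemagh box covers."""
    return intersection(head, shemagh) / area(head)


# Hypothetical boxes: the shemagh drapes well past the head on all sides.
head = (40, 20, 80, 70)
shemagh = (20, 10, 100, 160)

print(f"IoU = {iou(head, shemagh):.3f}")
print(f"Head coverage = {head_coverage(head, shemagh):.3f}")
```

With these boxes the IoU is one sixth even though the head is fully covered, which is why a single IoU cut-off would reject correctly worn shemaghs.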

Computed Spatial Features

Head Coverage

How much of the head box is covered by the shemagh box.

Upper Head Overlap

Measures how much of the upper half of the head is covered, which turned out to be one of the strongest signals.

Shemagh Top Above Head

Checks whether the top of the shemagh box is above or aligned with the top of the head box.

Head Center in Shemagh

Checks whether the head center falls inside the shemagh box.

Horizontal Difference

Normalized distance between the horizontal centers of the two boxes.

Shemagh Below Head

Measures how much of the shemagh extends below the head, which can indicate that it is around the neck rather than worn correctly.
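The six features listed above could be computed roughly as follows. This is a sketch under stated assumptions: boxes are (x1, y1, x2, y2) in image coordinates with y increasing downward, both boxes have non-zero size, and the exact normalizations (for example dividing the horizontal distance by head width) are my guesses rather than the definitions used in the solution.

```python
def _overlap_area(a, b):
    """Overlap area of two (x1, y1, x2, y2) boxes."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0.0, w) * max(0.0, h)


def spatial_features(head, shemagh):
    """Geometric features for one (head, shemagh) pair. Definitions are assumed."""
    hx1, hy1, hx2, hy2 = head
    sx1, sy1, sx2, sy2 = shemagh
    head_w, head_h = hx2 - hx1, hy2 - hy1
    head_area = head_w * head_h
    upper_head = (hx1, hy1, hx2, hy1 + head_h / 2)  # top half of the head box
    head_cx, head_cy = (hx1 + hx2) / 2, (hy1 + hy2) / 2
    shemagh_cx = (sx1 + sx2) / 2
    return {
        "head_coverage": _overlap_area(head, shemagh) / head_area,
        "upper_head_overlap": _overlap_area(upper_head, shemagh) / (head_area / 2),
        # Smaller y means higher in the image.
        "shemagh_top_above_head": sy1 <= hy1,
        "head_center_in_shemagh": sx1 <= head_cx <= sx2 and sy1 <= head_cy <= sy2,
        "horizontal_diff": abs(head_cx - shemagh_cx) / head_w,
        # Share of the shemagh's height that hangs below the head box.
        "shemagh_below_head": max(0.0, sy2 - hy2) / (sy2 - sy1),
    }


# Hypothetical examples: one worn over the head, one sitting around the neck.
worn = spatial_features(head=(40, 20, 80, 70), shemagh=(30, 10, 90, 120))
neck = spatial_features(head=(40, 20, 80, 70), shemagh=(30, 60, 90, 130))
print(worn)
print(neck)
```

In the neck example the upper-head overlap drops to zero while most of the shemagh sits below the head, which is the pattern the penalties are meant to catch.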

Positive Signals

Upper head overlap ≥ 50%
Upper head overlap ≥ 25%
Shemagh top above head top
Head center inside shemagh
Head coverage ≥ 30%
Good horizontal alignment

Negative Signals (Penalties)

Shemagh center below head
Overlap at chin only
No upper head overlap
Minimal head coverage
Poor horizontal alignment
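The positive signals and penalties above can be combined into a single weighted score, sketched below. Only the 50%, 25%, and 30% cut-offs come from the lists; every weight, the remaining cut-offs, the default threshold, and the feature dictionary keys are hypothetical placeholders. The "overlap at chin only" penalty is left out because its exact definition is not given.

```python
def placement_score(f):
    """Weighted sum of rule-based signals. All weights are assumed values."""
    score = 0.0
    # Positive signals
    if f["upper_head_overlap"] >= 0.50:
        score += 0.30
    elif f["upper_head_overlap"] >= 0.25:
        score += 0.15
    if f["shemagh_top_above_head"]:
        score += 0.20
    if f["head_center_in_shemagh"]:
        score += 0.20
    if f["head_coverage"] >= 0.30:
        score += 0.15
    if f["horizontal_diff"] <= 0.25:      # assumed cut-off for "good alignment"
        score += 0.15
    # Penalties
    if f["shemagh_center_below_head"]:
        score -= 0.30
    if f["upper_head_overlap"] == 0.0:
        score -= 0.20
    if f["head_coverage"] < 0.10:         # assumed cut-off for "minimal coverage"
        score -= 0.15
    if f["horizontal_diff"] > 0.50:       # assumed cut-off for "poor alignment"
        score -= 0.15
    return score


def is_right_place(f, score_threshold=0.5):
    """Classify placement by comparing the score against a tuned threshold."""
    return placement_score(f) >= score_threshold


# Hypothetical feature values for a worn shemagh and one around the neck.
worn = {"upper_head_overlap": 1.0, "shemagh_top_above_head": True,
        "head_center_in_shemagh": True, "head_coverage": 1.0,
        "horizontal_diff": 0.0, "shemagh_center_below_head": False}
neck = {"upper_head_overlap": 0.0, "shemagh_top_above_head": False,
        "head_center_in_shemagh": False, "head_coverage": 0.2,
        "horizontal_diff": 0.0, "shemagh_center_below_head": True}

print(is_right_place(worn), is_right_place(neck))
```

Keeping the rules as explicit weighted terms means the classifier needs no training data beyond the handful of positives used to tune the threshold.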

Key Techniques

🔍

RF-DETR Large

Transformer-based detector, chosen over YOLO for this small dataset with complex spatial relationships. Fine-tuned on the competition dataset and used with class-specific confidence thresholds.
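The class-specific thresholds (head at 0.30, shemagh at 0.15) amount to a simple post-filter on the detector output. The sketch below assumes each detection is a dictionary with `label`, `score`, and `box` keys; that layout is hypothetical and not the RF-DETR output format.

```python
# Thresholds from the pipeline description; the detection layout is assumed.
CLASS_THRESHOLDS = {"head": 0.30, "shemagh": 0.15}


def filter_by_class_threshold(detections, thresholds=CLASS_THRESHOLDS):
    """Keep detections whose score meets the threshold for their own class."""
    return [d for d in detections if d["score"] >= thresholds[d["label"]]]


# Made-up detections for illustration.
detections = [
    {"label": "head", "score": 0.28, "box": (10, 10, 60, 70)},
    {"label": "head", "score": 0.91, "box": (100, 20, 150, 80)},
    {"label": "shemagh", "score": 0.18, "box": (95, 10, 160, 120)},
    {"label": "shemagh", "score": 0.10, "box": (5, 5, 65, 90)},
]
kept = filter_by_class_threshold(detections)
for d in kept:
    print(d["label"], d["score"])
```

Note that the shemagh at 0.18 survives while the head at 0.28 does not: the lower bar for the minority class trades some precision for recall.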

🔄

TTA with Horizontal Flip

Shemagh is the minority class (27%). Running detection on a horizontally flipped copy gives the detector a second chance at shemaghs in unusual orientations, and NMS merges the overlapping predictions.
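A minimal sketch of the flip-and-merge step, assuming detections are `(label, score, box)` tuples with boxes in (x1, y1, x2, y2) pixels. The greedy per-class NMS and the 0.5 IoU threshold are assumptions; the detector call itself is omitted, so the flipped-image detections are passed in directly.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def unflip_box(box, image_width):
    """Map a box found on the horizontally flipped image back to the original."""
    x1, y1, x2, y2 = box
    return (image_width - x2, y1, image_width - x1, y2)


def nms(detections, iou_threshold=0.5):
    """Greedy NMS within each class, keeping the highest-scoring box."""
    kept = []
    for det in sorted(detections, key=lambda d: d[1], reverse=True):
        label, _, box = det
        overlaps_kept = any(
            k_label == label and iou(box, k_box) >= iou_threshold
            for k_label, _, k_box in kept
        )
        if not overlaps_kept:
            kept.append(det)
    return kept


def merge_tta(original_dets, flipped_dets, image_width, iou_threshold=0.5):
    """Mirror flipped-image detections back, then merge both sets with NMS."""
    mirrored = [(l, s, unflip_box(b, image_width)) for l, s, b in flipped_dets]
    return nms(original_dets + mirrored, iou_threshold)


# Made-up detections on a 200-pixel-wide image.
original = [("head", 0.90, (50, 20, 100, 80))]
flipped = [("head", 0.85, (100, 20, 150, 80)),      # same head, seen mirrored
           ("shemagh", 0.40, (20, 10, 80, 120))]    # only found on the flip
merged = merge_tta(original, flipped, image_width=200)
print(merged)
```

Here the mirrored head duplicates the original detection and is suppressed, while the shemagh that only appeared on the flipped pass is kept.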

🎯

Threshold Tuning

Swept score_threshold from 0.20 to 0.70 to maximize F1 on the validation set. With extreme class imbalance (0.8% positives), threshold tuning was critical.
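The sweep can be sketched as a plain grid search over `score_threshold`. The 0.20 to 0.70 range comes from the text; the 0.05 step size, the helper names, and the validation scores below are assumptions.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class; returns 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def sweep_threshold(scores, labels, start=0.20, stop=0.70, step=0.05):
    """Return the (threshold, F1) pair with the best F1; ties keep the lowest threshold."""
    best_t, best_f1 = start, -1.0
    n_steps = int(round((stop - start) / step))
    for i in range(n_steps + 1):
        t = start + i * step
        preds = [s >= t for s in scores]
        f1 = f1_score(labels, preds)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Made-up validation set with a single positive, mirroring the imbalance.
val_scores = [0.10, 0.33, 0.42, 0.57, 0.81]
val_labels = [False, False, False, False, True]
best_t, best_f1 = sweep_threshold(val_scores, val_labels)
print(f"best threshold ~ {best_t:.2f}, F1 = {best_f1:.2f}")
```

With so few positives, F1 moves in large jumps as the threshold changes, so the chosen value is sensitive to which positive lands in the validation set.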

Dataset Insights

651

Train Images

842

Test Images

Head Class

Shemagh Class

Extreme Classification Imbalance

right_place=TRUE appeared in only 5 of the 651 training images (0.8%), which made it extremely hard for the classification sub-task to generalize.

Duplicate Detection

Found train↔test duplicate pairs using CNN-based similarity, which helped increase confidence on some known samples.
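One way such a duplicate search could work is to compare image embeddings by cosine similarity. The sketch below assumes the embeddings have already been extracted by some CNN; the dictionary layout, the image IDs, the tiny 3-dimensional vectors, and the 0.95 similarity cut-off are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def find_duplicates(train_emb, test_emb, min_similarity=0.95):
    """Return (train_id, test_id, similarity) for pairs above the cut-off."""
    pairs = []
    for train_id, tv in train_emb.items():
        for test_id, sv in test_emb.items():
            sim = cosine_similarity(tv, sv)
            if sim >= min_similarity:
                pairs.append((train_id, test_id, sim))
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Toy embeddings; real CNN features would have hundreds of dimensions.
train_emb = {"train_001": [1.0, 0.0, 0.2], "train_002": [0.0, 1.0, 0.0]}
test_emb = {"test_101": [0.98, 0.01, 0.21], "test_102": [0.5, 0.5, 0.5]}

duplicates = find_duplicates(train_emb, test_emb)
for train_id, test_id, sim in duplicates:
    print(train_id, test_id, round(sim, 4))
```

The brute-force double loop is fine at this scale (651 by 842 comparisons); a larger dataset would call for a vectorized or approximate nearest-neighbor search.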

Stratified Validation

Used a stratified split to ensure at least one positive example in the validation set, which is critical for meaningful threshold tuning.
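A minimal sketch of such a split, which samples positives and negatives separately and forces at least one positive into validation. The 20% validation fraction, the seed, and the synthetic image IDs are assumptions; only the 651-image and 5-positive counts come from the dataset.

```python
import random


def stratified_split(image_ids, has_positive, val_fraction=0.2, seed=42):
    """Split IDs so the validation set gets at least one positive, if any exist."""
    rng = random.Random(seed)
    positives = [i for i in image_ids if has_positive[i]]
    negatives = [i for i in image_ids if not has_positive[i]]
    rng.shuffle(positives)
    rng.shuffle(negatives)
    n_pos_val = max(1, round(len(positives) * val_fraction)) if positives else 0
    n_neg_val = round(len(negatives) * val_fraction)
    val = positives[:n_pos_val] + negatives[:n_neg_val]
    train = positives[n_pos_val:] + negatives[n_neg_val:]
    return train, val


# Synthetic IDs matching the dataset counts: 651 images, 5 with right_place=TRUE.
image_ids = list(range(651))
has_positive = {i: i < 5 for i in image_ids}

train_ids, val_ids = stratified_split(image_ids, has_positive)
print(len(train_ids), len(val_ids), sum(has_positive[i] for i in val_ids))
```

A purely random 20% split would leave the validation set with no positives roughly a third of the time, which would make F1 undefined for threshold tuning.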

Results & Reflection

89.6%

Public Leaderboard

85%

Private Leaderboard

5th Place

“The rule-based spatial classifier worked surprisingly well, but the extreme lack of positive examples made generalization difficult. With more labeled data, this approach could likely be pushed further.”
