Benchmark

Cross-Paradigm Leaderboard

Nine methods are evaluated across four learning paradigms on the Industrial3D test set (Areas 6 + 12, 84.9M points). The metric is mean IoU (mIoU) over 12 classes, computed with vote-based evaluation.

Test Set

Areas 6 + 12

84.9M points, 2 areas from different facilities

Metric

mIoU (12-class)

Mean intersection over union across all classes
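
As a rough illustration of the metric, the sketch below computes per-class IoU and mIoU from a confusion matrix using the standard definition IoU = TP / (TP + FP + FN). The function names are hypothetical, and scoring a class that is absent from both ground truth and predictions as 0.0 is an assumption, not necessarily the benchmark's convention.

```python
def confusion_matrix(gt, pred, num_classes):
    """Count (ground truth, prediction) pairs; rows index ground-truth classes."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        cm[g][p] += 1
    return cm


def per_class_iou(cm):
    """Return IoU_c = TP / (TP + FP + FN) for every class c."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                        # GT is c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp   # predicted c, GT is otherwise
        denom = tp + fp + fn
        # Assumption: a class with no GT points and no predictions scores 0.0.
        ious.append(tp / denom if denom else 0.0)
    return ious


def mean_iou(cm):
    """Unweighted mean of the per-class IoU values."""
    ious = per_class_iou(cm)
    return sum(ious) / len(ious)


# Toy example with 3 classes (the benchmark itself uses 12).
gt = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(gt, pred, num_classes=3)
print(per_class_iou(cm), mean_iou(cm))
```

Because the mean is unweighted, a tail class with near-zero IoU pulls mIoU down as much as a head class would, regardless of how few points it contains.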

Evaluation

Vote-Based

test_smooth = 0.95, multi-pass inference
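
The page gives only the parameter value and the phrase "multi-pass inference". One common way such a parameter is used in point-cloud segmentation testers is as an exponential moving average over repeated predictions for the same point; the sketch below assumes that interpretation. The names update_votes and predict are hypothetical, not taken from the benchmark code.

```python
def update_votes(votes, point_ids, probs, test_smooth=0.95):
    """Blend new class probabilities into the running per-point votes, in place.

    votes:     list of per-point lists of class scores (initialised to zeros)
    point_ids: indices of the points covered by this inference pass
    probs:     class probabilities predicted for those points
    """
    for pid, p in zip(point_ids, probs):
        votes[pid] = [
            test_smooth * old + (1.0 - test_smooth) * new
            for old, new in zip(votes[pid], p)
        ]


def predict(votes):
    """Final label per point: the class with the highest accumulated score."""
    return [max(range(len(v)), key=v.__getitem__) for v in votes]


# Two points, two classes; point 0 is visited in both passes.
votes = [[0.0, 0.0], [0.0, 0.0]]
update_votes(votes, [0, 1], [[0.2, 0.8], [0.9, 0.1]])  # pass 1
update_votes(votes, [0], [[0.3, 0.7]])                 # pass 2
print(predict(votes))
```

Under this assumption, a high test_smooth means each individual pass moves the accumulated score only slightly, so a point's final label reflects many overlapping predictions rather than the most recent one.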

Split Protocol

Area-Based

S3DIS-style, no room-level leakage

Framework

PyTorch

Unified codebase for all methods; methods originally released in TensorFlow were re-implemented

Val Set

Area 9

15.1M points, used for hyperparameter selection only
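
A minimal sketch of an area-based split follows. The test areas (6 and 12) and validation area (9) come from this page; treating every other area as training data is an assumption, and the area and scene names in the example are invented for illustration.

```python
TEST_AREAS = {6, 12}   # from the benchmark description
VAL_AREAS = {9}        # used for hyperparameter selection only


def assign_split(area_id):
    """Map an area to a split; a whole area always lands in a single split."""
    if area_id in TEST_AREAS:
        return "test"
    if area_id in VAL_AREAS:
        return "val"
    return "train"  # assumption: all remaining areas are used for training


def split_scenes(scenes):
    """Group (area_id, scene_name) pairs by split."""
    splits = {"train": [], "val": [], "test": []}
    for area_id, name in scenes:
        splits[assign_split(area_id)].append(name)
    return splits


# Hypothetical scene list.
scenes = [(1, "area1_room_a"), (6, "area6_room_a"),
          (9, "area9_room_a"), (12, "area12_room_a")]
print(split_scenes(scenes))
```

Splitting on the area identifier rather than on individual rooms is what prevents rooms from one area appearing on both sides of the train/test boundary.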

Results

All Methods


Sorted by mIoU (descending).

#	Method	Paradigm	Labels	mIoU (%)	OA (%)	Year	Code / Status

Results verified by the Industrial3D team. OA = overall accuracy. — = not reported.

Per-Class Breakdown

Per-Class IoU (12 Classes)

Vote-based per-class IoU for methods with full results available. Head classes dominate performance, while the tail classes (Reducer, Strainer, Pump, Elbow, Tee) stay near zero IoU for most methods.


Cyan = best per class among listed methods. 0.00 = zero IoU on that class. Only vote-based (smooth test) results shown.

Key Finding

The Domain Transfer Gap

39.95 pp

Best supervised method (Boundary-CB): 55.74% mIoU
Best zero-shot foundation model (Point-SAM One-vs-Rest): 15.79% mIoU

A 39.95 percentage-point gap quantifies the unresolved industrial domain transfer challenge. Even the oracle Point-SAM (21.08%), which is given ground-truth mask proposals, trails the best supervised method by 34.66 pp. The gap reflects a dual crisis: foundation models trained on architectural and outdoor data must distinguish geometrically ambiguous MEP fittings, and they must do so under extreme class imbalance.

Notably, SQN trained with only 0.1% of the labels (44.29%) outperforms fully supervised RandLA-Net (39.83%) by 4.46 pp, suggesting that the inductive bias of sparse-label training may be better suited to this domain than dense supervision with vanilla architectures.
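
The gaps quoted in this section are plain differences of mIoU percentages, expressed in percentage points. The short check below recomputes them from the figures stated above; the dictionary keys and the gap_pp helper are illustrative names only.

```python
# mIoU (%) values as quoted in the Key Finding text.
MIOU = {
    "Boundary-CB": 55.74,            # best supervised
    "Point-SAM One-vs-Rest": 15.79,  # best zero-shot foundation model
    "Point-SAM (oracle)": 21.08,     # given GT mask proposals
    "SQN (0.1% labels)": 44.29,
    "RandLA-Net": 39.83,             # fully supervised
}


def gap_pp(a, b):
    """Difference in percentage points between two methods' mIoU."""
    return round(MIOU[a] - MIOU[b], 2)


print(gap_pp("Boundary-CB", "Point-SAM One-vs-Rest"))  # domain transfer gap
print(gap_pp("Boundary-CB", "Point-SAM (oracle)"))     # gap to the oracle variant
print(gap_pp("SQN (0.1% labels)", "RandLA-Net"))       # sparse vs. dense labels
```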

Have better results?

Submit your method's results to the leaderboard. We welcome results from all paradigms, including few-shot, semi-supervised, and multi-modal approaches.
