Under Review — Preview materials available. Full dataset & code released upon journal acceptance.
TLS · MEP · Industrial Infrastructure

Industrial3D

A Terrestrial LiDAR Point Cloud Dataset and Cross-Paradigm Benchmark for Industrial Infrastructure

Industrial3D graphical abstract: dataset overview and benchmark framework
612.7M Labeled Points
12 Semantic Classes
13 Dataset Areas
7 Facilities
6 mm TLS Resolution
754 Person-hours
215:1 Class Imbalance

See the Data Move

Ground-truth annotated point clouds from four representative rooms. Each color encodes one of the 12 MEP classes.

TRAIN
Service Gallery Area 2 · 79.6M pts
TEST
SPH Pump Room Area 12 · test set
TEST
93m PSU Room Area 6 · test set
TRAIN
93m Tank Area Area 3 · facility scale

Overview

Automated semantic understanding of dense terrestrial laser scanning (TLS) point clouds is a prerequisite for Scan-to-BIM, digital twin maintenance, and as-built verification. Yet for operational industrial mechanical, electrical, and plumbing (MEP) facilities, the problem remains largely unsolved: TLS scans of water treatment facilities exhibit strong geometric ambiguity, severe occlusion, and extreme class imbalance that architectural benchmarks such as S3DIS and ScanNet cannot adequately represent. We present Industrial3D, a terrestrial LiDAR dataset with 612.7 million expert-labeled points at 6 mm resolution from 20 room scenes, 13 dataset areas, and 7 operational water treatment facilities. At 6.6× the scale of the closest comparable MEP dataset, Industrial3D provides the largest industrial MEP testbed for within-domain scene understanding. We further establish a cross-paradigm benchmark of nine methods across fully supervised, weakly supervised, unsupervised, and foundation-model settings. The best supervised method reaches 55.74% mIoU, whereas zero-shot Point-SAM reaches 15.79% mIoU, a gap of 39.95 percentage points that quantifies unresolved domain transfer for industrial TLS data. Our analysis attributes this gap to a dual crisis: 215:1 statistical rarity, and cylindrical geometric ambiguity between tail classes and head-class pipes.

What We Offer

Four contributions that advance the state of industrial 3D scene understanding.

🗂

Largest Industrial MEP Dataset

612.7M expert-labeled points at 6 mm TLS resolution across 12 specialized classes, 20 room scenes, and 13 areas from 7 operational water treatment facilities. 6.6× the scale of the closest comparable MEP dataset.

🏆

First Cross-Paradigm Benchmark

9 representative methods evaluated across fully supervised, weakly supervised, unsupervised, and foundation model settings under a unified PyTorch codebase with reproducible configurations.

🔬

Systematic Challenge Analysis

Quantifies three persistent industrial challenges (extreme class imbalance, the effects of sparse supervision, and the foundation-model domain adaptation gap) and connects the benchmark results to their underlying causes.

📐

Quantified Domain Transfer Gap

39.95 pp gap between the best supervised method (55.74% mIoU) and the zero-shot foundation model (15.79% mIoU). SQN trained with only 0.1% of the labels matches or exceeds full supervision within the same backbone family.
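The gap above is stated in percentage points, which is a plain difference of two percentages and not a relative change. A minimal sketch of that arithmetic, using only the two mIoU figures quoted above; the variable names are illustrative, and the relative drop is a derived figure that the page itself does not report:

```python
# mIoU figures quoted in the contribution summary above (in percent).
best_supervised = 55.74
zero_shot_foundation = 15.79

# Percentage-point gap: the difference between the two percentages.
gap_pp = round(best_supervised - zero_shot_foundation, 2)

# The same gap relative to the supervised score (a fraction, not pp).
# Derived here for illustration only.
relative_drop = (best_supervised - zero_shot_foundation) / best_supervised

print(f"gap: {gap_pp} pp, relative drop: {relative_drop:.1%}")
```

Keeping the two quantities separate avoids reading "39.95 pp" as "39.95% worse".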

The Dual Crisis

Industrial TLS data exposes two compounding failure modes that architectural benchmarks cannot represent.

215:1
Statistical Rarity
Extreme class imbalance between the head class (rectangular beam) and the rarest tail class (strainer) — 3.5× more severe than S3DIS (62:1) and 2.3× worse than construction-domain datasets. Standard frequency-based re-weighting is insufficient.
86%
Cylindrical Geometric Ambiguity
86% of tail-class points share cylindrical primitive shapes with head-class pipes. Valves, flanges, elbows, and reducers are geometrically confusable with pipes at typical TLS resolution, creating systematic confusion that re-weighting alone cannot resolve.
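To make the 215:1 figure and the re-weighting baseline concrete, here is a minimal sketch of how a head-to-tail imbalance ratio and inverse-frequency class weights can be computed from per-point labels. The label counts are toy values chosen to reproduce a 215:1 ratio and are not Industrial3D statistics. The function names and the mean-1 normalization are assumptions, not the benchmark's implementation.

```python
from collections import Counter

def imbalance_ratio(labels):
    """Ratio of the most frequent class count to the least frequent one."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())

def inverse_frequency_weights(labels):
    """Per-class weights proportional to 1 / frequency, scaled to mean 1."""
    counts = Counter(labels)
    total = sum(counts.values())
    raw = {cls: total / n for cls, n in counts.items()}
    mean = sum(raw.values()) / len(raw)
    return {cls: w / mean for cls, w in raw.items()}

# Toy labels (hypothetical counts, NOT the dataset's real distribution).
labels = ["beam"] * 430 + ["pipe"] * 200 + ["strainer"] * 2

ratio = imbalance_ratio(labels)              # 430 / 2
weights = inverse_frequency_weights(labels)  # rarest class gets the largest weight

print(f"imbalance ratio: {ratio:.0f}:1")
for cls, w in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{cls:>10s}  weight = {w:.4f}")
```

Weights of this kind only rescale each class's contribution to the loss. They do not change how similar a valve or flange looks to a pipe, which is consistent with the claim above that re-weighting alone cannot resolve the geometric confusion.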

Best Results Per Paradigm

9 methods evaluated across 4 learning paradigms on the Industrial3D test set (Areas 6 + 12, 84.9M points).

| Paradigm | Best Method | Labels | mIoU |
|---|---|---|---|
| Fully-Supervised | Boundary-CB | 100% | 55.74% |
| Weakly-Supervised | SQN | 0.1% | 44.29% |
| Unsupervised | GrowSP | 0% | 11.73% |
| Foundation Model | Point-SAM (Oracle) | 0% (zero-shot) | 21.08% |
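For readers unfamiliar with the metric, mIoU (mean intersection-over-union) is commonly computed per class from a confusion matrix and then averaged. The sketch below follows that standard definition on a toy 3-class matrix. It is not the benchmark's evaluation code, and skipping classes with no points is an assumption the page does not confirm.

```python
def per_class_iou(conf):
    """conf[i][j] = number of points with true class i predicted as class j."""
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                        # true c, predicted otherwise
        fp = sum(conf[r][c] for r in range(n)) - tp   # predicted c, true otherwise
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious

def mean_iou(conf):
    """Average IoU over classes that occur in the labels or the predictions."""
    valid = [v for v in per_class_iou(conf) if v == v]  # v != v only for NaN
    return sum(valid) / len(valid)

# Toy confusion matrix (hypothetical counts, not benchmark results).
conf = [
    [8, 2, 0],
    [1, 6, 1],
    [0, 1, 1],
]

ious = per_class_iou(conf)
miou = mean_iou(conf)
print([round(v, 4) for v in ious], round(miou, 4))
```

Because every class contributes equally to the mean, a rare class with low IoU lowers mIoU as much as a frequent one, which is why the metric is sensitive to tail-class performance.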

7 Water Treatment Facilities

Real operational industrial environments with authentic occlusion, noise, and mechanical complexity.

Industrial water treatment facility: UAV photography and annotated point cloud examples

Facility-scale overview showing UAV photography and color-coded annotated point clouds from operational water treatment facilities in Hong Kong.


Authors

Chao Yin
HKUST · Guangzhou Inst. of Geography
Hongzhe Yue
Southeast University
Qing Han
Hengyang Normal University
Difeng Hu
City University of Hong Kong
Zhenyu Liang
HKUST
Fangzhou Lin
HKUST
Bing Sun
HKUST
Boyu Wang
NYU Abu Dhabi
Mingkai Li
National Univ. of Singapore
Wei Yao
Chinese Academy of Sciences (IUE)
Jack C.P. Cheng
HKUST

✉ Corresponding author: Difeng Hu <difenghu@cityu.edu.hk>