Pointcept is a powerful and flexible codebase for point cloud perception research. It is also the official implementation of the following papers:
Point Transformer V3: Simpler, Faster, Stronger
Xiaoyang Wu, Li Jiang, Peng-Shuai Wang, Zhijian Liu, Xihui Liu, Yu Qiao, Wanli Ouyang, Tong He, Hengshuang Zhao
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2024 - Oral
[Backbone] [PTv3] - [arXiv] [bib] [Project] → here
OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation
Bohao Peng, Xiaoyang Wu, Li Jiang, Yukang Chen, Hengshuang Zhao, Zhuotao Tian, Jiaya Jia
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2024
[Backbone] [OA-CNNs] - [arXiv] [bib] → here
Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training
Xiaoyang Wu, Zhuotao Tian, Xin Wen, Bohao Peng, Xihui Liu, Kaicheng Yu, Hengshuang Zhao
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2024
[Pretrain] [PPT] - [arXiv] [bib] → here
Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning
Xiaoyang Wu, Xin Wen, Xihui Liu, Hengshuang Zhao
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2023
[Pretrain] [MSC] - [arXiv] [bib] → here
Learning Context-aware Classifier for Semantic Segmentation (3D Part)
Zhuotao Tian, Jiequan Cui, Li Jiang, Xiaojuan Qi, Xin Lai, Yixin Chen, Shu Liu, Jiaya Jia
AAAI Conference on Artificial Intelligence (AAAI) 2023 - Oral
[SemSeg] [CAC] - [arXiv] [bib] [2D Part] → here
Point Transformer V2: Grouped Vector Attention and Partition-based Pooling
Xiaoyang Wu, Yixing Lao, Li Jiang, Xihui Liu, Hengshuang Zhao
Conference on Neural Information Processing Systems (NeurIPS) 2022
[Backbone] [PTv2] - [arXiv] [bib] → here
Point Transformer
Hengshuang Zhao, Li Jiang, Jiaya Jia, Philip Torr, Vladlen Koltun
IEEE International Conference on Computer Vision (ICCV) 2021 - Oral
[Backbone] [PTv1] - [arXiv] [bib] → here
Additionally, Pointcept integrates the following excellent works (covering the above):
Backbone: MinkUNet (here), SpUNet (here), SPVCNN (here), OA-CNNs (here), PTv1 (here), PTv2 (here), PTv3 (here), StratifiedFormer (here), OctFormer (here), Swin3D (here);
Semantic Segmentation: Mix3D (here), CAC (here);
Instance Segmentation: PointGroup (here);
Pre-training: PointContrast (here), Contrastive Scene Contexts (here), Masked Scene Contrast (here), Point Prompt Training (here);
Datasets: ScanNet (here), ScanNet200 (here), ScanNet++ (here), S3DIS (here), Matterport3D (here), ArkitScenes, Structured3D (here), SemanticKITTI (here), nuScenes (here), ModelNet40 (here), Waymo (here).
If you find Pointcept useful to your research, please cite our work as encouragement. (੭ˊ꒳ˋ)੭✧
@misc{pointcept2023,
title={Pointcept: A Codebase for Point Cloud Perception Research},
author={Pointcept Contributors},
howpublished = {\url{https://github.com/Pointcept/Pointcept}},
year={2023}
}
conda create -n pointcept python=3.8 -y
conda activate pointcept
conda install ninja -y
# Choose version you want here: https://pytorch.org/get-started/previous-versions/
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch -y
conda install h5py pyyaml -c anaconda -y
conda install sharedarray tensorboard tensorboardx yapf addict einops scipy plyfile termcolor timm -c conda-forge -y
conda install pytorch-cluster pytorch-scatter pytorch-sparse -c pyg -y
pip install torch-geometric
# spconv (SparseUNet)
# refer https://github.com/traveller59/spconv
pip install spconv-cu113
# PPT (clip)
pip install ftfy regex tqdm
pip install git+https://github.com/openai/CLIP.git
# PTv1 & PTv2 or precise eval
cd libs/pointops
# usual
python setup.py install
# docker & multi GPU arch
TORCH_CUDA_ARCH_LIST="ARCH LIST" python setup.py install
# e.g. 7.5: RTX 3000; 8.0: A100. More available at: https://developer.nvidia.com/cuda-gpus
TORCH_CUDA_ARCH_LIST="7.5 8.0" python setup.py install
cd ../..
# Open3D (visualization, optional)
pip install open3d
This preprocessing supports semantic and instance segmentation for ScanNet20, ScanNet200, and ScanNet Data Efficient.
Download the ScanNet v2 dataset.
Run the preprocessing code for raw ScanNet as follows:
# RAW_SCANNET_DIR: the directory of downloaded ScanNet v2 raw dataset.
# PROCESSED_SCANNET_DIR: the directory of the processed ScanNet dataset (output dir).
python pointcept/datasets/preprocessing/scannet/preprocess_scannet.py --dataset_root ${RAW_SCANNET_DIR} --output_root ${PROCESSED_SCANNET_DIR}
(Optional) Download the ScanNet Data Efficient files:
# download-scannet.py is the official download script
# or follow instructions here: https://kaldir.vc.in.tum.de/scannet_benchmark/data_efficient/documentation#download
python download-scannet.py --data_efficient -o ${RAW_SCANNET_DIR}
# unzip downloads
cd ${RAW_SCANNET_DIR}/tasks
unzip limited-annotation-points.zip
unzip limited-reconstruction-scenes.zip
# copy files to processed dataset folder
mkdir ${PROCESSED_SCANNET_DIR}/tasks
cp -r ${RAW_SCANNET_DIR}/tasks/points ${PROCESSED_SCANNET_DIR}/tasks
cp -r ${RAW_SCANNET_DIR}/tasks/scenes ${PROCESSED_SCANNET_DIR}/tasks
(Alternative) Our preprocessed data can also be downloaded [here]; please agree to the official license before downloading.
Link the processed dataset to the codebase:
# PROCESSED_SCANNET_DIR: the directory of the processed ScanNet dataset.
mkdir data
ln -s ${PROCESSED_SCANNET_DIR} ${CODEBASE_DIR}/data/scannet
Download the ScanNet++ dataset and run the preprocessing code for raw ScanNet++ as follows:
# RAW_SCANNETPP_DIR: the directory of downloaded ScanNet++ raw dataset.
# PROCESSED_SCANNETPP_DIR: the directory of the processed ScanNet++ dataset (output dir).
# NUM_WORKERS: the number of workers for parallel preprocessing.
python pointcept/datasets/preprocessing/scannetpp/preprocess_scannetpp.py --dataset_root ${RAW_SCANNETPP_DIR} --output_root ${PROCESSED_SCANNETPP_DIR} --num_workers ${NUM_WORKERS}
Sample and chunk the large point clouds as follows:
# PROCESSED_SCANNETPP_DIR: the directory of the processed ScanNet++ dataset (output dir).
# NUM_WORKERS: the number of workers for parallel preprocessing.
python pointcept/datasets/preprocessing/sampling_chunking_data.py --dataset_root ${PROCESSED_SCANNETPP_DIR} --grid_size 0.01 --chunk_range 6 6 --chunk_stride 3 3 --split train --num_workers ${NUM_WORKERS}
python pointcept/datasets/preprocessing/sampling_chunking_data.py --dataset_root ${PROCESSED_SCANNETPP_DIR} --grid_size 0.01 --chunk_range 6 6 --chunk_stride 3 3 --split val --num_workers ${NUM_WORKERS}
Link the processed dataset to the codebase:
# PROCESSED_SCANNETPP_DIR: the directory of the processed ScanNet++ dataset.
mkdir data
ln -s ${PROCESSED_SCANNETPP_DIR} ${CODEBASE_DIR}/data/scannetpp
Download the S3DIS data by filling out this Google form. Download the Stanford3dDataset_v1.2.zip file and unzip it.
Fix the formatting error in Area_5/office_19/Annotations/ceiling at line 323474 (103.0.0000 => 103.000000).
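For reference, a minimal Python sketch of this one-line fix; the exact annotation filename under Annotations/ is an assumption, so adjust the path to your local copy:
# Hedged sketch: patch the malformed value on line 323474 (1-indexed).
path = "Area_5/office_19/Annotations/ceiling_1.txt"  # assumed filename
with open(path) as f:
    lines = f.readlines()
lines[323473] = lines[323473].replace("103.0.0000", "103.000000")
with open(path, "w") as f:
    f.writelines(lines)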
(Optional) Download the full 2D-3D S3DIS dataset (no XYZ) from here for parsing normals.
Run the preprocessing code for S3DIS as follows:
# S3DIS_DIR: the directory of downloaded Stanford3dDataset_v1.2 dataset.
# RAW_S3DIS_DIR: the directory of Stanford2d3dDataset_noXYZ dataset. (optional, for parsing normal)
# PROCESSED_S3DIS_DIR: the directory of processed S3DIS dataset (output dir).
# S3DIS without aligned angle
python pointcept/datasets/preprocessing/s3dis/preprocess_s3dis.py --dataset_root ${S3DIS_DIR} --output_root ${PROCESSED_S3DIS_DIR}
# S3DIS with aligned angle
python pointcept/datasets/preprocessing/s3dis/preprocess_s3dis.py --dataset_root ${S3DIS_DIR} --output_root ${PROCESSED_S3DIS_DIR} --align_angle
# S3DIS with normal vector (recommended, normal is helpful)
python pointcept/datasets/preprocessing/s3dis/preprocess_s3dis.py --dataset_root ${S3DIS_DIR} --output_root ${PROCESSED_S3DIS_DIR} --raw_root ${RAW_S3DIS_DIR} --parse_normal
python pointcept/datasets/preprocessing/s3dis/preprocess_s3dis.py --dataset_root ${S3DIS_DIR} --output_root ${PROCESSED_S3DIS_DIR} --raw_root ${RAW_S3DIS_DIR} --align_angle --parse_normal
(Alternative) Our preprocessed data (with normal vectors and aligned angles) can also be downloaded [here]; please agree to the official license before downloading.
Link the processed dataset to the codebase:
# PROCESSED_S3DIS_DIR: the directory of processed S3DIS dataset.
mkdir data
ln -s ${PROCESSED_S3DIS_DIR} ${CODEBASE_DIR}/data/s3dis
Download the Structured3D dataset and organize all downloaded zip files in one folder (${STRUCT3D_DIR}).
Run the preprocessing code for Structured3D as follows:
# STRUCT3D_DIR: the directory of downloaded Structured3D dataset.
# PROCESSED_STRUCT3D_DIR: the directory of processed Structured3D dataset (output dir).
# NUM_WORKERS: Number for workers for preprocessing, default same as cpu count (might OOM).
export PYTHONPATH=./
python pointcept/datasets/preprocessing/structured3d/preprocess_structured3d.py --dataset_root ${STRUCT3D_DIR} --output_root ${PROCESSED_STRUCT3D_DIR} --num_workers ${NUM_WORKERS} --grid_size 0.01 --fuse_prsp --fuse_pano
Following Swin3D, we keep the 25 categories whose frequencies exceed 0.001 out of the original 40 categories.
(Alternative) Our preprocessed data (perspective and panoramic views, 471.7G after unzipping) can also be downloaded [here]; please agree to the official license before downloading.
Link the processed dataset to the codebase:
# PROCESSED_STRUCT3D_DIR: the directory of processed Structured3D dataset (output dir).
mkdir data
ln -s ${PROCESSED_STRUCT3D_DIR} ${CODEBASE_DIR}/data/structured3d
Download the Matterport3D dataset with the official download script:
# download-mp.py is the official download script
# MATTERPORT3D_DIR: the directory of downloaded Matterport3D dataset.
python download-mp.py -o ${MATTERPORT3D_DIR} --type region_segmentations
Unzip the region segmentations:
# MATTERPORT3D_DIR: the directory of downloaded Matterport3D dataset.
python pointcept/datasets/preprocessing/matterport3d/unzip_matterport3d_region_segmentation.py --dataset_root ${MATTERPORT3D_DIR}
Run the preprocessing code for Matterport3D as follows:
# MATTERPORT3D_DIR: the directory of downloaded Matterport3D dataset.
# PROCESSED_MATTERPORT3D_DIR: the directory of processed Matterport3D dataset (output dir).
# NUM_WORKERS: the number of workers for this preprocessing.
python pointcept/datasets/preprocessing/matterport3d/preprocess_matterport3d_mesh.py --dataset_root ${MATTERPORT3D_DIR} --output_root ${PROCESSED_MATTERPORT3D_DIR} --num_workers ${NUM_WORKERS}
Link the processed dataset to the codebase:
# PROCESSED_MATTERPORT3D_DIR: the directory of processed Matterport3D dataset (output dir).
mkdir data
ln -s ${PROCESSED_MATTERPORT3D_DIR} ${CODEBASE_DIR}/data/matterport3d
Following OpenRooms, we remap Matterport3D's original categories to 20 semantic categories, with an additional ceiling category.
Download the SemanticKITTI dataset and link it to the codebase:
# SEMANTIC_KITTI_DIR: the directory of SemanticKITTI dataset.
# |- SEMANTIC_KITTI_DIR
# |- dataset
# |- sequences
# |- 00
# |- 01
# |- ...
mkdir -p data
ln -s ${SEMANTIC_KITTI_DIR} ${CODEBASE_DIR}/data/semantic_kitti
Download the official nuScenes dataset (with lidarseg) and organize the downloaded files as follows:
NUSCENES_DIR
│── samples
│── sweeps
│── lidarseg
...
│── v1.0-trainval
│── v1.0-test
Run the information preprocessing code for nuScenes (modified from OpenPCDet) as follows:
# NUSCENES_DIR: the directory of downloaded nuScenes dataset.
# PROCESSED_NUSCENES_DIR: the directory of processed nuScenes dataset (output dir).
# MAX_SWEEPS: Max number of sweeps. Default: 10.
pip install nuscenes-devkit pyquaternion
python pointcept/datasets/preprocessing/nuscenes/preprocess_nuscenes_info.py --dataset_root ${NUSCENES_DIR} --output_root ${PROCESSED_NUSCENES_DIR} --max_sweeps ${MAX_SWEEPS} --with_camera
(Alternative) Our preprocessed nuScenes information data can also be downloaded [here] (only the processed information; you still need to download the raw dataset and link it to the folder); please agree to the official license before downloading.
Link the raw dataset to the processed nuScenes dataset folder:
# NUSCENES_DIR: the directory of downloaded nuScenes dataset.
# PROCESSED_NUSCENES_DIR: the directory of processed nuScenes dataset (output dir).
ln -s ${NUSCENES_DIR} ${PROCESSED_NUSCENES_DIR}/raw
Then the processed nuScenes folder is organized as follows:
nuscenes
│── raw
│── samples
│── sweeps
│── lidarseg
...
│── v1.0-trainval
│── v1.0-test
│── info
Link the processed dataset to the codebase:
# PROCESSED_NUSCENES_DIR: the directory of processed nuScenes dataset (output dir).
mkdir data
ln -s ${PROCESSED_NUSCENES_DIR} ${CODEBASE_DIR}/data/nuscenes
Download the official Waymo dataset (v1.4.3) and organize the downloaded files as follows:
WAYMO_RAW_DIR
│── training
│── validation
│── testing
Install the following dependencies:
# If shows "No matching distribution found", download whl directly from Pypi and install the package.
conda create -n waymo python=3.10 -y
conda activate waymo
pip install waymo-open-dataset-tf-2-12-0
Run the preprocessing code as follows:
# WAYMO_DIR: the directory of the downloaded Waymo dataset.
# PROCESSED_WAYMO_DIR: the directory of the processed Waymo dataset (output dir).
# NUM_WORKERS: num workers for preprocessing
python pointcept/datasets/preprocessing/waymo/preprocess_waymo.py --dataset_root ${WAYMO_DIR} --output_root ${PROCESSED_WAYMO_DIR} --splits training validation --num_workers ${NUM_WORKERS}
Link the processed dataset to the codebase:
# PROCESSED_WAYMO_DIR: the directory of the processed Waymo dataset (output dir).
mkdir data
ln -s ${PROCESSED_WAYMO_DIR} ${CODEBASE_DIR}/data/waymo
Link the ModelNet40 dataset (modelnet40_normal_resampled) to the codebase:
mkdir -p data
ln -s ${MODELNET_DIR} ${CODEBASE_DIR}/data/modelnet40_normal_resampled
Train from scratch. The training process is based on configs in the configs folder. The training script will generate an experiment folder in the exp folder and back up essential code into it. The training config, logs, TensorBoard records, and checkpoints are also saved into the experiment folder during training.
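For orientation, the resulting experiment folder typically looks like the following (an illustrative sketch based on the paths used elsewhere in this document; exact contents may vary by version):
exp/scannet/semseg-pt-v2m2-0-base/
│── code/       # backup of essential code
│── model/      # checkpoints, e.g. model_last.pth, model_best.pth
│── result/     # test results, e.g. Area_x.pth for S3DIS (created by the test script)
│── config.py   # backup of the training config
│── train.log   # training log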
export CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}
# Script (Recommended)
sh scripts/train.sh -p ${INTERPRETER_PATH} -g ${NUM_GPU} -d ${DATASET_NAME} -c ${CONFIG_NAME} -n ${EXP_NAME}
# Direct
export PYTHONPATH=./
python tools/train.py --config-file ${CONFIG_PATH} --num-gpus ${NUM_GPU} --options save_path=${SAVE_PATH}
For example:
# By script (Recommended)
# -p is default set as python and can be ignored
sh scripts/train.sh -p python -d scannet -c semseg-pt-v2m2-0-base -n semseg-pt-v2m2-0-base
# Direct
export PYTHONPATH=./
python tools/train.py --config-file configs/scannet/semseg-pt-v2m2-0-base.py --options save_path=exp/scannet/semseg-pt-v2m2-0-base
Resume training from a checkpoint. If the training process is interrupted by accident, the following scripts can resume training from a given checkpoint.
export CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES}
# Script (Recommended)
# simply add "-r true"
sh scripts/train.sh -p ${INTERPRETER_PATH} -g ${NUM_GPU} -d ${DATASET_NAME} -c ${CONFIG_NAME} -n ${EXP_NAME} -r true
# Direct
export PYTHONPATH=./
python tools/train.py --config-file ${CONFIG_PATH} --num-gpus ${NUM_GPU} --options save_path=${SAVE_PATH} resume=True weight=${CHECKPOINT_PATH}
During training, the model is evaluated on point clouds after grid sampling (voxelization), which provides an initial assessment of model performance. However, to get precise evaluation results, testing is essential. The testing process subsamples the dense point cloud into a sequence of voxelized point clouds, ensuring comprehensive coverage of all points. Predictions on these sub-clouds are then collected to form a complete prediction over the entire point cloud. This approach yields higher evaluation results than simply mapping/interpolating the prediction. In addition, our testing code supports TTA (test-time augmentation), which further stabilizes evaluation performance.
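Conceptually, the testing mode accumulates per-view predictions back onto the dense cloud; a simplified Python sketch of the aggregation idea (names are illustrative, not the actual tools/test.py internals):
import torch

def aggregate(logits_per_view, indices_per_view, num_points, num_classes):
    # Each voxelized view predicts logits for a subset of the original points;
    # summing logits over all views covers every point at least once.
    accumulated = torch.zeros(num_points, num_classes)
    for logits, idx in zip(logits_per_view, indices_per_view):
        accumulated[idx] += logits
    return accumulated.argmax(dim=1)  # final per-point labels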
# By script (Based on experiment folder created by training script)
sh scripts/test.sh -p ${INTERPRETER_PATH} -g ${NUM_GPU} -d ${DATASET_NAME} -n ${EXP_NAME} -w ${CHECKPOINT_NAME}
# Direct
export PYTHONPATH=./
python tools/test.py --config-file ${CONFIG_PATH} --num-gpus ${NUM_GPU} --options save_path=${SAVE_PATH} weight=${CHECKPOINT_PATH}
For example:
# By script (Based on experiment folder created by training script)
# -p is default set as python and can be ignored
# -w is default set as model_best and can be ignored
sh scripts/test.sh -p python -d scannet -n semseg-pt-v2m2-0-base -w model_best
# Direct
export PYTHONPATH=./
python tools/test.py --config-file configs/scannet/semseg-pt-v2m2-0-base.py --options save_path=exp/scannet/semseg-pt-v2m2-0-base weight=exp/scannet/semseg-pt-v2m2-0-base/model/model_best.pth
TTA can be disabled by replacing data.test.test_cfg.aug_transform = [...] with the following:
data = dict(
    train=dict(...),
    val=dict(...),
    test=dict(
        ...,
        test_cfg=dict(
            ...,
            aug_transform=[
                [dict(type="RandomRotateTargetAngle", angle=[0], axis="z", center=[0, 0, 0], p=1)]
            ]
        )
    )
)
Offset is the separator of point clouds in batched data; it is similar to the concept of batch in PyG.
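To make the relation concrete, here is a minimal sketch in plain PyTorch (tensor values are illustrative) of how the same batch of three point clouds is described by per-point batch indices versus by offset:
import torch

# Three point clouds with 2, 3, and 4 points, concatenated along dim 0.
batch = torch.tensor([0, 0, 1, 1, 1, 2, 2, 2, 2])  # PyG-style: cloud index per point
offset = torch.tensor([2, 5, 9])                   # offset: cumulative point counts

# Converting between the two representations:
offset_from_batch = torch.cumsum(torch.bincount(batch), dim=0)  # -> [2, 5, 9]
batch_from_offset = torch.arange(len(offset)).repeat_interleave(
    torch.diff(offset, prepend=torch.tensor([0]))
)  # -> [0, 0, 1, 1, 1, 2, 2, 2, 2]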
Pointcept provides SparseUNet implemented by spconv and MinkowskiEngine. The spconv version is recommended since spconv is easy to install and faster than MinkowskiEngine. Meanwhile, spconv is also widely applied in outdoor perception.
The spconv version of SparseUNet in the codebase was fully rewritten from the MinkowskiEngine version; example running scripts are as follows:
# ScanNet val
sh scripts/train.sh -g 4 -d scannet -c semseg-spunet-v1m1-0-base -n semseg-spunet-v1m1-0-base
# ScanNet200
sh scripts/train.sh -g 4 -d scannet200 -c semseg-spunet-v1m1-0-base -n semseg-spunet-v1m1-0-base
# S3DIS
sh scripts/train.sh -g 4 -d s3dis -c semseg-spunet-v1m1-0-base -n semseg-spunet-v1m1-0-base
# S3DIS (with normal)
sh scripts/train.sh -g 4 -d s3dis -c semseg-spunet-v1m1-0-cn-base -n semseg-spunet-v1m1-0-cn-base
# SemanticKITTI
sh scripts/train.sh -g 4 -d semantic_kitti -c semseg-spunet-v1m1-0-base -n semseg-spunet-v1m1-0-base
# nuScenes
sh scripts/train.sh -g 4 -d nuscenes -c semseg-spunet-v1m1-0-base -n semseg-spunet-v1m1-0-base
# ModelNet40
sh scripts/train.sh -g 2 -d modelnet40 -c cls-spunet-v1m1-0-base -n cls-spunet-v1m1-0-base
# ScanNet Data Efficient
sh scripts/train.sh -g 4 -d scannet -c semseg-spunet-v1m1-2-efficient-la20 -n semseg-spunet-v1m1-2-efficient-la20
sh scripts/train.sh -g 4 -d scannet -c semseg-spunet-v1m1-2-efficient-la50 -n semseg-spunet-v1m1-2-efficient-la50
sh scripts/train.sh -g 4 -d scannet -c semseg-spunet-v1m1-2-efficient-la100 -n semseg-spunet-v1m1-2-efficient-la100
sh scripts/train.sh -g 4 -d scannet -c semseg-spunet-v1m1-2-efficient-la200 -n semseg-spunet-v1m1-2-efficient-la200
sh scripts/train.sh -g 4 -d scannet -c semseg-spunet-v1m1-2-efficient-lr1 -n semseg-spunet-v1m1-2-efficient-lr1
sh scripts/train.sh -g 4 -d scannet -c semseg-spunet-v1m1-2-efficient-lr5 -n semseg-spunet-v1m1-2-efficient-lr5
sh scripts/train.sh -g 4 -d scannet -c semseg-spunet-v1m1-2-efficient-lr10 -n semseg-spunet-v1m1-2-efficient-lr10
sh scripts/train.sh -g 4 -d scannet -c semseg-spunet-v1m1-2-efficient-lr20 -n semseg-spunet-v1m1-2-efficient-lr20
# Profile model run time
sh scripts/train.sh -g 4 -d scannet -c semseg-spunet-v1m1-0-enable-profiler -n semseg-spunet-v1m1-0-enable-profiler
The MinkowskiEngine version of SparseUNet in the codebase was modified from the original MinkowskiEngine repo; example running scripts are as follows:
# Uncomment "# from .sparse_unet import *" in "pointcept/models/__init__.py"
# Uncomment "# from .mink_unet import *" in "pointcept/models/sparse_unet/__init__.py"
# ScanNet
sh scripts/train.sh -g 4 -d scannet -c semseg-minkunet34c-0-base -n semseg-minkunet34c-0-base
# ScanNet200
sh scripts/train.sh -g 4 -d scannet200 -c semseg-minkunet34c-0-base -n semseg-minkunet34c-0-base
# S3DIS
sh scripts/train.sh -g 4 -d s3dis -c semseg-minkunet34c-0-base -n semseg-minkunet34c-0-base
# SemanticKITTI
sh scripts/train.sh -g 2 -d semantic_kitti -c semseg-minkunet34c-0-base -n semseg-minkunet34c-0-base
We introduce Omni-Adaptive 3D CNNs (OA-CNNs), a family of networks that integrates a lightweight module to greatly enhance the adaptivity of sparse CNNs at minimal computational cost. Without any self-attention modules, OA-CNNs favorably surpass point transformers in terms of accuracy in both indoor and outdoor scenes, with much less latency and memory cost. For issues related to OA-CNNs, ping @Pbihao.
# ScanNet
sh scripts/train.sh -g 4 -d scannet -c semseg-oacnns-v1m1-0-base -n semseg-oacnns-v1m1-0-base
PTv3 is an efficient backbone model that achieves SOTA performance across indoor and outdoor scenarios. The full PTv3 relies on FlashAttention, and FlashAttention requires CUDA 11.6 or above; please make sure your local Pointcept environment satisfies the requirements.
If you cannot upgrade your local environment to satisfy the requirements (CUDA >= 11.6), you can disable FlashAttention by setting the model parameter enable_flash to false and reducing enc_patch_size and dec_patch_size (e.g., to 128).
FlashAttention forces RPE to be disabled and reduces attention precision to FP16. If you require these features, disable enable_flash and adjust enable_rpe, upcast_attention, and upcast_softmax accordingly.
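As a hedged sketch, the relevant keys inside a PTv3 model config might look like this (flag names come from the text above; the surrounding config structure and the per-stage patch-size tuples are assumptions, so check the shipped configs):
model = dict(
    backbone=dict(
        # ... other backbone kwargs elided ...
        enable_flash=False,        # disable FlashAttention (e.g., when CUDA < 11.6)
        enc_patch_size=(128, 128, 128, 128, 128),  # reduced patch sizes, e.g. 128
        dec_patch_size=(128, 128, 128, 128),
        enable_rpe=True,           # RPE is only usable with FlashAttention disabled
        upcast_attention=True,     # keep attention computation in full precision
        upcast_softmax=True,       # keep softmax in full precision
    ),
)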
Detailed instructions and experiment records (including weights) are available in the project repo. Example running scripts are as follows:
# Scratched ScanNet
sh scripts/train.sh -g 4 -d scannet -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base
# PPT joint training (ScanNet + Structured3D) and evaluate in ScanNet
sh scripts/train.sh -g 8 -d scannet -c semseg-pt-v3m1-1-ppt-extreme -n semseg-pt-v3m1-1-ppt-extreme
# Scratched ScanNet200
sh scripts/train.sh -g 4 -d scannet200 -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base
# Fine-tuning from PPT joint training (ScanNet + Structured3D) with ScanNet200
# PTV3_PPT_WEIGHT_PATH: Path to model weight trained by PPT multi-dataset joint training
# e.g. exp/scannet/semseg-pt-v3m1-1-ppt-extreme/model/model_best.pth
sh scripts/train.sh -g 4 -d scannet200 -c semseg-pt-v3m1-1-ppt-ft -n semseg-pt-v3m1-1-ppt-ft -w ${PTV3_PPT_WEIGHT_PATH}
# Scratched ScanNet++
sh scripts/train.sh -g 4 -d scannetpp -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base
# Scratched ScanNet++ test
sh scripts/train.sh -g 4 -d scannetpp -c semseg-pt-v3m1-1-submit -n semseg-pt-v3m1-1-submit
# Scratched S3DIS
sh scripts/train.sh -g 4 -d s3dis -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base
# An example for disabling flash attention and enabling RPE.
sh scripts/train.sh -g 4 -d s3dis -c semseg-pt-v3m1-1-rpe -n semseg-pt-v3m1-0-rpe
# PPT joint training (ScanNet + S3DIS + Structured3D) and evaluate in ScanNet
sh scripts/train.sh -g 8 -d s3dis -c semseg-pt-v3m1-1-ppt-extreme -n semseg-pt-v3m1-1-ppt-extreme
# S3DIS 6-fold cross validation
# 1. The default configs are evaluated on Area_5, modify the "data.train.split", "data.val.split", and "data.test.split" to make the config evaluated on Area_1 ~ Area_6 respectively.
# 2. Train and evaluate the model on each split of areas and gather result files located in "exp/s3dis/EXP_NAME/result/Area_x.pth" in one single folder, noted as RECORD_FOLDER.
# 3. Run the following script to get S3DIS 6-fold cross validation performance:
export PYTHONPATH=./
python tools/test_s3dis_6fold.py --record_root ${RECORD_FOLDER}
# Scratched nuScenes
sh scripts/train.sh -g 4 -d nuscenes -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base
# Scratched Waymo
sh scripts/train.sh -g 4 -d waymo -c semseg-pt-v3m1-0-base -n semseg-pt-v3m1-0-base
# More configs and exp records for PTv3 will be available soon.
Indoor semantic segmentation
| Model | Benchmark | Additional Data | Num GPUs | Val mIoU | Config | Tensorboard | Exp Record |
|---|---|---|---|---|---|---|---|
| PTv3 | ScanNet | ✗ | 4 | 77.6% | link | link | link |
| PTv3 + PPT | ScanNet | ✓ | 8 | 78.5% | link | link | link |
| PTv3 | ScanNet200 | ✗ | 4 | 35.3% | link | link | link |
| PTv3 + PPT | ScanNet200 | ✓ (ft) | 4 | | | | |
| PTv3 | S3DIS (Area 5) | ✗ | 4 | 73.6% | link | link | link |
| PTv3 + PPT | S3DIS (Area 5) | ✓ | 8 | 75.4% | link | link | link |
Outdoor semantic segmentation
| Model | Benchmark | Additional Data | Num GPUs | Val mIoU | Config | Tensorboard | Exp Record |
|---|---|---|---|---|---|---|---|
| PTv3 | nuScenes | ✗ | 4 | 80.3 | link | link | link |
| PTv3 + PPT | nuScenes | ✓ | 8 | | | | |
| PTv3 | SemanticKITTI | ✗ | 4 | | | | |
| PTv3 + PPT | SemanticKITTI | ✓ | 8 | | | | |
| PTv3 | Waymo | ✗ | 4 | 71.2 | link | link | link (log only) |
| PTv3 + PPT | Waymo | ✓ | 8 | | | | |
*Released model weights are trained with v1.5.1; weights for v1.5.2 are still in progress.
The original PTv2 was trained on 4 * RTX a6000 (48G memory). Even with AMP enabled, the memory cost of the original PTv2 is slightly larger than 24G. Considering that GPUs with 24G memory are far more accessible, I tuned PTv2 on the latest codebase and made it runnable on 4 * RTX 3090 machines.
PTv2 mode2 enables AMP and disables Position Encoding Multiplier and Grouped Linear. In our further research, we found that precise coordinates are not necessary for point cloud understanding (replacing precise coordinates with grid coordinates does not affect performance; besides, SparseUNet is an example as well). As for Grouped Linear, my implementation of Grouped Linear seems to cost more memory than the Linear layer provided by PyTorch. Benefiting from the codebase and better parameter tuning, we can also alleviate the overfitting problem. The reproduced performance is even better than the results reported in our paper.
Example running scripts are as follows:
# ptv2m2: PTv2 mode2, disable PEM & Grouped Linear, GPU memory cost < 24G (recommend)
# ScanNet
sh scripts/train.sh -g 4 -d scannet -c semseg-pt-v2m2-0-base -n semseg-pt-v2m2-0-base
sh scripts/train.sh -g 4 -d scannet -c semseg-pt-v2m2-3-lovasz -n semseg-pt-v2m2-3-lovasz
# ScanNet test
sh scripts/train.sh -g 4 -d scannet -c semseg-pt-v2m2-1-submit -n semseg-pt-v2m2-1-submit
# ScanNet200
sh scripts/train.sh -g 4 -d scannet200 -c semseg-pt-v2m2-0-base -n semseg-pt-v2m2-0-base
# ScanNet++
sh scripts/train.sh -g 4 -d scannetpp -c semseg-pt-v2m2-0-base -n semseg-pt-v2m2-0-base
# ScanNet++ test
sh scripts/train.sh -g 4 -d scannetpp -c semseg-pt-v2m2-1-submit -n semseg-pt-v2m2-1-submit
# S3DIS
sh scripts/train.sh -g 4 -d s3dis -c semseg-pt-v2m2-0-base -n semseg-pt-v2m2-0-base
# SemanticKITTI
sh scripts/train.sh -g 4 -d semantic_kitti -c semseg-pt-v2m2-0-base -n semseg-pt-v2m2-0-base
# nuScenes
sh scripts/train.sh -g 4 -d nuscenes -c semseg-pt-v2m2-0-base -n semseg-pt-v2m2-0-base
PTv2 mode1 is the original PTv2 we reported in our paper; example running scripts are as follows:
# ptv2m1: PTv2 mode1, Original PTv2, GPU memory cost > 24G
# ScanNet
sh scripts/train.sh -g 4 -d scannet -c semseg-pt-v2m1-0-base -n semseg-pt-v2m1-0-base
# ScanNet200
sh scripts/train.sh -g 4 -d scannet200 -c semseg-pt-v2m1-0-base -n semseg-pt-v2m1-0-base
# S3DIS
sh scripts/train.sh -g 4 -d s3dis -c semseg-pt-v2m1-0-base -n semseg-pt-v2m1-0-base
The original PTv1 is also available in our Pointcept codebase. I haven't run PTv1 for a long time, but I have ensured the example running scripts work well.
# ScanNet
sh scripts/train.sh -g 4 -d scannet -c semseg-pt-v1-0-base -n semseg-pt-v1-0-base
# ScanNet200
sh scripts/train.sh -g 4 -d scannet200 -c semseg-pt-v1-0-base -n semseg-pt-v1-0-base
# S3DIS
sh scripts/train.sh -g 4 -d s3dis -c semseg-pt-v1-0-base -n semseg-pt-v1-0-base
Install the dependencies for Stratified Transformer:
pip install torch-points3d
# Fix dependency issue caused by installing torch-points3d
pip uninstall SharedArray
pip install SharedArray==3.2.1
cd libs/pointops2
python setup.py install
cd ../..
Uncomment # from .stratified_transformer import * in pointcept/models/__init__.py.
# stv1m1: Stratified Transformer mode1, modified from the original Stratified Transformer code.
# stv1m2: Stratified Transformer mode2, my rewrite version (recommended).
# ScanNet
sh scripts/train.sh -g 4 -d scannet -c semseg-st-v1m2-0-refined -n semseg-st-v1m2-0-refined
sh scripts/train.sh -g 4 -d scannet -c semseg-st-v1m1-0-origin -n semseg-st-v1m1-0-origin
# ScanNet200
sh scripts/train.sh -g 4 -d scannet200 -c semseg-st-v1m2-0-refined -n semseg-st-v1m2-0-refined
# S3DIS
sh scripts/train.sh -g 4 -d s3dis -c semseg-st-v1m2-0-refined -n semseg-st-v1m2-0-refined
SPVCNN is the baseline model of SPVNAS; it is also a practical baseline for outdoor datasets. Install torchsparse as follows:
# refer https://github.com/mit-han-lab/torchsparse
# install method without sudo apt install
conda install google-sparsehash -c bioconda
export C_INCLUDE_PATH=${CONDA_PREFIX}/include:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=${CONDA_PREFIX}/include:$CPLUS_INCLUDE_PATH
pip install --upgrade git+https://github.com/mit-han-lab/torchsparse.git
# SemanticKITTI
sh scripts/train.sh -g 2 -d semantic_kitti -c semseg-spvcnn-v1m1-0-base -n semseg-spvcnn-v1m1-0-base
OctFormer from OctFormer: Octree-based Transformers for 3D Point Clouds. Install the dependencies as follows:
cd libs
git clone https://github.com/octree-nn/dwconv.git
pip install ./dwconv
pip install ocnn
Uncomment # from .octformer import * in pointcept/models/__init__.py.
# ScanNet
sh scripts/train.sh -g 4 -d scannet -c semseg-octformer-v1m1-0-base -n semseg-octformer-v1m1-0-base
Swin3D from Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding. Install the dependencies as follows:
# 1. Install MinkEngine v0.5.4, follow readme in https://github.com/NVIDIA/MinkowskiEngine;
# 2. Install Swin3D, mainly for cuda operation:
cd libs
git clone https://github.com/microsoft/Swin3D.git
cd Swin3D
pip install ./
Uncomment # from .swin3d import * in pointcept/models/__init__.py.
# Structured3D + Swin-S
sh scripts/train.sh -g 4 -d structured3d -c semseg-swin3d-v1m1-0-small -n semseg-swin3d-v1m1-0-small
# Structured3D + Swin-L
sh scripts/train.sh -g 4 -d structured3d -c semseg-swin3d-v1m1-1-large -n semseg-swin3d-v1m1-1-large
# Addition
# Structured3D + SpUNet
sh scripts/train.sh -g 4 -d structured3d -c semseg-spunet-v1m1-0-base -n semseg-spunet-v1m1-0-base
# Structured3D + PTv2
sh scripts/train.sh -g 4 -d structured3d -c semseg-pt-v2m2-0-base -n semseg-pt-v2m2-0-base
Pre-train on Structured3D, then fine-tune on downstream datasets with the pre-trained weight:
# ScanNet + Swin-S
sh scripts/train.sh -g 4 -d scannet -w exp/structured3d/semseg-swin3d-v1m1-1-large/model/model_last.pth -c semseg-swin3d-v1m1-0-small -n semseg-swin3d-v1m1-0-small
# ScanNet + Swin-L
sh scripts/train.sh -g 4 -d scannet -w exp/structured3d/semseg-swin3d-v1m1-1-large/model/model_last.pth -c semseg-swin3d-v1m1-1-large -n semseg-swin3d-v1m1-1-large
# S3DIS + Swin-S (here we provide config support S3DIS normal vector)
sh scripts/train.sh -g 4 -d s3dis -w exp/structured3d/semseg-swin3d-v1m1-1-large/model/model_last.pth -c semseg-swin3d-v1m1-0-small -n semseg-swin3d-v1m1-0-small
# S3DIS + Swin-L (here we provide config support S3DIS normal vector)
sh scripts/train.sh -g 4 -d s3dis -w exp/structured3d/semseg-swin3d-v1m1-1-large/model/model_last.pth -c semseg-swin3d-v1m1-1-large -n semseg-swin3d-v1m1-1-large
Context-Aware Classifier is a segmentor that can further boost the performance of each backbone, as a replacement for Default Segmentor. Train with the following example scripts:
# ScanNet
sh scripts/train.sh -g 4 -d scannet -c semseg-cac-v1m1-0-spunet-base -n semseg-cac-v1m1-0-spunet-base
sh scripts/train.sh -g 4 -d scannet -c semseg-cac-v1m1-1-spunet-lovasz -n semseg-cac-v1m1-1-spunet-lovasz
sh scripts/train.sh -g 4 -d scannet -c semseg-cac-v1m1-2-ptv2-lovasz -n semseg-cac-v1m1-2-ptv2-lovasz
# ScanNet200
sh scripts/train.sh -g 4 -d scannet200 -c semseg-cac-v1m1-0-spunet-base -n semseg-cac-v1m1-0-spunet-base
sh scripts/train.sh -g 4 -d scannet200 -c semseg-cac-v1m1-1-spunet-lovasz -n semseg-cac-v1m1-1-spunet-lovasz
sh scripts/train.sh -g 4 -d scannet200 -c semseg-cac-v1m1-2-ptv2-lovasz -n semseg-cac-v1m1-2-ptv2-lovasz
PointGroup is a baseline framework for point cloud instance segmentation. Install the dependencies as follows:
conda install -c bioconda google-sparsehash
cd libs/pointgroup_ops
python setup.py install --include_dirs=${CONDA_PREFIX}/include
cd ../..
Uncomment # from .point_group import * in pointcept/models/__init__.py.
# ScanNet
sh scripts/train.sh -g 4 -d scannet -c insseg-pointgroup-v1m1-0-spunet-base -n insseg-pointgroup-v1m1-0-spunet-base
# S3DIS
sh scripts/train.sh -g 4 -d s3dis -c insseg-pointgroup-v1m1-0-spunet-base -n insseg-pointgroup-v1m1-0-spunet-base
Masked Scene Contrast (MSC) pre-training:
# ScanNet
sh scripts/train.sh -g 8 -d scannet -c pretrain-msc-v1m1-0-spunet-base -n pretrain-msc-v1m1-0-spunet-base
Fine-tune with the pre-trained weight:
# ScanNet20 Semantic Segmentation
sh scripts/train.sh -g 8 -d scannet -w exp/scannet/pretrain-msc-v1m1-0-spunet-base/model/model_last.pth -c semseg-spunet-v1m1-4-ft -n semseg-msc-v1m1-0f-spunet-base
# ScanNet20 Instance Segmentation (enable PointGroup before running the script)
sh scripts/train.sh -g 4 -d scannet -w exp/scannet/pretrain-msc-v1m1-0-spunet-base/model/model_last.pth -c insseg-pointgroup-v1m1-0-spunet-base -n insseg-msc-v1m1-0f-pointgroup-spunet-base
PPT presents a multi-dataset pre-training framework, and it is compatible with various existing pre-training frameworks and backbones.
# ScanNet + Structured3d, validate on ScanNet (S3DIS might cause long data time, w/o S3DIS for a quick validation) >= 3090 * 8
sh scripts/train.sh -g 8 -d scannet -c semseg-ppt-v1m1-0-sc-st-spunet -n semseg-ppt-v1m1-0-sc-st-spunet
sh scripts/train.sh -g 8 -d scannet -c semseg-ppt-v1m1-1-sc-st-spunet-submit -n semseg-ppt-v1m1-1-sc-st-spunet-submit
# ScanNet + S3DIS + Structured3d, validate on S3DIS (>= a100 * 8)
sh scripts/train.sh -g 8 -d s3dis -c semseg-ppt-v1m1-0-s3-sc-st-spunet -n semseg-ppt-v1m1-0-s3-sc-st-spunet
# SemanticKITTI + nuScenes + Waymo, validate on SemanticKITTI (bs12 >= 3090 * 4 >= 3090 * 8, v1m1-0 is still on tuning)
sh scripts/train.sh -g 4 -d semantic_kitti -c semseg-ppt-v1m1-0-nu-sk-wa-spunet -n semseg-ppt-v1m1-0-nu-sk-wa-spunet
sh scripts/train.sh -g 4 -d semantic_kitti -c semseg-ppt-v1m2-0-sk-nu-wa-spunet -n semseg-ppt-v1m2-0-sk-nu-wa-spunet
sh scripts/train.sh -g 4 -d semantic_kitti -c semseg-ppt-v1m2-1-sk-nu-wa-spunet-submit -n semseg-ppt-v1m2-1-sk-nu-wa-spunet-submit
# SemanticKITTI + nuScenes + Waymo, validate on nuScenes (bs12 >= 3090 * 4; bs24 >= 3090 * 8, v1m1-0 is still on tuning)
sh scripts/train.sh -g 4 -d nuscenes -c semseg-ppt-v1m1-0-nu-sk-wa-spunet -n semseg-ppt-v1m1-0-nu-sk-wa-spunet
sh scripts/train.sh -g 4 -d nuscenes -c semseg-ppt-v1m2-0-nu-sk-wa-spunet -n semseg-ppt-v1m2-0-nu-sk-wa-spunet
sh scripts/train.sh -g 4 -d nuscenes -c semseg-ppt-v1m2-1-nu-sk-wa-spunet-submit -n semseg-ppt-v1m2-1-nu-sk-wa-spunet-submit
Generate ScanNet pair data for PointContrast pre-training as follows:
# RAW_SCANNET_DIR: the directory of downloaded ScanNet v2 raw dataset.
# PROCESSED_SCANNET_PAIR_DIR: the directory of processed ScanNet pair dataset (output dir).
python pointcept/datasets/preprocessing/scannet/scannet_pair/preprocess.py --dataset_root ${RAW_SCANNET_DIR} --output_root ${PROCESSED_SCANNET_PAIR_DIR}
ln -s ${PROCESSED_SCANNET_PAIR_DIR} ${CODEBASE_DIR}/data/scannet
# ScanNet
sh scripts/train.sh -g 8 -d scannet -c pretrain-msc-v1m1-1-spunet-pointcontrast -n pretrain-msc-v1m1-1-spunet-pointcontrast
Contrastive Scene Contexts pre-training:
# ScanNet
sh scripts/train.sh -g 8 -d scannet -c pretrain-msc-v1m2-0-spunet-csc -n pretrain-msc-v1m2-0-spunet-csc
Pointcept is designed by Xiaoyang, named by Yixing, and the logo is created by Yuechen. It is derived from Hengshuang's Semseg and inspired by several repos, e.g., MinkowskiEngine, pointnet2, mmcv, and Detectron2.