XCube下载XCube源代码下载

XCUBE：使用稀疏体素层次结构的大规模3D生成模型

XCUBE

XCUBE：使用稀疏体素层次结构的大规模3D生成模型
Xuanchi Ren，Jiahui Huang，Xiaohui Zeng，Ken Museth，Sanja Fidler，Francis Williams
纸，项目页面

摘要：我们提出了Xcube（缩写为x ³ ），这是一种具有任意属性的高分辨率稀疏3D体素网格的新生成模型。我们的模型可以以馈送方式产生数百万的体素，最高可达1024 ³的最高有效分辨率，而无需耗时的测试时间优化。为了实现这一目标，我们采用了层次体素潜在扩散模型，该模型使用基于高效的VDB数据结构构建的自定义框架以粗到1的方式逐渐生成更高的分辨率网格。除了产生高分辨率对象外，我们还展示了Xcube在100m x 100m的大型室外场景上的有效性，体素尺寸小至10厘米。我们观察到过去方法明确的定性和定量改进。除了无条件的生成外，我们还表明我们的模型可用于求解各种任务，例如用户指导的编辑，单个扫描的场景完成以及文本到3D。

有关业务查询，请访问我们的网站并提交表格：NVIDIA研究许可。有关与模型有关的任何其他问题，请联系Xuanchi或Jiahui。

消息

2024-10-27：查看我们的Neurips 2024工作Scube，该工作SCUBE在大规模场景重建上延伸XCUBE！
2024-06-18：发布的代码和模型！

环境设置

请注意，我们目前仅支持Linux。我们欢迎支持其他平台。

（可选）安装Libmamba，以改善Conda时的生活质量

 conda update -n base conda
conda install -n base conda-libmamba-solver
conda config --set solver libmamba

康达环境

 # Clone the repository
git clone [email protected]:nv-tlabs/XCube.git
cd XCube

# Create conda environment
conda env create -f environment.yml
conda activate xcube

# Install fVDB (3D learning framework; require GPU later than Ampere)
git clone https://github.com/AcademySoftwareFoundation/openvdb.git
cd openvdb
git fetch origin pull/1808/head:feature/fvdb
git checkout feature/fvdb
rm fvdb/setup.py && cp ../assets/setup.py fvdb/
cd fvdb && pip install .
cd ../..

# Mesh extraction
cd ext/nksr-cuda
python setup.py develop
cd ../..

Docker图像

对于Docker用户，我们建议使用此处的基本图像，并在其上应用上述CONDA设置。

Quickstart

从Google Drive下载验证的检查点，并将其放在checkpoints下。另外，我们提供一个可以自动为您下载所有内容的脚本（暂时不可用）：

 python inference/download_pretrain.py

变形推理：

 # Chair
python inference/sample_shapenet.py none --category chair --total_len 20 --batch_len 4 --ema --use_ddim --ddim_step 100 --extract_mesh

# Car
python inference/sample_shapenet.py none --category car --total_len 20 --batch_len 4 --ema --use_ddim --ddim_step 100 --extract_mesh

# Plane
python inference/sample_shapenet.py none --category plane --total_len 20 --batch_len 4 --ema --use_ddim --ddim_step 100 --extract_mesh

# Visualize
python visualize_object.py -p results/{YOUR_PATH} -i {YOUR_ID}

Waymo推理：

 # Unconditional sampling
python inference/sample_waymo.py none --total_len 20 --batch_len 4 --ema --use_ddim --ddim_step 100 --extract_mesh

# Single-scan condition (coming soon)

# Visualize
python visualize_scene.py -p results/{YOUR_PATH} -i {YOUR_ID}

objaverse推断：

 # Text to 3D
python inference/sample_objaverse.py none --batch_len 4 --ema --use_ddim --ddim_step 100 --extract_mesh

# Visualize
python visualize_object.py -p results/{YOUR_PATH} -i {YOUR_ID}

已发布的代码与本文中描述的版本有所不同：
对于清洁代码，省略了细化网络，这可能会导致结果略有差异，但是这些差异并不显着。
网状提取过程已从VAE转移到后处理。

我们已经在XCube Misc准备了有关数据准备和有用技巧的详细说明。

训练

数据下载链接：

Shapenet：可在此处获得数据。将提取的文件夹作为../data/shapenet 。或者，您可以在配置中更改_shapenet_path 。
Waymo：即将来临

（粗）阶段1

培训自动编码器模型：

 # ShapeNet chair
python train.py ./configs/shapenet/chair/train_vae_16x16x16_dense.yaml --wname 16x16x16-kld-0.03_dim-16 --max_epochs 100 --cut_ratio 16 --gpus 8 --batch_size 32

# ShapeNet car
python train.py ./configs/shapenet/car/train_vae_16x16x16_dense.yaml --wname 16x16x16-kld-0.03_dim-16 --max_epochs 100 --cut_ratio 16 --gpus 8 --batch_size 32

# ShapeNet plane
python train.py ./configs/shapenet/plane/train_vae_16x16x16_dense.yaml --wname 16x16x16-kld-0.03_dim-16 --max_epochs 100 --cut_ratio 16 --gpus 8 --batch_size 32

# Waymo uncond
python train.py ./configs/waymo/train_vae_32x32x32_dense.yaml --wname 32x32x32-kld-0.03_dim-8 --max_epochs 50 --gpus 8 --batch_size 32 --eval_interval 1

训练潜在扩散模型：

 # ShapeNet chair
python train.py ./configs/shapenet/chair/train_diffusion_16x16x16_dense.yaml --wname 16x16x16_kld-0.03 --eval_interval 5 --gpus 8 --batch_size 8 --accumulate_grad_batches 4

# ShapeNet car
python train.py ./configs/shapenet/car/train_diffusion_16x16x16_dense.yaml --wname 16x16x16_kld-0.03 --eval_interval 5 --gpus 8 --batch_size 8 --accumulate_grad_batches 4

# ShapeNet plane
python train.py ./configs/shapenet/plane/train_diffusion_16x16x16_dense.yaml --wname 16x16x16_kld-0.03 --eval_interval 5 --gpus 8 --batch_size 8 --accumulate_grad_batches 4

# Waymo uncond
python train_auto.py ./configs/waymo/train_diffusion_32x32x32_dense.yaml --wname 32x32x32_kld-0.03 --eval_interval 1 --gpus 8 --batch_size 16 --accumulate_grad_batches 4 --save_topk 2

（罚款）阶段2

培训自动编码器模型：

 # ShapeNet chair
python train.py ./configs/shapenet/chair/train_vae_128x128x128_sparse.yaml --wname 512_to_128-kld-1.0 --max_epochs 100 --gpus 8 --batch_size 8 --accumulate_grad_batches 2

# ShapeNet car
python train.py ./configs/shapenet/car/train_vae_128x128x128_sparse.yaml --wname 512_to_128-kld-1.0 --max_epochs 100 --gpus 8 --batch_size 8 --accumulate_grad_batches 2

# ShapeNet plane
python train.py ./configs/shapenet/plane/train_vae_128x128x128_sparse.yaml --wname 512_to_128-kld-1.0 --max_epochs 100 --gpus 8 --batch_size 8 --accumulate_grad_batches 2

# Waymo uncond
python train.py ./configs/waymo/train_vae_256x256x256_sparse.yaml --wname 1024_to_256-kld-0.3 --max_epochs 50 --gpus 8 --batch_size 8 --accumulate_grad_batches 2

训练潜在扩散模型：

 # ShapeNet chair
python train.py ./configs/shapenet/plane/train_diffusion_128x128x128_sparse.yaml --wname 128x128x128_kld-1.0_normal_cond --eval_interval 5 --gpus 8 --batch_size 8 --accumulate_grad_batches 8 --save_topk 2 --save_every 30

# ShapeNet car
python train.py ./configs/shapenet/car/train_diffusion_128x128x128_sparse.yaml --wname 128x128x128_kld-1.0_normal_cond --eval_interval 5 --gpus 8 --batch_size 8 --accumulate_grad_batches 8 --save_topk 2 --save_every 30

# ShapeNet plane
python train.py ./configs/shapenet/car/train_diffusion_128x128x128_sparse.yaml --wname 128x128x128_kld-1.0_normal_cond --eval_interval 5 --gpus 8 --batch_size 8 --accumulate_grad_batches 8 --save_topk 2 --save_every 30

# Waymo uncond
python train.py ./configs/waymo/train_diffusion_256x256x256_sparse.yaml --wname 256x256x64_kld-0.3_semantic_cond --eval_interval 1 --gpus 8 --batch_size 8 --accumulate_grad_batches 4 --save_topk 1

此外，您可以手动指定不同的培训设置，以获取适合您需求的模型。普通标志包括：

--wname ：为WANDB LOGGER指定的其他实验名称。
--batch_size ： autoencoder总数的批量数量，每gpu的批次数量进行diffusion 。
--logger_type ：我们默认使用wandb ;也none支持。

执照

引用

 @inproceedings { ren2024xcube ,
    title = { XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies } , 
    author = { Ren, Xuanchi and Huang, Jiahui and Zeng, Xiaohui and Museth, Ken and Fidler, Sanja and Williams, Francis } ,
    booktitle = { Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition } ,
    year = { 2024 }
}