
XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies

Xuanchi Ren, Jiahui Huang, Xiaohui Zeng, Ken Museth, Sanja Fidler, Francis Williams
Paper, Project Page

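Abstract: We present XCube (abbreviated as X³), a novel generative model for high-resolution sparse 3D voxel grids with arbitrary attributes. Our model can generate millions of voxels with a finest effective resolution of up to 1024³ in a feed-forward fashion, without time-consuming test-time optimization. To achieve this, we employ a hierarchical voxel latent diffusion model that generates progressively higher-resolution grids in a coarse-to-fine manner, using a custom framework built on the highly efficient VDB data structure. Apart from generating high-resolution objects, we demonstrate the effectiveness of XCube on large outdoor scenes at scales of 100m x 100m with a voxel size as small as 10cm. We observe clear qualitative and quantitative improvements over past approaches. In addition to unconditional generation, we show that our model can be used to solve a variety of tasks such as user-guided editing, scene completion from a single scan, and text-to-3D.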
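
To put the figures in the abstract in perspective, the sketch below works out the cell counts they imply. The helper name `cells_per_axis` is illustrative and not part of XCube, and the figure of 10 million occupied voxels is an assumption chosen only to illustrate "millions of voxels".

```python
def cells_per_axis(extent_m: float, voxel_size_m: float) -> int:
    """Number of voxels needed to span `extent_m` at `voxel_size_m`."""
    return round(extent_m / voxel_size_m)


# Cells in a dense grid at the finest effective resolution (1024^3).
dense_cells = 1024 ** 3

# A 100 m scene edge sampled with 10 cm voxels.
scene_axis = cells_per_axis(100.0, 0.10)

# Assumption for illustration only: 10 million occupied voxels.
assumed_occupied = 10_000_000
occupancy = assumed_occupied / dense_cells

print(f"dense 1024^3 grid: {dense_cells:,} cells")
print(f"100 m at 10 cm voxels: {scene_axis} voxels per horizontal axis")
print(f"assumed occupancy of the dense grid: {occupancy:.2%}")
```

Under that assumption, the occupied voxels are a small fraction of the dense grid, which is the case a sparse structure such as VDB is designed for.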

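For business inquiries, please visit our website and submit the form: NVIDIA Research Licensing. For any other questions related to the model, please contact Xuanchi or Jiahui.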

News

2024-10-27: Check out our NeurIPS 2024 work SCube, which extends XCube to large-scale scene reconstruction!
2024-06-18: Code and models released!

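Environment Setup

Note that we currently support Linux only. Contributions that add support for other platforms are welcome.

(Optional) Install libmamba for a quality-of-life improvement when using Conda

conda update -n base conda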
conda install -n base conda-libmamba-solver
conda config --set solver libmamba

Conda Environment

# Clone the repository
git clone git@github.com:nv-tlabs/XCube.git
cd XCube

# Create conda environment
conda env create -f environment.yml
conda activate xcube

# Install fVDB (3D learning framework; requires an Ampere or newer GPU)
git clone https://github.com/AcademySoftwareFoundation/openvdb.git
cd openvdb
git fetch origin pull/1808/head:feature/fvdb
git checkout feature/fvdb
rm fvdb/setup.py && cp ../assets/setup.py fvdb/
cd fvdb && pip install .
cd ../..

# Mesh extraction
cd ext/nksr-cuda
python setup.py develop
cd ../..
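
Before building fVDB, it can help to confirm that the local GPU meets the hardware requirement noted in the comments above. The following is a minimal sketch, not part of the repository. It reads the requirement as "Ampere or newer" and assumes that Ampere begins at CUDA compute capability 8.0; `meets_ampere` and `check_local_gpu` are hypothetical helper names.

```python
def meets_ampere(major: int, minor: int = 0) -> bool:
    """Assumption: Ampere corresponds to compute capability 8.0 and above."""
    return (major, minor) >= (8, 0)


def check_local_gpu() -> str:
    """Describe the first visible CUDA device, if PyTorch can see one."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed; activate the xcube environment first."
    if not torch.cuda.is_available():
        return "No CUDA device is visible."
    major, minor = torch.cuda.get_device_capability(0)
    name = torch.cuda.get_device_name(0)
    verdict = "OK" if meets_ampere(major, minor) else "older than Ampere"
    return f"{name}: compute capability {major}.{minor} ({verdict})"


if __name__ == "__main__":
    print(check_local_gpu())
```

Running the script inside the activated `xcube` environment prints one line describing the first GPU.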

Docker Image

For Docker users, we suggest using the base image here and applying the Conda setup above on top of it.

Quickstart

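Download the pretrained checkpoints from Google Drive and place them under checkpoints. Alternatively, we provide a script that downloads everything for you automatically (temporarily unavailable):

python inference/download_pretrain.py

ShapeNet inference: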

# Chair
python inference/sample_shapenet.py none --category chair --total_len 20 --batch_len 4 --ema --use_ddim --ddim_step 100 --extract_mesh

# Car
python inference/sample_shapenet.py none --category car --total_len 20 --batch_len 4 --ema --use_ddim --ddim_step 100 --extract_mesh

# Plane
python inference/sample_shapenet.py none --category plane --total_len 20 --batch_len 4 --ema --use_ddim --ddim_step 100 --extract_mesh

# Visualize
python visualize_object.py -p results/{YOUR_PATH} -i {YOUR_ID}
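
The three ShapeNet commands above differ only in `--category`, so they can be generated in a loop. The sketch below assembles the same argument list and prints it. `build_sample_command` is a hypothetical helper and not part of the repository, and nothing is executed unless the list is passed to `subprocess.run`.

```python
import shlex

# Categories taken from the ShapeNet commands above.
SHAPENET_CATEGORIES = ("chair", "car", "plane")


def build_sample_command(category, total_len=20, batch_len=4, ddim_step=100):
    """Assemble the sample_shapenet.py argument list (hypothetical helper)."""
    return [
        "python", "inference/sample_shapenet.py", "none",
        "--category", category,
        "--total_len", str(total_len),
        "--batch_len", str(batch_len),
        "--ema", "--use_ddim",
        "--ddim_step", str(ddim_step),
        "--extract_mesh",
    ]


if __name__ == "__main__":
    for category in SHAPENET_CATEGORIES:
        # Print only; use subprocess.run(cmd, check=True) to actually sample.
        print(shlex.join(build_sample_command(category)))
```

Keeping the command as a list rather than a single string avoids shell quoting issues if it is later handed to `subprocess.run`.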

Waymo inference:

# Unconditional sampling
python inference/sample_waymo.py none --total_len 20 --batch_len 4 --ema --use_ddim --ddim_step 100 --extract_mesh

# Single-scan condition (coming soon)

# Visualize
python visualize_scene.py -p results/{YOUR_PATH} -i {YOUR_ID}

Objaverse inference:

# Text to 3D
python inference/sample_objaverse.py none --batch_len 4 --ema --use_ddim --ddim_step 100 --extract_mesh

# Visualize
python visualize_object.py -p results/{YOUR_PATH} -i {YOUR_ID}

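The released code differs from the version described in the paper:
The refinement network is omitted to keep the code clean. This may cause slight differences in results, but they are not significant.
Mesh extraction has been moved out of the VAE and into post-processing.

Detailed instructions for data preparation, along with useful tips, are available in XCube Misc.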

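Training

Data download links:

ShapeNet: the data is available here. Place the extracted folder at ../data/shapenet. Alternatively, you can change _shapenet_path in the config.
Waymo: coming soon

(Coarse) Stage 1

Train the autoencoder model:

# ShapeNet chair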
python train.py ./configs/shapenet/chair/train_vae_16x16x16_dense.yaml --wname 16x16x16-kld-0.03_dim-16 --max_epochs 100 --cut_ratio 16 --gpus 8 --batch_size 32

# ShapeNet car
python train.py ./configs/shapenet/car/train_vae_16x16x16_dense.yaml --wname 16x16x16-kld-0.03_dim-16 --max_epochs 100 --cut_ratio 16 --gpus 8 --batch_size 32

# ShapeNet plane
python train.py ./configs/shapenet/plane/train_vae_16x16x16_dense.yaml --wname 16x16x16-kld-0.03_dim-16 --max_epochs 100 --cut_ratio 16 --gpus 8 --batch_size 32

# Waymo uncond
python train.py ./configs/waymo/train_vae_32x32x32_dense.yaml --wname 32x32x32-kld-0.03_dim-8 --max_epochs 50 --gpus 8 --batch_size 32 --eval_interval 1

Train the latent diffusion model:

# ShapeNet chair
python train.py ./configs/shapenet/chair/train_diffusion_16x16x16_dense.yaml --wname 16x16x16_kld-0.03 --eval_interval 5 --gpus 8 --batch_size 8 --accumulate_grad_batches 4

# ShapeNet car
python train.py ./configs/shapenet/car/train_diffusion_16x16x16_dense.yaml --wname 16x16x16_kld-0.03 --eval_interval 5 --gpus 8 --batch_size 8 --accumulate_grad_batches 4

# ShapeNet plane
python train.py ./configs/shapenet/plane/train_diffusion_16x16x16_dense.yaml --wname 16x16x16_kld-0.03 --eval_interval 5 --gpus 8 --batch_size 8 --accumulate_grad_batches 4

# Waymo uncond
python train_auto.py ./configs/waymo/train_diffusion_32x32x32_dense.yaml --wname 32x32x32_kld-0.03 --eval_interval 1 --gpus 8 --batch_size 16 --accumulate_grad_batches 4 --save_topk 2

(Fine) Stage 2

Train the autoencoder model:

# ShapeNet chair
python train.py ./configs/shapenet/chair/train_vae_128x128x128_sparse.yaml --wname 512_to_128-kld-1.0 --max_epochs 100 --gpus 8 --batch_size 8 --accumulate_grad_batches 2

# ShapeNet car
python train.py ./configs/shapenet/car/train_vae_128x128x128_sparse.yaml --wname 512_to_128-kld-1.0 --max_epochs 100 --gpus 8 --batch_size 8 --accumulate_grad_batches 2

# ShapeNet plane
python train.py ./configs/shapenet/plane/train_vae_128x128x128_sparse.yaml --wname 512_to_128-kld-1.0 --max_epochs 100 --gpus 8 --batch_size 8 --accumulate_grad_batches 2

# Waymo uncond
python train.py ./configs/waymo/train_vae_256x256x256_sparse.yaml --wname 1024_to_256-kld-0.3 --max_epochs 50 --gpus 8 --batch_size 8 --accumulate_grad_batches 2

Train the latent diffusion model:

# ShapeNet chair
python train.py ./configs/shapenet/chair/train_diffusion_128x128x128_sparse.yaml --wname 128x128x128_kld-1.0_normal_cond --eval_interval 5 --gpus 8 --batch_size 8 --accumulate_grad_batches 8 --save_topk 2 --save_every 30

# ShapeNet car
python train.py ./configs/shapenet/car/train_diffusion_128x128x128_sparse.yaml --wname 128x128x128_kld-1.0_normal_cond --eval_interval 5 --gpus 8 --batch_size 8 --accumulate_grad_batches 8 --save_topk 2 --save_every 30

# ShapeNet plane
python train.py ./configs/shapenet/plane/train_diffusion_128x128x128_sparse.yaml --wname 128x128x128_kld-1.0_normal_cond --eval_interval 5 --gpus 8 --batch_size 8 --accumulate_grad_batches 8 --save_topk 2 --save_every 30

# Waymo uncond
python train.py ./configs/waymo/train_diffusion_256x256x256_sparse.yaml --wname 256x256x64_kld-0.3_semantic_cond --eval_interval 1 --gpus 8 --batch_size 8 --accumulate_grad_batches 4 --save_topk 1

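In addition, you can manually specify different training settings to obtain a model that suits your needs. Common flags include:

--wname: an additional experiment name for the WandB logger.
--batch_size: the total batch size across all GPUs for the autoencoder, but the per-GPU batch size for diffusion.
--logger_type: wandb is used by default; none is also supported.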
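
Because `--batch_size` means different things for the two model types, the number of samples contributing to each optimizer step is easy to misjudge. The sketch below works it out for the commands above. It assumes that `--accumulate_grad_batches` multiplies the per-step batch in the usual gradient-accumulation sense, and `effective_batch_size` is a hypothetical helper rather than a function in the repository.

```python
def effective_batch_size(model, batch_size, gpus, accumulate_grad_batches=1):
    """Samples per optimizer step, following the --batch_size description."""
    if model == "autoencoder":
        per_step = batch_size          # already the total across all GPUs
    elif model == "diffusion":
        per_step = batch_size * gpus   # the flag is a per-GPU value
    else:
        raise ValueError(f"unknown model type: {model!r}")
    # Assumption: accumulated micro-batches all feed one optimizer step.
    return per_step * accumulate_grad_batches


if __name__ == "__main__":
    # Flag values copied from the ShapeNet training commands above.
    print("stage 1 autoencoder:", effective_batch_size("autoencoder", 32, gpus=8))
    print("stage 1 diffusion:", effective_batch_size("diffusion", 8, gpus=8, accumulate_grad_batches=4))
    print("stage 2 autoencoder:", effective_batch_size("autoencoder", 8, gpus=8, accumulate_grad_batches=2))
    print("stage 2 diffusion:", effective_batch_size("diffusion", 8, gpus=8, accumulate_grad_batches=8))
```

When training on fewer GPUs, the same helper can be used to pick a `--batch_size` or `--accumulate_grad_batches` value that keeps the effective batch size unchanged.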

License

Citation

@inproceedings{ren2024xcube,
    title={XCube: Large-Scale 3D Generative Modeling using Sparse Voxel Hierarchies},
    author={Ren, Xuanchi and Huang, Jiahui and Zeng, Xiaohui and Museth, Ken and Fidler, Sanja and Williams, Francis},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
    year={2024}
}