Coin3D Enhances 3D Generation with Proxy-Guided Control but Faces Challenges
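This paper proposes a diffusion-model approach based on 3D proxies for interactive 3D object generation. Proxy-Guided 3D Conditioning provides 3D control over the diffusion process, and interactive editing combined with volume reconstruction improves the generated results. Experiments show that the method outperforms traditional text-driven or image-driven methods in controllability and detail. The authors are from Zhejiang University and ByteDance. 2025-2-4 17:0:12 Author: hackernoon.com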

Abstract and 1 Introduction

2 Related Works

3 Method and 3.1 Proxy-Guided 3D Conditioning for Diffusion

3.2 Interactive Generation Workflow and 3.3 Volume Conditioned Reconstruction

4 Experiment and 4.1 Comparison on Proxy-based and Image-based 3D Generation

4.2 Comparison on Controllable 3D Object Generation, 4.3 Interactive Generation with Part Editing & 4.4 Ablation Studies

5 Conclusions, Acknowledgments, and References

SUPPLEMENTARY MATERIAL

A. Implementation Details

B. More Discussions

C. More Experiments

B MORE DISCUSSIONS

B.1 Necessity of Proxy-guided 3D Generation

For personalized generation, we believe that text or images alone are insufficient and unintuitive for expressing the 3D structure of objects and their spatial relationships. Hence, granting the system 3D-aware controllability through a 3D proxy is necessary for 3D generation. As for acquiring 3D proxies, we believe this is not an obstacle for target users: a proxy can be assembled easily with kid-friendly software such as Tinkercad, taken from 3D modeling games on SteamVR, or produced with LLM + procedural modeling instructions. Similarly, ControlNet takes control images ranging from rough sketches to delicate line art, which also requires basic painting skills.

Figure K: We compare our method with the depth-controlled 3D generation pipeline (Zero123++ [Shi et al. 2023a]).

Figure L: Runtime overview of interactive generation.

B.2 More Limitations

First, the resolution of 3D-aware control is bounded by the size of the proxy feature volume, so the method cannot fully exploit control from complex high-poly models. For example, we cannot generate a large-scale urban scene with satisfactory building details. Second, our method requires manually tuning the control strength to balance between over-constrained and under-constrained results. This is similar to ControlNet [Zhang et al. 2023], since the appropriate control strength mainly depends on the creator's aesthetic choices.
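The first limitation can be made concrete with a little arithmetic. The sketch below is only an illustration: it assumes a cubic feature grid that spans the whole scene, and the grid resolution of 32 and the scene sizes are hypothetical values chosen for the example, not figures from the paper. The helper names `voxel_edge_length` and `is_resolvable` are likewise invented here.

```python
def voxel_edge_length(scene_extent_m: float, grid_resolution: int) -> float:
    """Edge length of one voxel when a cubic grid spans the whole scene."""
    if grid_resolution <= 0:
        raise ValueError("grid_resolution must be positive")
    return scene_extent_m / grid_resolution


def is_resolvable(detail_size_m: float, scene_extent_m: float,
                  grid_resolution: int) -> bool:
    """A detail smaller than one voxel cannot be told apart by the grid."""
    return detail_size_m >= voxel_edge_length(scene_extent_m, grid_resolution)


# Hypothetical numbers, for illustration only (not taken from the paper).
GRID = 32             # assumed feature-volume resolution per axis
OBJECT_EXTENT = 1.0   # a single object about 1 m across
CITY_EXTENT = 500.0   # an urban block about 500 m across

if __name__ == "__main__":
    for name, extent in [("object", OBJECT_EXTENT), ("city block", CITY_EXTENT)]:
        edge = voxel_edge_length(extent, GRID)
        print(f"{name}: one voxel covers {edge:.3f} m per side")
    # A 1 m window is far below the voxel size of the city-scale grid.
    print("1 m window resolvable in city grid:",
          is_resolvable(1.0, CITY_EXTENT, GRID))
```

Under these assumptions, the same grid that gives centimeter-scale control on a single object gives only building-scale control on an urban block. This is consistent with the limitation described above, though the actual volume size used by the method may differ.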

Authors:

(1) Wenqi Dong, Zhejiang University, who conducted this work during his internship at PICO, ByteDance;

(2) Bangbang Yang, ByteDance, who contributed equally to this work together with Wenqi Dong;

(3) Lin Ma, ByteDance;

(4) Xiao Liu, ByteDance;

(5) Liyuan Cui, Zhejiang University;

(6) Hujun Bao, Zhejiang University;

(7) Yuewen Ma, ByteDance;

(8) Zhaopeng Cui, Zhejiang University, a corresponding author.



Source: https://hackernoon.com/coin3d-enhances-3d-generation-with-proxy-guided-control-but-faces-challenges?source=rss