
DANBO is a robust neural body model that can be animated with diverse skeleton poses. It learns directly from images and skeleton poses, which can come from an off-the-shelf pose estimator. Although training uses neither 3D scans nor pre-trained parametric models, DANBO achieves rendering quality that rivals even the surface-based approach. Real faces are blurred for anonymity.


DANBO models a human body as a neural radiance field. We introduce two inductive biases to enable learning plausible and robust body geometry. First, we exploit body part dependencies defined by the skeleton structure using Graph Neural Networks. Second, we predict for each bone a part-specific volume that encodes the local geometry feature. For each 3D query point in the space, our aggregation network blends the associated voxel features with our proposed soft-softmax aggregation function that ensures better robustness and generalizability.
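The blending step described above can be illustrated with a small sketch. This is not the soft-softmax function from the paper, whose exact definition is not given on this page; it is a generic masked softmax, written under the assumption that each bone supplies one feature vector, one blending score, and a soft indicator of whether the query point falls inside that bone's volume. The name `blend_part_features` and all of its parameters are hypothetical.

```python
import numpy as np


def blend_part_features(features, logits, inside_weight, temperature=1.0):
    """Blend per-bone features for a single 3D query point.

    Illustrative only: a masked softmax, NOT the paper's soft-softmax.

    features:      (num_bones, feat_dim) feature sampled from each bone's volume
    logits:        (num_bones,) unnormalized blending scores (assumed input)
    inside_weight: (num_bones,) values in [0, 1]; 1 means the point lies inside
                   that bone's volume, 0 means it lies outside (assumed input)
    """
    features = np.asarray(features, dtype=np.float64)
    logits = np.asarray(logits, dtype=np.float64)
    inside_weight = np.asarray(inside_weight, dtype=np.float64)

    scaled = logits / temperature
    scaled = scaled - scaled.max()  # subtract the max for numerical stability
    # Down-weight bones whose volume does not contain the query point.
    unnormalized = inside_weight * np.exp(scaled)
    total = unnormalized.sum()

    if total <= 0.0:
        # The point lies outside every volume, so no bone contributes.
        weights = np.zeros_like(unnormalized)
    else:
        weights = unnormalized / total

    blended = weights @ features  # (feat_dim,)
    return blended, weights


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
    blended, weights = blend_part_features(
        feats, logits=[0.0, 0.0, 10.0], inside_weight=[1.0, 1.0, 0.0]
    )
    print(weights, blended)
```

In this toy call the third bone has the largest score but is masked out because the point lies outside its volume, so the two remaining bones share the weight equally. Masking of this kind is one plausible way to keep distant body parts from influencing a query point.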

Comparison on Human3.6M unseen poses

Reconstructed body geometry on Human3.6M unseen poses

Animating extremely challenging unseen poses from CMU Mocap

Novel view synthesis with DANBO


Shih-Yang Su, Timur Bagautdinov, and Helge Rhodin. "DANBO: Disentangled Articulated Neural Body Representations via Graph Neural Networks", European Conference on Computer Vision; arXiv, 2022.

Human3.6M dataset [1]

Catalin Ionescu, Dragos Papava, Vlad Olaru, and Cristian Sminchisescu. "Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments", IEEE Transactions on Pattern Analysis and Machine Intelligence, IEEE Computer Society, 36(7):1325-1339, July 2014.

Catalin Ionescu, Fuxin Li, and Cristian Sminchisescu. "Latent Structured Models for Human Pose Estimation", International Conference on Computer Vision, 2011.

W. Xu, A. Chatterjee, M. Zollhöfer, H. Rhodin, D. Mehta, H.-P. Seidel, and C. Theobalt. "MonoPerfCap: Human Performance Capture from Monocular Video", ACM Transactions on Graphics (TOG), 37(2):27, 2018.

Gül Varol, Javier Romero, Xavier Martin, Naureen Mahmood, Michael J. Black, Ivan Laptev, and Cordelia Schmid. "Learning from Synthetic Humans", CVPR, 2017.


We thank Sida Peng for helpful discussions related to Animatable NeRF. We thank Yuliang Zou, Chen Gao, Eric Hedlin, Meng-Li Shih, Hui-Po Wang, Abi Kuganesan, and Daniel Ajisafe for many insightful discussions and feedback. We also thank Advanced Research Computing at the University of British Columbia and Compute Canada for providing computational resources.