DreamReward: Aligning Human Preference in Text-to-3D Generation

3D content creation from text prompts has shown remarkable success recently. However, current text-to-3D methods often generate 3D results that do not align well with human preferences. In this paper, we present a comprehensive framework, coined DreamReward, to learn and improve text-to-3D models from human preference feedback. To begin with, we collect 25k expert comparisons based on a systematic annotation pipeline including rating and ranking. Then, we build Reward3D, the first general-purpose text-to-3D human preference reward model, to effectively encode human preferences. Building upon the 3D reward model, we perform theoretical analysis and present Reward3D Feedback Learning (DreamFL), a direct tuning algorithm that optimizes multi-view diffusion models with a redefined scorer. Grounded in theoretical proof and extensive experimental comparisons, DreamReward generates high-fidelity, 3D-consistent results with significant boosts in prompt alignment with human intention. Our results demonstrate the great potential of learning from human feedback to improve text-to-3D models.
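A reward model trained from ranked comparisons is commonly fit with a pairwise Bradley-Terry style loss. The sketch below assumes that formulation for illustration; the abstract does not state which loss Reward3D uses, and the names `pairwise_preference_loss` and `ranking_to_pairs` are hypothetical.

```python
import math


def pairwise_preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Bradley-Terry style loss: -log(sigmoid(r_preferred - r_rejected)).

    Assumption: scalar rewards for two 3D results of the same prompt,
    where annotators preferred the first one.
    """
    margin = reward_preferred - reward_rejected
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow in exp.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def ranking_to_pairs(ranked_ids):
    """Expand a best-to-worst ranking into (preferred, rejected) pairs."""
    return [
        (ranked_ids[i], ranked_ids[j])
        for i in range(len(ranked_ids))
        for j in range(i + 1, len(ranked_ids))
    ]
```

The loss shrinks as the preferred result is scored further above the rejected one, so minimizing it over all pairs pushes the model's scores toward the annotators' ordering.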

Figure 1. The overall framework of DreamReward. (Top) Reward3D involves data collection, annotation, and preference learning. (Bottom) DreamFL uses feedback from Reward3D to compute the RewardLoss and incorporates it into the SDS loss, so that both terms optimize the NeRF simultaneously.
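The bottom half of the figure can be read as an additive objective: a reward term computed on rendered views is combined with the SDS loss. The sketch below illustrates only that additive reading; the actual DreamFL formulation is derived in the paper and may differ. The function `combined_loss` and the parameter `reward_weight` are hypothetical, and plain floats stand in for tensors.

```python
from typing import Sequence


def combined_loss(
    sds_loss: float,
    view_rewards: Sequence[float],
    reward_weight: float = 1.0,
) -> float:
    """Add a reward-based term to an SDS-style loss.

    Assumptions: view_rewards holds one reward-model score per rendered
    view, and the reward loss is the negated mean score, so a higher
    reward lowers the total loss.
    """
    if not view_rewards:
        raise ValueError("need at least one rendered view")
    mean_reward = sum(view_rewards) / len(view_rewards)
    reward_loss = -mean_reward
    return sds_loss + reward_weight * reward_loss
```

For example, `combined_loss(1.0, [0.5, 1.5], reward_weight=0.5)` averages the two view scores to 1.0 and subtracts half of that from the SDS term.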

An ultra-detailed illustration of a mythical Phoenix, rising from ashes, vibrant feathers in a fiery palette

A delicate porcelain teacup, painted with intricate flowers, rests on a saucer

Spaceship,futuristic design,sleek metal,glowing thrusters, flying in space

A lion against the sunrise, its majestic stature prominent on the savanna

A bicycle that leaves a trail of flowers

A solid, symmetrical, smooth stone fountain, with water cascading over its edges into a clear, circular pond surrounded by blooming lilies, in the center of a sunlit courtyard

A pen leaking blue ink

A marble bust of a mouse

A smoldering campfire under a clear starry night, embers glowing softly

A rotary telephone carved out of wood

A japanese forest, sunny,digital art

Frog with a translucent skin displaying a mechanical heart beating.

A solid, smooth, symmetrical porcelain teapot, with a cobalt blue dragon design, steam rising from the spout, suggesting it's just been filled with boiling water

A book bound in mysterious symbols

A pen sitting atop a pile of manuscripts

Left: User study; the inset pie chart shows the share of volunteers' preferences for each method. Right: Holistic evaluation using GPTEval3D; the radar charts report the Elo rating for each of the 6 criteria. Our results consistently rank first across all metrics.

BibTeX

@misc{ye2024dreamreward,
      title={DreamReward: Text-to-3D Generation with Human Preference}, 
      author={Junliang Ye and Fangfu Liu and Qixiu Li and Zhengyi Wang and Yikai Wang and Xinzhou Wang and Yueqi Duan and Jun Zhu},
      year={2024},
      eprint={2403.14613},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
