Can Human Feedback Improve Diffusion Models Without a Reward Model?

Original title: Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model

Authors: Kai Yang, Jian Tao, Jiafei Lyu, Chunjiang Ge, Jiaxin Chen, Qimai Li, Weihan Shen, Xiaolong Zhu, Xiu Li

Fine-tuning diffusion models commonly relies on reinforcement learning from human feedback (RLHF). However, existing methods first require training a reward model aligned with human preferences, a costly and time-consuming step.

Original article: https://arxiv.org/abs/2311.13231