Can we use dueling bandits in preference-based evolutionary multi-objective optimization?

Original title: Direct Preference-Based Evolutionary Multi-Objective Optimization with Dueling Bandit

Authors: Tian Huang, Ke Li

Optimization problems arise in many scenarios, and the authors note that users often want solutions that converge to a specific region of interest along the Pareto front. Instead of relying on traditional methods that approximate a fitness function, they propose an approach that relies solely on human feedback, and they introduce an active dueling bandit algorithm for direct preference learning. They validate the approach in three sets of experiments: first assessing the performance of the algorithm itself, then embedding it in Multi-objective Evolutionary Algorithms (MOEAs), and finally deploying it on a practical protein structure prediction problem. The authors present the resulting framework as one that addresses the limitations of traditional techniques and opens up new possibilities for optimization problems.
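To make the idea of learning from pairwise comparisons concrete, here is a minimal sketch of a generic dueling bandit loop in Python. It is not the authors' algorithm: it uses a simplified Thompson-sampling-style rule to pick which two candidates to compare, and it replaces the human with a simulated oracle. The names `run_dueling_bandit`, `prefers`, and `utility` are hypothetical and chosen only for this illustration.

```python
import random


def run_dueling_bandit(n_arms, prefers, n_rounds, seed=0):
    """Return (recommended arm, wins matrix) after n_rounds adaptive duels.

    prefers(i, j) stands in for the human: True if arm i is preferred to arm j.
    """
    rng = random.Random(seed)
    # wins[i][j] counts how often arm i was preferred over arm j.
    wins = [[0] * n_arms for _ in range(n_arms)]

    def duel(i, j):
        if prefers(i, j):
            wins[i][j] += 1
        else:
            wins[j][i] += 1

    def sample_p(i, j):
        # Draw a plausible P(i beats j) from a Beta posterior (uniform prior).
        return rng.betavariate(wins[i][j] + 1, wins[j][i] + 1)

    # Warm-up: compare every pair once so each pair has some evidence.
    for i in range(n_arms):
        for j in range(i + 1, n_arms):
            duel(i, j)

    for _ in range(n_rounds):
        # First arm: highest sampled Copeland score (opponents it seems to beat).
        scores = [
            sum(sample_p(i, j) > 0.5 for j in range(n_arms) if j != i)
            for i in range(n_arms)
        ]
        first = max(range(n_arms), key=scores.__getitem__)
        # Second arm: the opponent that looks most likely to beat the first.
        second = max(
            (j for j in range(n_arms) if j != first),
            key=lambda j: sample_p(j, first),
        )
        duel(first, second)

    def copeland(i):
        # Number of opponents that arm i has beaten more often than it lost to.
        return sum(wins[i][j] > wins[j][i] for j in range(n_arms) if j != i)

    # Recommend the empirical Copeland winner.
    return max(range(n_arms), key=copeland), wins


# Simulated user: a hidden utility per candidate, never shown to the algorithm.
utility = [0.1, 0.9, 0.4, 0.7]
best, wins = run_dueling_bandit(4, lambda i, j: utility[i] > utility[j], n_rounds=50)
```

The algorithm only ever sees the outcome of each comparison, never a numeric fitness value, which mirrors the "human feedback only" setting described above. In an MOEA, the arms would plausibly be candidate solutions from the current population; how the paper actually couples the two is not covered by this sketch.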

Original article: https://arxiv.org/abs/2311.14003