Methods for reinforcement learning for recommendation (RL4Rec) have attracted substantial attention, as they can optimize long-term user engagement. To avoid expensive online interactions with actual users, offline RL4Rec has been proposed to optimize methods based on logged user interactions. However, the evaluation of offline RL4Rec methods depends solely on the overall performance of the resulting recommendations, and may thus inaccurately reflect their true performance. Instead, we conduct a novel study on the evaluation of offline RL4Rec methods from a repetition-and-exploration perspective, in which we separately evaluate and compare the performance of recommending relevant repeat items (i.e., items that a user has interacted with) and exploratory items (i.e., items that the user has not yet interacted with). Our experimental results reveal a significant disparity between the repetition performance and the exploration performance of RL4Rec methods. Furthermore, we find that the optimization of RL4Rec methods is sensitive to how future gains are taken into account. Overall, our findings on repetition and exploration performance provide valuable insights for the future evaluation and optimization of RL4Rec methods.