Reinforcement Learning from Human Feedback for Cyber-Physical Systems: On the Potential of Self-Supervised Pretraining
In Proceedings of the International Conference on Machine Learning for Cyber-Physical Systems (ML4CPS), 2023
Abstract
In this paper, we advocate for the potential of reinforcement learning from human feedback (RLHF) with self-supervised pretraining to increase the viability of reinforcement learning (RL) for real-world tasks, especially in the context of cyber-physical systems (CPS). We identify potential benefits of self-supervised pretraining in terms of query sample complexity, safety, robustness, reward exploration, and transfer. We believe that exploiting these benefits, combined with the generally improving sample efficiency of RL, will likely enable RL and RLHF to play an increasing role in CPS in the future.
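To make the central idea concrete, below is a minimal sketch of reward learning from pairwise human preferences (the Bradley-Terry formulation standard in RLHF) on top of a frozen, self-supervised pretrained encoder. This is an illustrative sketch assuming PyTorch, not the paper's implementation; the encoder stand-in, dimensions, and all names are hypothetical.

# Sketch: preference-based reward learning with a frozen pretrained encoder.
# Illustrative only; encoder, dimensions, and names are assumptions.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    def __init__(self, encoder: nn.Module, feature_dim: int):
        super().__init__()
        self.encoder = encoder  # pretrained, e.g. via self-supervised learning
        for p in self.encoder.parameters():
            p.requires_grad = False  # reuse representations; train only the head
        self.head = nn.Linear(feature_dim, 1)

    def forward(self, states: torch.Tensor) -> torch.Tensor:
        # Sum per-step rewards over a segment: (batch, T, obs_dim) -> (batch,)
        b, t, d = states.shape
        feats = self.encoder(states.reshape(b * t, d))
        return self.head(feats).reshape(b, t).sum(dim=1)

def preference_loss(model, seg_a, seg_b, prefs):
    # Bradley-Terry: P(a preferred over b) = sigmoid(R(a) - R(b));
    # prefs is 1.0 where the human preferred segment a, else 0.0.
    logits = model(seg_a) - model(seg_b)
    return nn.functional.binary_cross_entropy_with_logits(logits, prefs)

if __name__ == "__main__":
    obs_dim, feature_dim = 16, 32
    # Stand-in for a pretrained network; in practice loaded from pretraining.
    encoder = nn.Sequential(nn.Linear(obs_dim, feature_dim), nn.ReLU())
    model = RewardModel(encoder, feature_dim)
    opt = torch.optim.Adam(model.head.parameters(), lr=1e-3)

    seg_a = torch.randn(8, 10, obs_dim)  # 8 preference queries, 10-step segments
    seg_b = torch.randn(8, 10, obs_dim)
    prefs = torch.randint(0, 2, (8,)).float()  # simulated human labels

    loss = preference_loss(model, seg_a, seg_b, prefs)
    opt.zero_grad()
    loss.backward()
    opt.step()

Because only the linear head is trained, each human query updates far fewer parameters than learning a reward model from scratch, which is one way pretraining could reduce query sample complexity.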
Cite
@inproceedings{kaufmann2023reinforcement,
title = {Reinforcement {{Learning}} from~{{Human Feedback}} for~{{Cyber-Physical Systems}}: {{On}} the~{{Potential}} of~{{Self-Supervised Pretraining}}},
booktitle = {Proceedings of the {{International Conference}} on {{Machine Learning}} for {{Cyber-Physical Systems}} ({{ML4CPS}})},
author = {Kaufmann, Timo and Bengs, Viktor and H{\"u}llermeier, Eyke},
year = {2023},
publisher = {Springer Nature Switzerland},
doi = {10.1007/978-3-031-47062-2_2}
}