流逝 posted on 2025-3-30 11:03:58

http://reply.papertrans.cn/17/1664/166358/166358_51.png

挑剔为人 posted on 2025-3-30 14:28:38

http://reply.papertrans.cn/17/1664/166358/166358_52.png

Excitotoxin posted on 2025-3-30 17:14:30

Dynamic Shielding for Reinforcement Learning in Black-Box Environments
Although there have been various proposals to reduce undesired behaviors during learning, most of these techniques require prior system knowledge, and their applicability is limited. This paper aims to reduce undesired behaviors during learning without requiring prior system knowledge. We propose dynamic shielding: an extension o
View full version: Titlebook: Automated Technology for Verification and Analysis; 20th International S Ahmed Bouajjani, Lukáš Holík, Zhilin Wu Conference proceedings 2022