流逝 posted on 2025-3-30 11:03:58
http://reply.papertrans.cn/17/1664/166358/166358_51.png
挑剔为人 posted on 2025-3-30 14:28:38
http://reply.papertrans.cn/17/1664/166358/166358_52.png
Excitotoxin posted on 2025-3-30 17:14:30
Dynamic Shielding for Reinforcement Learning in Black-Box Environments
…ve been various proposals to reduce undesired behaviors during learning, most of these techniques require prior system knowledge, and their applicability is limited. This paper aims to reduce undesired behaviors during learning without requiring […] prior system knowledge. We propose […]: an extension o…