流逝
Posted on 2025-3-30 11:03:58
http://reply.papertrans.cn/17/1664/166358/166358_51.png
挑剔为人
Posted on 2025-3-30 14:28:38
http://reply.papertrans.cn/17/1664/166358/166358_52.png
Excitotoxin
Posted on 2025-3-30 17:14:30
Dynamic Shielding for Reinforcement Learning in Black-Box Environments
Although there have been various proposals to reduce undesired behaviors during learning, most of these techniques require prior system knowledge, and their applicability is limited. This paper aims to reduce undesired behaviors during learning without requiring prior system knowledge. We propose dynamic shielding: an extension o