通情达理 posted on 2025-3-23 11:35:02

http://reply.papertrans.cn/29/2818/281740/281740_11.png

Atmosphere posted on 2025-3-23 14:33:53

http://reply.papertrans.cn/29/2818/281740/281740_12.png

单调性 posted on 2025-3-23 18:07:20

http://reply.papertrans.cn/29/2818/281740/281740_13.png

弯曲的人 posted on 2025-3-23 23:30:22

D3PG: Decomposed Deep Deterministic Policy Gradient for Continuous Control, used deep reinforcement learning algorithms. Another advantage of D3PG is that it can provide explicit interpretations of the final learned policy as well as the underlying dependencies among the joints of a learning robot.

Armory posted on 2025-3-24 05:13:06

Efficient Exploration by Novelty-Pursuit, ironments from simple maze environments, MuJoCo tasks, to the long-horizon video game of SuperMarioBros. Experimental results show that the proposed method outperforms the state-of-the-art approaches that use curiosity-driven exploration.

运动吧 posted on 2025-3-24 08:19:26

Battery Management for Automated Warehouses via Deep Reinforcement Learning, ssian noise to enforce exploration could perform poorly in the formulated MDP, and present a novel algorithm called TD3-ARL that performs effective exploration by regulating the magnitude of the output action. Finally, extensive empirical evaluations confirm the superiority of our algorithm over the state-of-the-art and the rule-based policies.

生锈 posted on 2025-3-24 11:35:58

Parallel Algorithm for Nash Equilibrium in Multiplayer Stochastic Games with Application to Naval Such settings can be modeled as stochastic games. While algorithms have been developed for solving (i.e., computing a game-theoretic solution concept such as Nash equilibrium) two-player zero-sum stochastic games, research on algorithms for non-zero-sum and multiplayer stochastic games is limited. We

粗糙 posted on 2025-3-24 16:28:06

LAC-Nav: Collision-Free Multiagent Navigation Based on the Local Action Cells, ets, attention should be paid to avoiding collisions with the others. In this paper, we introduced the concept of the local action cell, which provides for each agent a set of velocities that are safe to perform. Consequently, as long as the local action cells are updated in time and each agent selects its motion
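
To make the idea of "a set of velocities that are safe to perform" concrete, here is a rough sketch in Python. It is not the construction from the LAC-Nav paper: the quoted abstract does not say how a local action cell is built, so this sketch simply filters a list of candidate velocities. The names `safe_velocities`, `min_separation` and `horizon`, and the assumption that neighbours stand still during the horizon, are all hypothetical and chosen only for illustration.

```python
import math


def min_distance_to_point(start, velocity, horizon, point):
    """Smallest distance between `point` and the straight path traced by
    moving from `start` at constant `velocity` for `horizon` seconds."""
    dx, dy = velocity[0] * horizon, velocity[1] * horizon
    px, py = point[0] - start[0], point[1] - start[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        t = 0.0  # the agent does not move, so the path is a single point
    else:
        # Project the neighbour onto the path and clamp to the segment.
        t = max(0.0, min(1.0, (px * dx + py * dy) / seg_len_sq))
    return math.hypot(px - t * dx, py - t * dy)


def safe_velocities(position, neighbours, candidates, min_separation, horizon):
    """Keep the candidate velocities whose path stays at least
    `min_separation` away from every neighbour.

    Assumption (not from the paper): neighbours are treated as static
    for the duration of `horizon`."""
    return [
        v
        for v in candidates
        if all(
            min_distance_to_point(position, v, horizon, n) >= min_separation
            for n in neighbours
        )
    ]


# One neighbour directly to the right of the agent.
candidates = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, 0.0)]
safe = safe_velocities((0.0, 0.0), [(1.0, 0.0)], candidates, 0.5, 1.0)
print(safe)  # the velocity heading straight at the neighbour is dropped
```

In this toy setup an agent would pick its motion from `safe` only, and the set would have to be recomputed whenever the neighbours move.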

medieval posted on 2025-3-24 20:43:11

MGHRL: Meta Goal-Generation for Hierarchical Reinforcement Learning, space. Such algorithms work well in tasks with relatively slight differences. However, when the task distribution becomes wider, it would be quite inefficient to directly learn such a meta-policy. In this paper, we propose a new meta-RL algorithm called Meta Goal-generation for Hierarchical RL (MGH

施魔法 posted on 2025-3-25 02:09:04

D3PG: Decomposed Deep Deterministic Policy Gradient for Continuous Control, gh dimensional robotic control problems. In this regard, we propose the D3PG approach, which extends DDPG to a multiagent setting by decomposing the global critic into a weighted sum of local critics. Each of these critics is modeled as an individual learning agent that governs the decision making of a
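
As a small illustration of "decomposing the global critic into a weighted sum of local critics", the sketch below combines per-joint critic values into one global value. Everything beyond the weighted sum itself is an assumption made for the example: the abstract does not say what each local critic takes as input or how the weights are obtained, so the linear critics, the fixed weights and the names `make_linear_critic` and `global_q` are hypothetical stand-ins, not the D3PG implementation.

```python
def make_linear_critic(theta):
    """Return a toy local critic Q_i(state, local_action) that is linear in
    its inputs. A real critic would be a neural network; this is a stand-in."""
    def critic(state, local_action):
        features = list(state) + [local_action]
        return sum(w * x for w, x in zip(theta, features))
    return critic


def global_q(local_values, weights):
    """Global critic value as a weighted sum of local critic values."""
    if len(local_values) != len(weights):
        raise ValueError("need exactly one weight per local critic")
    return sum(w * q for w, q in zip(weights, local_values))


# Hypothetical setup: a 2-dimensional state and three joints,
# each joint controlled by one scalar action.
state = [0.5, -1.0]
joint_actions = [0.2, -0.4, 1.0]
critics = [
    make_linear_critic([1.0, 0.0, 2.0]),
    make_linear_critic([0.0, 1.0, 1.0]),
    make_linear_critic([1.0, 1.0, 0.0]),
]
weights = [0.5, 0.3, 0.2]  # assumed fixed here; could be learned in practice

local_values = [q(state, a) for q, a in zip(critics, joint_actions)]
q_total = global_q(local_values, weights)
print(local_values, q_total)
```

Because each term in the sum belongs to one joint, the weights give a direct view of how much each local critic contributes to the global value.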
View full version: Titlebook: Distributed Artificial Intelligence; Second International Matthew E. Taylor, Yang Yu, Yang Gao Conference proceedings 2020 Springer Nature Sw