Safe Entropic Agents under Team Constraints

Safety is a critical concern in multi-agent reinforcement learning (MARL), yet typical safety-aware methods constrain agent behaviors, limiting the exploration that is essential for discovering effective cooperation. Existing approaches mainly enforce individual constraints, overlooking potential benefits of joint (team) constraints. We analyze team constraints theoretically and practically, introducing entropic exploration for constrained MARL (E2C). E2C maximizes observation entropy to encourage exploration while ensuring safety at the individual and team levels. Experiments across diverse domains demonstrate that E2C matches or outperforms common baselines in task performance while reducing unsafe behaviors by up to 50%.
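
To make the two ingredients named in the abstract concrete, the following is a minimal illustrative sketch, not the E2C algorithm itself. It assumes discrete (hashable) observations, estimates observation entropy from empirical visit counts, and treats each constraint as a simple budget on accumulated cost. The function names, the budgets, and the example data are all hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def observation_entropy(observations):
    """Shannon entropy (in nats) of the empirical distribution of visited observations.

    Assumption: observations are discrete and hashable. Higher entropy means
    visits are spread more evenly, a rough proxy for broader exploration.
    """
    counts = Counter(observations)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())


def constraint_violations(agent_costs, individual_budget, team_budget):
    """Check hypothetical cost budgets at the individual and team levels.

    Returns a list of per-agent violation flags and a single team flag,
    where the team cost is assumed to be the sum of the agents' costs.
    """
    individual = [cost > individual_budget for cost in agent_costs]
    team = sum(agent_costs) > team_budget
    return individual, team


# Hypothetical visit logs: one spread evenly, one concentrated on "a".
uniform_visits = ["a", "b", "c", "d"]
narrow_visits = ["a", "a", "a", "b"]
print(observation_entropy(uniform_visits) > observation_entropy(narrow_visits))

# Each agent stays within its own budget, yet the team budget is exceeded.
print(constraint_violations([3, 3, 3], individual_budget=3, team_budget=6))
```

The second call shows why a joint constraint can carry information that per-agent constraints miss: no single agent exceeds its own budget, but the summed cost does exceed the team budget.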

Citation

Aydeniz, Ayhan Alp, et al. "Safe Entropic Agents under Team Constraints." Proceedings of the 24th International Conference on Autonomous Agents and Multiagent Systems. 2025.

Authors from IE Research Datalab