Partially Observable MARL (Multi-Agent Reinforcement Learning)

POMDP (Partially Observable Markov Decision Process)

Theoretical framework modeling environments where the agent perceives only a partial observation of the true state, requiring probabilistic inference about the hidden state to make optimal decisions.
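
For reference, a POMDP is usually written as a tuple. The symbols below follow common textbook notation and are assumptions relative to this entry, which does not fix any notation itself.

```latex
\[
\mathcal{M} = \langle S, A, T, R, \Omega, O, \gamma \rangle, \qquad
T(s' \mid s, a), \quad R(s, a), \quad O(o \mid s', a), \quad \gamma \in [0, 1)
\]
```

Here $S$ is the set of hidden states, $A$ the actions, $\Omega$ the observations, $T$ the transition model, $O$ the observation model, $R$ the reward, and $\gamma$ the discount factor.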

Observation Space

Set of partial sensory signals that each agent can perceive from the environment, representing incomplete information about the global state of the system.

Belief State

Probability distribution over the hidden state space that an agent maintains and updates from its successive observations to represent its uncertainty about the true state of the environment.
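
The update described above can be sketched as a discrete Bayes filter. This is a minimal illustration, assuming a small tabular model; the array layouts `T[a, s, s2]` and `O[a, s2, z]` and the two-state example are hypothetical choices, not part of the glossary entry.

```python
import numpy as np

def belief_update(belief, action, observation, T, O):
    """One Bayes-filter step for a discrete POMDP (illustrative sketch).

    belief: shape (S,), current distribution over hidden states
    T:      shape (A, S, S), T[a, s, s2] = P(s2 | s, a)   (assumed layout)
    O:      shape (A, S, Z), O[a, s2, z] = P(z | s2, a)   (assumed layout)
    """
    # Prediction step: push the belief through the transition model.
    predicted = belief @ T[action]
    # Correction step: weight each state by the observation likelihood.
    unnormalized = O[action, :, observation] * predicted
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return unnormalized / total

# Hypothetical model: two states, one action, two observations.
T = np.array([[[0.9, 0.1], [0.2, 0.8]]])
O = np.array([[[0.8, 0.2], [0.3, 0.7]]])
b0 = np.array([0.5, 0.5])
b1 = belief_update(b0, action=0, observation=0, T=T, O=O)
```

Starting from a uniform belief, receiving observation 0 shifts probability mass toward state 0, because that observation is more likely there.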

Communication Protocol

Mechanism defining when, how, and what information agents can exchange among themselves to coordinate their actions in a partially observable environment.

Centralized Training with Decentralized Execution

Approach where agents train using global information (states, actions of all agents) but execute their policies individually using only their local observations.

Value Function Factorization

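Technique that decomposes the joint action-value function into per-agent value functions, combined by a simple sum (as in VDN) or by a monotonic mixing function (as in QMIX), so that each agent's greedy local action stays consistent with the greedy joint action and execution can remain decentralized.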
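
A minimal sketch of the additive case (the form used by VDN), assuming hand-picked per-agent utilities. The values in `q_agents` are hypothetical. It shows why the decomposition allows decentralized action selection: maximizing each term separately maximizes the sum.

```python
import itertools
import numpy as np

# Hypothetical per-agent utilities: q_agents[i][a] is agent i's value for action a.
q_agents = [np.array([1.0, 3.0, 2.0]), np.array([0.5, 0.1])]

def q_total(q_agents, joint_action):
    """Additive factorization: the joint value is the sum of per-agent values."""
    return sum(float(q[a]) for q, a in zip(q_agents, joint_action))

# Decentralized selection: each agent maximizes its own utility.
decentralized = tuple(int(np.argmax(q)) for q in q_agents)

# Centralized check: brute-force search over all joint actions.
joint_best = max(
    itertools.product(*(range(len(q)) for q in q_agents)),
    key=lambda joint: q_total(q_agents, joint),
)
```

Both selections agree here. The brute-force search grows exponentially with the number of agents, while the decentralized argmax does not.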

Adversary Modeling

Process of inferring the policies or intentions of other agents from their observed behavior, also known as opponent modeling or agent modeling; it supports decision-making in competitive and cooperative environments alike.

Credit Assignment Problem

Difficulty in correctly attributing the global reward to each agent in a multi-agent system, particularly complex when observations are partial and actions are interdependent.
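
One common heuristic for this problem, not described in the entry itself, is the difference reward: compare the team reward with the reward the team would have received had one agent taken a default action. The sketch below assumes a hypothetical coverage task where the team reward counts distinct targets and `None` means the agent stays idle.

```python
def global_reward(joint_action):
    """Hypothetical team reward: number of distinct targets covered."""
    return len({a for a in joint_action if a is not None})

def difference_rewards(joint_action, default_action=None):
    """Credit for agent i = team reward minus reward with agent i's action replaced."""
    g = global_reward(joint_action)
    credits = []
    for i in range(len(joint_action)):
        counterfactual = list(joint_action)
        counterfactual[i] = default_action  # remove agent i's contribution
        credits.append(g - global_reward(counterfactual))
    return credits

# Two agents cover target "A" redundantly; a third covers "B" alone.
credits = difference_rewards(["A", "A", "B"])
```

The two agents that duplicate each other's target receive no credit, while the agent that alone covers a target receives all the credit for it.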

Joint Action Learning

Method where agents learn to coordinate their actions by explicitly modeling the impact of combined actions on the global reward, despite partial observability.

State Estimation

Algorithmic process allowing an agent to infer the most probable global state from its local observations and its model of the environment.

Information Sharing

Strategy defining how agents distribute and aggregate their local observations to improve the collective knowledge of the environment's state.
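
As one possible aggregation rule, agents can share observation likelihoods and fuse them with Bayes' rule. The sketch below assumes the agents' observations are conditionally independent given the state; the function name `fuse_observations` and the numbers are hypothetical.

```python
import numpy as np

def fuse_observations(prior, likelihoods):
    """Fuse per-agent likelihoods P(z_i | s) into one posterior over states.

    Assumes observations are conditionally independent given the state.
    """
    posterior = np.asarray(prior, dtype=float).copy()
    for likelihood in likelihoods:
        posterior *= np.asarray(likelihood, dtype=float)
    total = posterior.sum()
    if total == 0.0:
        raise ValueError("shared observations are inconsistent with every state")
    return posterior / total

# Hypothetical example: three states, two agents sharing their likelihoods.
prior = np.ones(3) / 3
shared = [np.array([0.9, 0.1, 0.1]), np.array([0.5, 0.5, 0.1])]
fused = fuse_observations(prior, shared)
```

Neither agent alone separates all three states, but the fused posterior concentrates on state 0.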

Local Observation History

Temporal sequence of an agent's past observations, used as additional context to compensate for the lack of information about the current global state.
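
A simple way to use such a history is to keep a fixed-length window of recent observations and feed the concatenated window to the policy. The sketch below is an illustration under that assumption; the class name `ObservationHistory` and the zero padding are hypothetical choices, and recurrent networks are a common alternative.

```python
from collections import deque

import numpy as np

class ObservationHistory:
    """Fixed-length window over an agent's most recent local observations."""

    def __init__(self, obs_dim, length):
        # Pad with zeros so the context has a constant size from the first step.
        self.buffer = deque([np.zeros(obs_dim)] * length, maxlen=length)

    def push(self, obs):
        # Appending to a full deque drops the oldest observation automatically.
        self.buffer.append(np.asarray(obs, dtype=float))

    def context(self):
        # Oldest observation first, newest last.
        return np.concatenate(list(self.buffer))

history = ObservationHistory(obs_dim=2, length=3)
history.push([1.0, 2.0])
history.push([3.0, 4.0])
context = history.context()
```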

Multi-agent Partial Observability

Condition where no individual agent can observe the complete state of the system, requiring coordination and inference strategies to achieve optimal performance.

Decentralized Policy

Decision function for each agent that maps its local observation history to an action, without direct dependence on other agents' information during execution.

Common Knowledge

Information that all agents know, that all agents know that all agents know, and so on at every level of nesting; essential for coordination in partially observable environments.

Coordination Graph

Structure representing interaction dependencies between agents, allowing the global decision problem to be factored into easier-to-solve local subproblems.
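
The factoring can be illustrated on a three-agent chain where the global payoff is a sum of pairwise edge payoffs. The sketch below is a hand-unrolled variable elimination for this specific graph, with hypothetical payoff tables, checked against brute-force search; it is not a general solver.

```python
import itertools

import numpy as np

# Hypothetical chain 0 - 1 - 2 with binary actions.
# edges[(i, j)][a_i, a_j] is the local payoff for that pair of actions.
edges = {
    (0, 1): np.array([[4.0, 0.0], [0.0, 3.0]]),
    (1, 2): np.array([[1.0, 0.0], [0.0, 5.0]]),
}

def global_payoff(joint_action):
    """Global payoff is the sum of the local edge payoffs."""
    return sum(float(f[joint_action[i], joint_action[j]]) for (i, j), f in edges.items())

def solve_chain():
    """Eliminate agents 0 and 2, then let agent 1 pick its best action."""
    best0 = edges[(0, 1)].argmax(axis=0)   # agent 0's best reply to each a1
    m0 = edges[(0, 1)].max(axis=0)         # payoff agent 0 contributes for each a1
    best2 = edges[(1, 2)].argmax(axis=1)   # agent 2's best reply to each a1
    m2 = edges[(1, 2)].max(axis=1)         # payoff agent 2 contributes for each a1
    a1 = int(np.argmax(m0 + m2))
    return (int(best0[a1]), a1, int(best2[a1]))

factored = solve_chain()
brute_force = max(itertools.product(range(2), repeat=3), key=global_payoff)
```

Each elimination step only looks at one edge table, so the cost scales with the size of the local tables rather than with the full joint action space.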
