Dyna-Q Learning
Planning function
Algorithmic component of Dyna-Q that repeatedly applies value updates to stored experiences, refining value estimates without any new interaction with the environment.
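
A minimal sketch of such a planning step, assuming a tabular setting, a deterministic model that stores the last observed (reward, next state) for each visited (state, action) pair, and uniform sampling over those pairs. The names `planning_step`, `Q`, `model`, `alpha`, and `gamma` are illustrative choices, not taken from the source.

```python
import random
from collections import defaultdict


def planning_step(Q, model, n_updates, alpha=0.1, gamma=0.95, rng=random):
    """Replay stored transitions, applying one Q-learning update per replay.

    Q      -- mapping state -> action -> value estimate (hypothetical layout)
    model  -- mapping (state, action) -> (reward, next_state), filled in
              earlier from real experience (assumed deterministic)
    """
    if not model:
        return  # nothing has been experienced yet, so nothing to replay
    seen = list(model.keys())
    for _ in range(n_updates):
        # Pick a previously experienced state-action pair at random.
        state, action = rng.choice(seen)
        reward, next_state = model[(state, action)]
        # Greedy value of the successor; 0.0 if it has no known actions.
        best_next = max(Q[next_state].values(), default=0.0)
        # Standard Q-learning update, driven by the stored transition.
        td_target = reward + gamma * best_next
        Q[state][action] += alpha * (td_target - Q[state][action])


# Usage: two stored transitions, refined with no environment calls.
Q = defaultdict(lambda: defaultdict(float))
model = {("s0", "right"): (0.0, "s1"), ("s1", "right"): (1.0, "goal")}
planning_step(Q, model, n_updates=50, rng=random.Random(0))
```

Because the loop only reads from `model`, the number of replays per real step can be raised or lowered without changing how much the agent interacts with the environment.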