Concept

Actor-Critic Methods

Compared with value-based methods, actor-critic methods model the policy directly, as a probability distribution over actions. Value-based methods work more like a greedy algorithm: at every step we want to maximize the value function, so we need to search through the action space to find the best action. Actor-critic methods instead use an actor to choose actions by sampling from the policy's distribution, and a critic to judge how well those actions performed. The critic's judgment is then used to adjust the distribution. As a result, they do not always choose the action that currently looks best.
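The actor and critic roles described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and does not come from the course material: a single-state problem with two actions and fixed rewards, a softmax actor over hypothetical action preferences, a critic that keeps one running value estimate, and the learning rates. It is one simple way to set up the idea, not a reference implementation.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_actor_critic(rewards, steps=2000, actor_lr=0.1, critic_lr=0.1, seed=0):
    """One-step actor-critic on a single-state problem (illustrative toy setup)."""
    rng = random.Random(seed)
    prefs = [0.0] * len(rewards)  # actor: one preference per action
    value = 0.0                   # critic: estimated value of the single state

    for _ in range(steps):
        probs = softmax(prefs)
        # Actor: sample an action from the policy instead of taking an argmax.
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]

        # Critic: compare the reward with what it expected.
        td_error = reward - value
        value += critic_lr * td_error

        # Actor update: raise the probability of actions that did better
        # than expected, lower it for actions that did worse.
        for a in range(len(prefs)):
            indicator = 1.0 if a == action else 0.0
            prefs[a] += actor_lr * td_error * (indicator - probs[a])

    return softmax(prefs), value


def greedy_action(action_values):
    """Value-based contrast: always pick the highest-valued action."""
    return max(range(len(action_values)), key=lambda a: action_values[a])


# Assumed toy rewards: action 0 pays 0.0, action 1 pays 1.0.
probs, value = train_actor_critic(rewards=[0.0, 1.0])
print("learned policy:", probs)
print("critic estimate:", value)
print("greedy choice:", greedy_action([0.0, 1.0]))
```

In this sketch the learned policy shifts most of its probability onto the better action but stays a distribution, so the worse action can still be sampled occasionally, whereas `greedy_action` always returns the same index.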


Updated 2026-05-01

Tags

Data Science

Foundations of Large Language Models Course

Computing Sciences