Learn Before
Comparing 'Action' in Different Reinforcement Learning Scenarios
Consider two reinforcement learning agents: one playing a chess game and another generating text. Contrast the nature of an 'action' for the chess-playing agent versus the text-generating agent. Specifically, what does a single action entail in each scenario?
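As a starting point for the comparison, the two kinds of action can be sketched as data types. This is a minimal illustration, not a definitive formulation: the class names, the toy vocabulary, and the simplified board (a dictionary from square names to pieces) are all assumptions made for this example. It assumes that a chess action is one complete move that changes the board, and that a text-generation action is one token selected from the vocabulary and appended to the sequence produced so far.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ChessAction:
    """Hypothetical type: one action is one complete move, e.g. e2 -> e4."""
    from_square: str
    to_square: str


@dataclass(frozen=True)
class TokenAction:
    """Hypothetical type: one action is one token chosen from the vocabulary."""
    token: str


# Toy vocabulary, assumed for illustration only; real vocabularies are far larger.
TOY_VOCAB = ("practice", "daily", "read", ".", "<eos>")


def apply_chess_action(board: dict[str, str], action: ChessAction) -> dict[str, str]:
    """Return a new board with the piece moved (legality checks are omitted)."""
    new_board = dict(board)
    new_board[action.to_square] = new_board.pop(action.from_square)
    return new_board


def apply_token_action(state: tuple[str, ...], action: TokenAction) -> tuple[str, ...]:
    """Return the sequence extended by one token; the state is the text so far."""
    if action.token not in TOY_VOCAB:
        raise ValueError(f"token {action.token!r} is not in the toy vocabulary")
    return state + (action.token,)
```

In this sketch the set of available chess actions depends on the position, while the token agent chooses from the same fixed vocabulary at every step and its state simply grows by one token per action.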
Tags
Ch.4 Alignment - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
Policy Formula for LLMs in Reinforcement Learning
A language model is generating a response to the prompt 'The best way to learn a new skill is to...'. So far, it has produced the sequence 'The best way to learn a new skill is to practice'. At this exact point in the generation process, what constitutes the model's next 'action' within a reinforcement learning framework?
Identifying the Action in LLM Fine-Tuning