Short Answer
Analyzing Tokenization Strategies
Imagine two different systems are used to prepare the text 'unhappiness' for a language model. System A breaks it into one unit: ['unhappiness']. System B breaks it into two units: ['un', 'happiness']. Briefly explain one potential advantage of System B's approach over System A's approach, particularly when the model needs to process text it has not seen before.
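To make the two systems concrete, here is a minimal sketch in Python. It is an illustration only: the toy vocabularies, the `<unk>` placeholder, the function names, and the greedy longest-match rule are all assumptions for this example, not a description of how any particular tokenizer is implemented. The extra word 'unkindness' is a hypothetical stand-in for text that neither system saw as a whole word.

```python
def tokenize_whole_word(word, vocab, unk="<unk>"):
    """System A (assumed): a word is one unit, or unknown if not in the vocabulary."""
    return [word] if word in vocab else [unk]


def tokenize_subword(word, vocab, unk="<unk>"):
    """System B (assumed): greedy longest-match split into known subword units."""
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        # Shrink the candidate piece until it matches a vocabulary entry.
        while end > start and word[start:end] not in vocab:
            end -= 1
        if end == start:
            # No known piece starts at this position.
            return [unk]
        tokens.append(word[start:end])
        start = end
    return tokens


# Hypothetical toy vocabularies for illustration.
word_vocab = {"unhappiness", "kindness"}
subword_vocab = {"un", "happiness", "kindness"}

print(tokenize_whole_word("unhappiness", word_vocab))  # ['unhappiness']
print(tokenize_subword("unhappiness", subword_vocab))  # ['un', 'happiness']

# A word absent from both vocabularies as a whole unit:
print(tokenize_whole_word("unkindness", word_vocab))   # ['<unk>']
print(tokenize_subword("unkindness", subword_vocab))   # ['un', 'kindness']
```

In this sketch, the whole-word system can only fall back to an unknown placeholder for 'unkindness', while the subword system reuses the pieces 'un' and 'kindness' that it already knows.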
Updated 2025-10-07
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science