Short Answer

Analyzing Tokenization Strategies

Imagine two different systems are used to prepare the text 'unhappiness' for a language model. System A breaks it into one unit: ['unhappiness']. System B breaks it into two units: ['un', 'happiness']. Briefly explain one potential advantage of System B's approach over System A's approach, particularly when the model needs to process text it has not seen before.
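
To experiment with the scenario before answering, here is a minimal sketch of the two systems. Everything in it is an assumption made for illustration: the toy vocabularies, the `<unk>` placeholder, the function names, and the greedy longest-match rule used for System B (real tokenizers such as BPE or WordPiece learn their vocabularies from data and differ in detail).

```python
# Hypothetical toy vocabularies; neither comes from a real tokenizer.
WORD_VOCAB = {"unhappiness", "happiness", "kind"}
SUBWORD_VOCAB = {"un", "happiness", "kind", "ness"}
UNK = "<unk>"  # assumed placeholder for anything the vocabulary cannot cover


def tokenize_whole_word(word, vocab=WORD_VOCAB):
    """System A (sketch): one unit per word, or UNK if the word is not in the vocabulary."""
    return [word] if word in vocab else [UNK]


def tokenize_subword(word, vocab=SUBWORD_VOCAB):
    """System B (sketch): greedy longest-match-first split into known pieces."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No known piece starts at position i, so give up on the word.
            return [UNK]
    return tokens
```

Calling both functions on 'unhappiness' reproduces the two segmentations from the question. Trying a word that appears in neither vocabulary as a whole, such as 'unkindness', is a useful way to compare how each system behaves on unseen text.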

Updated 2025-10-07

Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences

Analysis in Bloom's Taxonomy

Cognitive Psychology

Psychology

Social Science

Empirical Science

Science