Analyzing a Tokenization Function's Output
A developer has written a function to segment text into a sequence of tokens. The rule for segmentation is to separate the text into individual English words and treat each punctuation mark as its own distinct unit. Analyze the function's output for the given input sentence and identify the primary logical error, explaining your reasoning based on the segmentation rule and the provided reference example.
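As a point of comparison for the analysis, the segmentation rule can be sketched in a few lines of Python. This is only an illustrative baseline, not the developer's function: the names `TOKEN_PATTERN` and `segment` are hypothetical, and the sketch assumes that a "word" is a run of letters, digits, or underscores and that a "punctuation mark" is any single character that is neither a word character nor whitespace.

```python
import re

# Hypothetical baseline, assuming:
#   \w+      -> one token per run of word characters (a "word")
#   [^\w\s]  -> one token per single non-word, non-space character (a punctuation mark)
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def segment(text: str) -> list[str]:
    """Split text into words and individual punctuation marks, dropping whitespace."""
    return TOKEN_PATTERN.findall(text)


print(segment("Hello, world!"))  # ['Hello', ',', 'world', '!']
```

Under this literal reading of the rule, an apostrophe inside a contraction is also split off as its own token. Comparing a function's output against a baseline like this one, token by token, is one way to locate where it departs from the rule.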
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Analysis in Bloom's Taxonomy
Cognitive Psychology
Psychology
Social Science
Empirical Science
Science
Related
A piece of text is segmented into a sequence of smaller units by separating it into individual words and treating each punctuation mark as its own distinct unit. Given this method, which of the following options correctly represents the segmentation of the sentence: "She said, 'It's great!'"?
Applying Word and Punctuation Segmentation