
Example of Tokenization into Words and Punctuation

A simple approach to tokenization is to segment a text into individual English words and punctuation marks. For instance, the text "I love the food here. It's amazing!" can be broken down into the following sequence of tokens: $\left\{ \textrm{I}, \textrm{love}, \textrm{the}, \textrm{food}, \textrm{here}, \textrm{.}, \textrm{It}, \textrm{'s}, \textrm{amazing}, \textrm{!} \right\}$. Note that the contraction "It's" is split into two tokens, "It" and "'s", and each punctuation mark becomes a token of its own.
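The segmentation above can be sketched with a single regular expression. This is only an illustrative sketch: the function name `tokenize` and the pattern are assumptions made here, not a method given in the source, and the pattern is designed around this one example rather than English text in general.

```python
import re

# Hypothetical pattern (an assumption for illustration):
#   ['’]\w+   a clitic such as 's, with a straight or curly apostrophe
#   \w+       a run of word characters
#   [^\w\s]   any single character that is neither word nor whitespace
TOKEN_PATTERN = re.compile(r"['’]\w+|\w+|[^\w\s]")


def tokenize(text: str) -> list[str]:
    """Split text into word and punctuation tokens, in order of appearance."""
    return TOKEN_PATTERN.findall(text)


tokens = tokenize("I love the food here. It's amazing!")
print(tokens)
# ['I', 'love', 'the', 'food', 'here', '.', 'It', "'s", 'amazing', '!']
```

Listing the clitic alternative first is what lets "It's" split into "It" and "'s" rather than into "It", "'", and "s". A rule this simple will mishandle other cases, such as quoted words or hyphenated terms, so it should be read as a demonstration of the idea only.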


Updated 2026-04-14


Tags

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models

Foundations of Large Language Models Course

Computing Sciences