Concept

Tokenization

Tokenization is the process of converting a sequence of text into smaller units, known as tokens. It is a foundational step in Natural Language Processing, and there are many different methods and strategies for tokenizing text, as illustrated in the sketch below.
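
As a minimal sketch of a few common strategies, the following plain-Python example shows whitespace, character-level, and simple word-plus-punctuation tokenization. The function names are illustrative, not from any particular library.

```python
import re

def tokenize_whitespace(text: str) -> list[str]:
    """Split the text on runs of whitespace."""
    return text.split()

def tokenize_characters(text: str) -> list[str]:
    """Treat every character as its own token."""
    return list(text)

def tokenize_words_and_punct(text: str) -> list[str]:
    """Split into word-like chunks and standalone punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)

if __name__ == "__main__":
    sample = "Tokenization splits text into tokens."
    print(tokenize_whitespace(sample))
    # ['Tokenization', 'splits', 'text', 'into', 'tokens.']
    print(tokenize_characters("token"))
    # ['t', 'o', 'k', 'e', 'n']
    print(tokenize_words_and_punct(sample))
    # ['Tokenization', 'splits', 'text', 'into', 'tokens', '.']
```

Each strategy trades off vocabulary size against sequence length: whitespace tokens keep sequences short but require a large vocabulary, while character tokens need only a tiny vocabulary but produce long sequences. Modern large language models typically use subword methods (such as Byte Pair Encoding) that sit between these extremes.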

Updated 2025-10-06

Tags

Data Science

Foundations of Large Language Models Course

Computing Sciences

Ch.1 Pre-training - Foundations of Large Language Models

Foundations of Large Language Models