Learn Before
Tokens vs. Words in NLP
In Natural Language Processing, tokens are the fundamental units of text produced by a process called tokenization. Although the terms 'token' and 'word' are often used interchangeably for simplicity, they are not the same: tokens are the basic building blocks a model actually processes, and they do not always correspond one-to-one with words. A token may be a whole word, a subword fragment (e.g., 'tokenization' might be split into 'token' and 'ization'), or a punctuation mark.
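To make the difference concrete, here is a minimal Python sketch of a rule-based tokenizer. The function name and the regex pattern are illustrative assumptions, not the rules of any real model's tokenizer; the point is only that the token sequence a model sees can differ from the sentence's space-separated words.

import re

def simple_tokenize(text):
    # Illustrative rules (an assumption, not any production tokenizer):
    #   \w+     -> runs of letters/digits ("can", "ignored")
    #   '\w+    -> apostrophe plus letters, e.g. "'t" from "can't"
    #   [^\w\s] -> any other non-space symbol, e.g. "."
    return re.findall(r"\w+|'\w+|[^\w\s]", text)

sentence = "Unforeseen challenges can't be ignored."

print(sentence.split())
# ['Unforeseen', 'challenges', "can't", 'be', 'ignored.']

print(simple_tokenize(sentence))
# ['Unforeseen', 'challenges', 'can', "'t", 'be', 'ignored', '.']

Real LLM tokenizers typically go further, learning subword units from data (e.g., via byte-pair encoding), so even a rare word like 'Unforeseen' can itself be split into several tokens.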
Tags
Ch.1 Pre-training - Foundations of Large Language Models
Foundations of Large Language Models
Foundations of Large Language Models Course
Computing Sciences
Related
Types of Pre-trained Language Models
Pre-training Tasks
Extensions of Pre-trained Models
Foundation Models
Historical Context of Pre-training
Examples of Pre-trained Transformers by Architecture
Paradigm Shift in NLP Driven by Pre-training
Future Research Directions in Large-Scale Pre-training
Role of Pre-training in Developing Latent Abilities
Common Data Sources for Pre-training LLMs
Training Auxiliary Parameters with a Fixed Transformer Model
Synergy of Transformers and Self-Supervised Learning
Core Problem Types in NLP Pre-training
Scope of Introductory Discussions on Pre-training
Application of Self-Supervised Pre-training Across Model Architectures
Scope of Foundational Concepts in Pre-training and Adaptation
Tokens vs. Words in NLP
Self-supervised Pre-training
Data Scale Disparity: Pre-training vs. Fine-tuning
A small biotech company wants to build an AI model to classify protein sequences for a very specific function. They have a high-quality, but small, labeled dataset of 10,000 sequences. They have limited computational resources and a tight deadline. Which of the following strategies represents the most effective and efficient approach for them to develop a high-performing model?
Diagnosing a Flawed Model Development Strategy
The development of large-scale AI models typically involves two distinct stages. Match each characteristic below to the stage it describes.
Scope of Introductory Discussion on Pre-training in NLP
Learn After
A language model needs to process the sentence: 'Unforeseen challenges can't be ignored.' The model breaks the sentence into a sequence of fundamental units for processing. Which of the following sequences best demonstrates the principle that these units are often different from simple, space-separated words?
Analyzing Unexpected Model Behavior
Explaining the Distinction Between Words and Tokens