Learn Before
Concept
Tokenizing
Tokenizing means splitting text into words or sentences. It is usually the first step in text analysis. To use NLTK's tokenizing functions, import:
from nltk.tokenize import sent_tokenize, word_tokenize
sent_tokenize splits the text into sentences, and word_tokenize splits it into words. Both functions return a list of strings.
Updated 2021-08-03
Tags
Python Programming Language
Data Science