September 18, 2020

Tokenization in NLP Tutorial

Tokenization is the process of splitting a chunk of text, a phrase, or a sentence into smaller units called tokens. These smaller units are typically individual words or terms. Tokenization is a pivotal step in extracting information from textual data.
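
As a minimal sketch of what this looks like in practice, the snippet below tokenizes a sentence two ways: a naive whitespace split using Python's built-in str.split(), and NLTK's word_tokenize (this assumes NLTK is installed and its Punkt models have been downloaded).

```python
import nltk

# One-time download of the Punkt tokenizer models used by word_tokenize
nltk.download("punkt")

from nltk.tokenize import word_tokenize

text = "Tokenization is a pivotal step, isn't it?"

# Naive whitespace split: fast, but punctuation stays glued to words
print(text.split())
# ['Tokenization', 'is', 'a', 'pivotal', 'step,', "isn't", 'it?']

# NLTK's word tokenizer splits off punctuation and contractions
print(word_tokenize(text))
# ['Tokenization', 'is', 'a', 'pivotal', 'step', ',', 'is', "n't", 'it', '?']
```

The difference in output shows why tokenization is rarely a simple split on spaces: punctuation and contractions need to be handled as tokens in their own right before any downstream analysis.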