# xtokenizer

**Repository Path**: summry/xtokenizer

## Basic Information

- **Project Name**: xtokenizer
- **Description**: Tokenizer
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-12-27
- **Last Updated**: 2025-07-02

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README

### Usage Sample

```python
from xtokenizer import Tokenizer

# `texts` is an illustrative placeholder; in practice this would be a
# large corpus (any iterable of strings).
texts = ['I love you', 'I love Python', 'you love Python']

# Build the vocabulary from the corpus; tokens seen fewer than
# min_freq times are excluded.
tokenizer = Tokenizer.from_texts(texts, min_freq=5)

sent = 'I love you'
tokens = tokenizer.encode(sent, max_length=6)
# e.g. [101, 66, 88, 99, 102, 0]

words = tokenizer.decode(tokens)
# ['', 'I', 'love', 'you', '', '']
```
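The comments in the sample suggest that `encode` wraps the sentence in begin/end special ids (101 and 102 here) and pads with 0 up to `max_length`, and that `decode` renders those special and padding ids as empty strings.

For readers curious what the `min_freq` cutoff does, here is a minimal plain-Python sketch of frequency-thresholded vocabulary building. It is illustrative only: the function name `build_vocab`, the whitespace tokenization, and the special tokens are assumptions, not xtokenizer's actual API or implementation.

```python
# Illustrative sketch only -- NOT xtokenizer's implementation.
from collections import Counter

def build_vocab(texts, min_freq=5,
                specials=('<pad>', '<bos>', '<eos>', '<unk>')):
    """Map each sufficiently frequent token to an integer id.

    Special tokens take the lowest ids; anything rarer than
    min_freq is left out and would fall back to '<unk>'.
    """
    counts = Counter(tok for text in texts for tok in text.split())
    vocab = {tok: i for i, tok in enumerate(specials)}
    for tok, freq in counts.most_common():
        if freq >= min_freq:
            vocab[tok] = len(vocab)
    return vocab

# Example: with min_freq=2, only 'love' survives the cutoff here.
print(build_vocab(['I love you', 'we love python'], min_freq=2))
# {'<pad>': 0, '<bos>': 1, '<eos>': 2, '<unk>': 3, 'love': 4}
```

Assigning special tokens the lowest ids first is a common convention (padding as id 0 in particular), which is consistent with the trailing 0 in the encoded sample above.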