7 packages found. Page 1 of 1.

Name Version Votes Popularity? Description Maintainer Last Updated
python-spacy-alignments 0.9.1-1 2 0.12 A spaCy package for the Rust tokenizations library jnphilipp 2023-09-25 12:25 (UTC)
r-tokenizers 0.3.0-1 1 0.00 Fast, Consistent Tokenization of Natural Language Text BioArchLinuxBot 2022-12-22 12:02 (UTC)
uctodata 0.10.1-1 0 0.00 An advanced rule-based (regular-expression) and unicode-aware tokenizer for various languages. Tokenization is an essential first step in any NLP pipeline. This package contains the necessary data. proycon 2024-04-16 08:22 (UTC)
ucto 0.32.1-1 1 0.00 An advanced rule-based (regular-expression) and unicode-aware tokenizer for various languages. Tokenization is an essential first step in any NLP pipeline. proycon 2024-03-21 11:50 (UTC)
python-stanza 1.8.2-1 0 0.00 Stanford NLP Python library for tokenization, sentence segmentation, NER, and parsing of many human languages yochananmarqos 2024-04-20 20:43 (UTC)
python-fugashi 1.2.1-3 0 0.00 Cython MeCab wrapper for fast, pythonic Japanese tokenization and morphological analysis atticf 2023-01-23 08:56 (UTC)
lib32-libedit 20190324_3.1-1 13 0.00 Command line editor library providing generic line editing, history, and tokenization functions (32-bit) orphan 2019-05-12 16:14 (UTC)

7 packages found. Page 1 of 1.