text_cleaner

AI Source Code

1.0.0

Text cleaning with Natural Language Processing

Status: in progress

Python library using Natural Language Processing (NLP) to easily and quickly clean text.

Automatically tokenize text, remove punctuation and special characters, normalize case, remove stopwords in various languages, stem words, and more with this simple yet customizable library.

Usage

Install:

pip install pytext_cleaner

Example:

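from pytext_cleaner import TextCleaner

# Sample input; replace with your own text
string_to_clean = "Ciao, mondo! Bonjour le monde, 123."

cleaner = TextCleaner()
# Cleaning steps to apply
cleaner.settings = ['rm_punctuation', 'rm_numeric', 'lowerize']
# Languages of the input text, used by language-dependent steps
cleaner.lang_setting = ['italian', 'french']
clean_text = cleaner.clean_text(string_to_clean)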

Customize

Default settings: ['rm_punctuation', 'rm_numeric', 'lowerize', 'rm_stopwords']

Available settings:

rm_punctuation
rm_numeric
lowerize
rm_stopwords
stem_words
rm_long_words

Default language settings: ['english']
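
To make the default settings concrete, here is a minimal standard-library sketch of what steps like these typically do. It is not pytext_cleaner's implementation: the function name sketch_clean, the tiny DEMO_STOPWORDS set, and the order of the steps are all assumptions made for illustration.

```python
import string

# Hypothetical stopword list for illustration only; the library's real
# per-language lists are not shown in this README.
DEMO_STOPWORDS = {"the", "a", "an", "and", "of", "in", "is", "to"}

DEFAULT_SETTINGS = ("rm_punctuation", "rm_numeric", "lowerize", "rm_stopwords")


def sketch_clean(text, settings=DEFAULT_SETTINGS):
    """Illustrative stand-in for the cleaning steps (not the library's code)."""
    if "rm_punctuation" in settings:
        # Replace every ASCII punctuation character with a space.
        table = str.maketrans(string.punctuation, " " * len(string.punctuation))
        text = text.translate(table)
    # Split on whitespace to get tokens.
    tokens = text.split()
    if "rm_numeric" in settings:
        # Drop tokens made only of digits.
        tokens = [t for t in tokens if not t.isdigit()]
    if "lowerize" in settings:
        tokens = [t.lower() for t in tokens]
    if "rm_stopwords" in settings:
        tokens = [t for t in tokens if t.lower() not in DEMO_STOPWORDS]
    return " ".join(tokens)


print(sketch_clean("The 3 cats, and a dog!"))  # prints "cats dog"
```

Passing a shorter settings tuple, for example settings=("lowerize",), skips the other steps in this sketch, which mirrors how cleaner.settings selects the steps to run.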

To include or exclude stopwords:

cleaner.white_list = ['words', 'to', 'include']
cleaner.black_list = ['words', 'to', 'exclude']
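
The README does not spell out the exact semantics of these two lists. The sketch below shows one plausible reading, in which white_list words are added to the stopword set and black_list words are removed from it. This reading is an assumption, and the helper effective_stopwords is hypothetical, not part of the library.

```python
def effective_stopwords(base, white_list=(), black_list=()):
    """Combine a base stopword set with user lists.

    Assumption (not confirmed by the README): white_list words are added
    to the stopwords, black_list words are taken out of them.
    """
    return (set(base) | set(white_list)) - set(black_list)


base = {"the", "a", "not"}
stopwords = effective_stopwords(base, white_list=["please"], black_list=["not"])
print(sorted(stopwords))  # prints ['a', 'please', 'the']
```

Under this reading, "not" would survive cleaning while "please" would be dropped. Check the library's source before relying on either behavior.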

Change return type:

By default, text_cleaner returns the cleaned text as a string.

To return a list of tokens instead, pass tokenize=True:

cleaner.clean_text(string_to_clean, tokenize=True)
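
The difference between the two return types can be sketched as follows. The helper finish is hypothetical and only illustrates the string-versus-list behavior described above. It is not how clean_text is implemented.

```python
def finish(tokens, tokenize=False):
    """Return the tokens as a list, or joined into a single string."""
    return list(tokens) if tokenize else " ".join(tokens)


tokens = ["hello", "world"]
print(finish(tokens))                 # prints "hello world"
print(finish(tokens, tokenize=True))  # prints ['hello', 'world']
```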

Additional Information

Version: 1.0.0
Type: AI Source Code
Update Time: 2025-09-02
Size: 5.53KB
Source: GitHub
