Lingua::RU::Antimat - Perl module for removing Russian slang from chats, guestbooks, etc.
use POSIX qw(locale_h);
use Lingua::RU::Antimat;
use locale;
setlocale(LC_CTYPE, "ru_RU.CP1251");
$dirty_text='text with slang';
$mat= Lingua::RU::Antimat->new;
#load dictionary with additional words
$mat->load_dict('/home/www/badwords');
$mat->set_bip('Sorry!');
$clean_text=$mat->remove_slang($dirty_text);
Detailed Russian documentation and a tutorial are available at http://www.tcen.ru/antimat
This module removes Russian slang from a string. 'Mat' is the Russian name for such bad words, which is why this module is called Antimat.
If new() is called without any arguments, the module uses templates for text in the win-1251 encoding.
If your text is in the KOI8-R encoding, set $codepage to 'koi8'.
Examples:
$mat=Lingua::RU::Antimat->new; #for text in win-1251
$mat=Lingua::RU::Antimat->new('koi8'); #for text in KOI8-R
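If your application works with UTF-8 text, one possible approach is to convert it to the legacy encoding before filtering and back afterwards. The following is only a sketch under that assumption: the helper names to_legacy_bytes and from_legacy_bytes and the %encode_name table are hypothetical and not part of this module; Encode is the standard core module. The module itself is called only if it is installed.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Map the constructor argument to an Encode encoding name.
# 'koi8' is the argument documented above; 'cp1251' and 'koi8-r'
# are standard Encode names. This table is an illustration only.
my %encode_name = (
    ''     => 'cp1251',    # new() without arguments: win-1251
    'koi8' => 'koi8-r',    # new('koi8'): KOI8-R
);

# Hypothetical helper: UTF-8 bytes -> bytes in the legacy encoding.
sub to_legacy_bytes {
    my ($utf8_bytes, $codepage) = @_;
    my $chars = decode('UTF-8', $utf8_bytes);
    return encode($encode_name{$codepage // ''}, $chars);
}

# Hypothetical helper: bytes in the legacy encoding -> UTF-8 bytes.
sub from_legacy_bytes {
    my ($legacy_bytes, $codepage) = @_;
    my $chars = decode($encode_name{$codepage // ''}, $legacy_bytes);
    return encode('UTF-8', $chars);
}

# A harmless Russian word ("privet") given as UTF-8 bytes.
my $utf8_input = "\xD0\xBF\xD1\x80\xD0\xB8\xD0\xB2\xD0\xB5\xD1\x82";
my $bytes      = to_legacy_bytes($utf8_input, '');

# Filter only if the module is available; the calls are those from the synopsis.
if (eval { require Lingua::RU::Antimat; 1 }) {
    my $mat = Lingua::RU::Antimat->new;    # win-1251 templates
    $bytes = $mat->remove_slang($bytes);
}

print from_legacy_bytes($bytes, ''), "\n";
```

For KOI8-R text, pass 'koi8' both to the helpers and to new().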
Examples:
$mat->set_bip(''); #just strip out slang
$mat->set_bip('I am sorry!'); #long but also correct
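To illustrate the difference between an empty and a non-empty replacement string, here is a stand-in written with a plain regular expression. It is not the module's implementation: replace_words and @bad_words are hypothetical names with placeholder English words, and the real module relies on its own templates and dictionary.

```perl
use strict;
use warnings;

# Placeholder word list for this illustration only (not the module's dictionary).
my @bad_words = ('darn', 'heck');

# Hypothetical stand-in: replace each listed word with the given string.
sub replace_words {
    my ($text, $bip) = @_;
    my $pattern = join '|', map { quotemeta($_) } @bad_words;
    $text =~ s/\b(?:$pattern)\b/$bip/gi;
    return $text;
}

my $dirty = 'Well, darn it';
print replace_words($dirty, 'Sorry!'), "\n";    # word replaced by the string
print replace_words($dirty, ''), "\n";          # word simply removed
```

In this stand-in, an empty replacement leaves the surrounding whitespace in place, so the result may contain a double space.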
Detailed Russian documentation is available at http://www.tcen.ru/antimat
perllocale manpage
Andrey Skorohod, marlenus@marlenus.com for his bug reports.
Vladimir Zhdanov, vovka@lg.kamaz.net for his bug report.
Andrey Sharapov, Sharapov@tut.by for his suggestions.
Yury Voloshin, xtc@norilsk.net for his bug report and suggestions.
Thanks!
Ilya Soldatkin, arc@tcen.ru
Drop me a line if you deploy this module on your site. Think of this as a small contribution to my efforts in writing and supporting this module. I cannot improve this module unless I know that someone uses it.
Copyright 2001 Ilya Soldatkin. All rights reserved.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.