2009/2/9 <[email protected]>:
> I know there is probably a simple solution for this, but can anyone help me
> with code to filter out messages with extended characters like these:
>
> àïåëüñèí
> êàïèëêà
> òàáëåòêè
> ïåðåêèñü
> òåëåôîí
> òåòðàäè
> êàðàíäàøè
> îá¸ðòêè îò êàíôåò
> êîðîáêà îò òåëåôîíà
> âàòà
> âàçà
> äèñêè
> êíèãè è òåòðàäè
> È åùå êó÷à áóìàæåê è âñÿêîãî õëàìà
>
> Just something to dump any messages with these characters. Is there a simple
> way?
>
Since you don't know how they are encoded, you could simply guess: try
decoding them with every available encoding and print each result as
UTF-8. :-)
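use strict;
use warnings;
use Encode;
my $your_string = do { local $/; <STDIN> };  # raw message bytes (read from STDIN here)
my @list = Encode->encodings(":all");        # names of all available encodings
for my $encoding (@list) {
    # some encodings die on bad input (e.g. UTF-16 without a BOM), so trap that
    my $decoded = eval { decode($encoding, $your_string) };
    next unless defined $decoded;
    print "decoded with $encoding:\n";
    print encode("utf8", $decoded), "\n";
}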
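
If all you want is to drop such messages rather than work out their
encoding, a rough sketch would be to reject anything containing a byte
outside 7-bit ASCII. The helper name has_extended_chars and the sample
@messages list are made up for illustration, and the second sample assumes
the quoted text was Latin-1 bytes. Note that this also drops legitimate
accented text, so it only fits if your mail is expected to be plain ASCII.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the string contains any character
# outside the 7-bit ASCII range (0x00-0x7F).
sub has_extended_chars {
    my ($bytes) = @_;
    return $bytes =~ /[^\x00-\x7F]/ ? 1 : 0;
}

# Illustrative input only; in practice these would be your message bodies.
my @messages = (
    "plain ASCII message",
    "\xF2\xE5\xEB\xE5\xF4\xEE\xED",   # assumed bytes behind one quoted sample
);

# Keep only the messages without extended characters.
my @kept = grep { !has_extended_chars($_) } @messages;
print "$_\n" for @kept;
```

To be less strict, you could count the matches and only drop a message when
extended characters make up most of it.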
--
Jeff Peng
Office: +86-20-38350822
AIM: jeffpang
www.dtonenetworks.com