Doug Cacialli <[email protected]> asked:
> Does anyone have any ideas how I can make the second block of code
> work? Or otherwise accomplish the task without opening the .txt file
> twice?
How large are your data files? If your available memory is much larger than
your largest file, you can probably get away with slurping the whole file into
a scalar and then converting its encoding if needed, possibly like this:
#!/usr/bin/perl
use strict;
use warnings;
use Encode;

my $file = 'test.txt';

# Read raw bytes; ':raw' keeps CRLF translation from mangling UTF-16 data.
open( my $fh, '<:raw', $file ) or die "Can't open '$file': $!";
my $data = do {
    local $/ = undef;    # slurp mode: read the whole file in one go
    <$fh>;
};
close( $fh );

if( $data =~ m/^\xff\xfe/ || $data =~ m/^\xfe\xff/ ){
    print "input is UTF-16 w/ BOM\n";
    # 'UTF-16' takes the byte order from the BOM and strips it
    $data = decode('UTF-16',$data);
} elsif( $data =~ m/^[^\x00]\x00/ ){
    print "input is probably little-endian utf-16 w/o BOM\n";
    $data = decode('UTF-16LE',$data);
} elsif( $data =~ m/^\x00[^\x00]/ ){
    print "input is probably big-endian utf-16 w/o BOM\n";
    $data = decode('UTF-16BE',$data);
}
# anything else is left untouched and treated as a single-byte encoding

# split ' ' skips leading whitespace, so there is no empty first field
# and no need to chomp
my @words = split ' ', $data;
print "input file has " . scalar( @words ) . " words\n";
__END__
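
If the files turn out to be too large to slurp, something along these lines
might work instead. It is only a sketch: it opens the file once, looks at the
first two bytes, seeks back and pushes a decoding layer onto the same handle,
then counts words line by line. The names sniff_and_decode and count_words are
made up for this example, and the guessing rules are the same rough heuristics
as in the script above, so treat the result as a guess rather than a reliable
detection.

```perl
use strict;
use warnings;

# Hypothetical helper: guess the encoding from the first two bytes of a
# filehandle opened with ':raw', then push a matching decoding layer.
# Returns the encoding name, or undef if the data does not look like UTF-16.
sub sniff_and_decode {
    my ($fh) = @_;
    my $n = read( $fh, my $head, 2 );
    die "read failed: $!" unless defined $n;

    # ($encoding, number of BOM bytes to skip)
    my ( $enc, $skip ) =
          $head =~ /^\xff\xfe/    ? ( 'UTF-16LE', 2 )
        : $head =~ /^\xfe\xff/    ? ( 'UTF-16BE', 2 )
        : $head =~ /^[^\x00]\x00/ ? ( 'UTF-16LE', 0 )
        : $head =~ /^\x00[^\x00]/ ? ( 'UTF-16BE', 0 )
        :                           ( undef,      0 );

    # Rewind to just after the BOM (or to the start if there is none).
    seek( $fh, $skip, 0 ) or die "seek failed: $!";
    binmode( $fh, ":encoding($enc)" ) if defined $enc;
    return $enc;
}

# Hypothetical helper: count whitespace-separated words, opening the
# file only once and reading it one line at a time.
sub count_words {
    my ($file) = @_;
    open( my $fh, '<:raw', $file ) or die "Can't open '$file': $!";
    sniff_and_decode($fh);
    my $count = 0;
    while ( my $line = <$fh> ) {
        my @words = split ' ', $line;
        $count += @words;
    }
    close($fh);
    return $count;
}
```

You would then call it as count_words('test.txt') and print the result.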
HTH,
Thomas