On Sat, Nov 10, 2007 at 09:11:27PM +1300, Richard Toohey wrote:
> On 10/11/2007, at 10:05 AM, Daniel Ouellet wrote:
>
> >Otto Moerbeek wrote:
> >>stat -s gives the raw info in one go. Some shell script hacking
> >>should make it easy to detect sparse files.
> >
> >Thanks Otto for the suggestion. That might help until it can be
> >addressed for good. It would help speed up some of it. (;>
> >
>
> This looked interesting (curiosity killed the cat?), so I started
> looking at sparse files (not heard of them before.)
>
> Is this a sparse file?
yes.
>
> # dd if=/dev/zero of=sparsefile bs=1024 seek=10240 count=0
> 0+0 records in
> 0+0 records out
> 0 bytes transferred in 0.000 secs (0 bytes/sec)
> # ls -lh
> [--cut--]
> -rw-r--r-- 1 root wheel 10.0M Nov 11 08:43 sparsefile
> # du -hsc sparsefile
> 32.0K sparsefile
> 32.0K total
> # du sparsefile
> 64 sparsefile
> # stat -s sparsefile
> st_dev=7 st_ino=51969 st_mode=0100644 st_nlink=1 st_uid=0 st_gid=0
> st_rdev=0 st_size=10485760 st_atime=1194723829 st_mtime=1194723829
> st_ctime=1194723829 st_blksize=16384 st_blocks=64 st_flags=0
>
> So because blocks allocated = 64, and block size is (usually) 512
> bytes => file is 32K on disk (but ls and others will report the 10MB
> size.)
>
> So if you scanned whatever director(y|ies) you are interested in,
>
> If st_size > (st_blocks * 512) Then
> *** this may be a sparse file?
>
> (BUT - blocksize of 16384 is reported so I must be missing something?)
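Yeah, look at stat(2):

	int64_t   st_blocks;  /* blocks allocated for file */
	u_int32_t st_blksize; /* optimal file sys I/O ops blocksize */

Actually, st_blocks is counted in disk sectors (512 bytes), to be
precise. st_blksize is only the preferred I/O size, so the 16384 you
see does not enter the calculation.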
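
To make the units concrete, here is a minimal Python sketch of the same
arithmetic. It is not from this thread: the function names are made up
for illustration, and it assumes st_blocks counts 512-byte units and
that os.stat() exposes st_blocks (true on Unix-like systems, not on
Windows).

```python
import os

SECTOR_SIZE = 512  # assumed unit of st_blocks; st_blksize is NOT used here


def allocated_bytes(st_blocks):
    """Bytes actually allocated on disk for a file with this st_blocks."""
    return st_blocks * SECTOR_SIZE


def may_be_sparse(st_size, st_blocks):
    """True when the file's length exceeds the space allocated for it."""
    return st_size > allocated_bytes(st_blocks)


def check_path(path):
    """Apply the same comparison to a real file via os.stat()."""
    st = os.stat(path)
    return may_be_sparse(st.st_size, st.st_blocks)


if __name__ == "__main__":
    # Numbers taken from the quoted stat -s output.
    print(allocated_bytes(64))          # 32768, the 32K that du -h reports
    print(may_be_sparse(10485760, 64))  # True
```

A True result only means the file may be sparse; it is a heuristic, not
proof.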
I don't read perl, so I cannot comment on the script below.
-Otto
>
> A stab at it in Perl (lifted from Perl Cookbook):
>
> use strict;
> use warnings;
> use File::Find;
> sub process_file {
>     my $f = $File::Find::name;
>     # stat() returns 13 fields; $blocks is usually in 512-byte units.
>     my ($dev, $ino, $mode, $nlink, $uid, $gid, $rdev, $size,
>         $atime, $mtime, $ctime, $blksize, $blocks) = stat($f);
>     if ($blocks * 512 < $size) {
>         print "\t$f => SZ: $size BLSZ: $blksize BLKS: $blocks\n";
>         print "\t" . -s $f;
>         print "\n";
>     }
> }
> find(\&process_file, ("/home/sparse-files"));
>
> The output is:
>
> # perl check.pl
> /home/sparse-files/sparsefile => SZ: 10485760 BLSZ: 16384 BLKS: 64
> 10485760
>
> Thanks.
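
For anyone who would rather not read Perl, a rough Python equivalent of
the quoted directory scan might look like the sketch below. It is only
an illustration under the same assumption (st_blocks in 512-byte units,
Unix-like system); find_sparse_candidates is a hypothetical name, and
the directory is the one used in the quoted script.

```python
import os
import stat


def find_sparse_candidates(top):
    """Yield (path, st_size, st_blocks) for regular files that look sparse."""
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not stat.S_ISREG(st.st_mode):
                continue  # only regular files are of interest
            # Assumed: st_blocks is counted in 512-byte units.
            if st.st_size > st.st_blocks * 512:
                yield path, st.st_size, st.st_blocks


if __name__ == "__main__":
    for path, size, blocks in find_sparse_candidates("/home/sparse-files"):
        print(f"\t{path} => SZ: {size} BLKS: {blocks}")
```

As with the Perl version, a hit only means the file may be sparse.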