At 4:37 PM -0500 4/25/10, Harry Putnam wrote:

I do have another question that was only in the background of my first
post.

Is there a canonical way to read a bunch of -f type files into a hash?

I take it you mean add the file names to a hash, not the file contents.


I want the end name `$_' on one side and full name `File::Find::name'
on the other...

The "end name" is called the "file name". What comes before the file name is referred to as the "directory" or "directory path". The whole string is referred to as the "path" or "full path".
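The standard way to split a full path into those pieces is the core File::Basename module. A minimal sketch (the path here is just an example):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;

my $path = '/home/user/report.txt';    # hypothetical full path
my $name = basename($path);            # file name: 'report.txt'
my $dir  = dirname($path);             # directory path: '/home/user'

print "name=$name dir=$dir\n";
```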

what happens is the keys start slaughtering each other if you get it
the wrong way round... and even when it ends up right... I wonder
there may still be some chance of names canceling

Hash keys must be unique. If you are worried about key collision (two keys the same), always test whether a key already exists before inserting it into a hash.
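A sketch of that collision test, using `exists` before each insert (the paths are made up for illustration; note the two files named `hosts`):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical file list with a duplicate basename in two directories
my @paths = ('/etc/hosts', '/var/backup/hosts', '/etc/passwd');

my %by_basename;
for my $path (@paths) {
    (my $base = $path) =~ s{.*/}{};    # strip the directory portion
    if (exists $by_basename{$base}) {
        warn "collision: '$base' already maps to $by_basename{$base}\n";
        next;                          # keep the first entry; don't overwrite
    }
    $by_basename{$base} = $path;
}

print scalar(keys %by_basename), " unique basenames\n";
```

Keying the hash the other way around (basename on the left) is exactly where the collisions come from, which is why the test matters.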

Doing it like this:

    $h1{$File::Find::name} = $_

So far, has agreed with the count I see from `wc -l'.  I'd like to
know for sure though if that is a reliable way to do it?


That is the reliable way to generate a hash of all files in a directory tree. Since full paths must be unique on a system (else how could the operating system find the file?), a full path specification makes a safe hash key. The converse is not true: because of links and aliases, two different full path strings can refer to the same file.
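Putting it together with File::Find, a self-contained sketch (it builds a small throwaway tree with File::Temp so it can actually run; in your script you'd pass your own starting directory):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a tiny throwaway tree so the example is self-contained
my $dir = tempdir(CLEANUP => 1);
mkdir "$dir/sub" or die "mkdir: $!";
for my $f ("$dir/a.txt", "$dir/sub/b.txt") {
    open my $fh, '>', $f or die "open $f: $!";
    close $fh;
}

my %files;                             # key: unique full path, value: basename
find(sub {
    return unless -f;                  # plain files only (your -f test)
    $files{$File::Find::name} = $_;    # inside wanted(), $_ is the basename
}, $dir);

printf "%d files found\n", scalar keys %files;
```

Inside the `wanted` callback, `$File::Find::name` is the full path and `$_` is the file name relative to the current directory, which is why this key/value orientation is the safe one.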

And is there some kind of handy way to turn a hash into a scalar like
can be done to arrays with File::Slurp

Arrays can be transformed to scalars by the join function. File::Slurp can either return the contents of a file as a single scalar or as an array, one line per array element. It doesn't really turn an array into a scalar.
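For completeness, the join step looks like this (the array contents are just an example):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @lines  = ("first line\n", "second line\n");
my $scalar = join '', @lines;    # array -> single scalar, no separator

print $scalar;
```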


What I'm after is a way to grep the list of full names using the
endnames of a similar but not identical list, in order to discover
which names are in the longer list, but not the shorter list.


Hashes are the best data structure to use for this purpose.


Writing it to file is one way.  And it seem likely to be the better
way really since the lists can be pretty long.

I wondered if this can all be done in the script with hashes somehow.


I suggest you try implementing an algorithm using hashes. Your method (looking for substrings in a string containing all file names) is needlessly inefficient and error-prone: a short file name can match as a substring of an unrelated longer path.
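A sketch of the hash-based comparison, under the assumption that the longer list holds full paths and the shorter list holds bare file names (both lists here are invented for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @long  = qw(/a/one.txt /a/two.txt /b/three.txt);   # longer list: full paths
my @short = qw(one.txt three.txt);                    # shorter list: basenames

my %seen = map { $_ => 1 } @short;     # constant-time lookup table

my @only_in_long = grep {
    (my $base = $_) =~ s{.*/}{};       # reduce full path to its basename
    !$seen{$base};                     # keep paths the short list lacks
} @long;

print "@only_in_long\n";
```

Each lookup is a single hash probe instead of a scan over one big string, so it stays fast even when the lists are long, and an exact key match can never accidentally hit a substring.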

--
Jim Gibson
[email protected]

--
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
http://learn.perl.org/

