At 4:37 PM -0500 4/25/10, Harry Putnam wrote:

I do have another question that was only in the background of my first
post.

Is there a canonical way to read a bunch of -f type files into a hash?

I take it you mean add the file names to a hash, not the file contents.


I want the end name `$_' on one side and full name `File::Find::name'
on the other...

The "end name" is called the "file name". What comes before the file name is referred to as the "directory" or "directory path". The whole string is referred to as the "path" or "full path".
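The standard way to split a full path into those pieces is the core File::Basename module. A minimal sketch (the path here is just an example):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Basename;

my $path = '/home/user/report.txt';    # hypothetical full path
my $name = basename($path);            # file name: 'report.txt'
my $dir  = dirname($path);             # directory path: '/home/user'

print "name=$name dir=$dir\n";
```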

what happens is the keys start slaughtering each other if you get it
the wrong way round... and even when it ends up right... I wonder
there may still be some chance of names canceling

Hash keys must be unique. If you are worried about key collision (two keys the same), always test whether a key already exists before inserting it into a hash.
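A sketch of that collision test, using `exists` before each insert (the paths are made up for illustration; note the two files named `hosts`):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical file list with a duplicate basename in two directories
my @paths = ('/etc/hosts', '/var/backup/hosts', '/etc/passwd');

my %by_basename;
for my $path (@paths) {
    (my $base = $path) =~ s{.*/}{};    # strip the directory portion
    if (exists $by_basename{$base}) {
        warn "collision: '$base' already maps to $by_basename{$base}\n";
        next;                          # keep the first entry; don't overwrite
    }
    $by_basename{$base} = $path;
}

print scalar(keys %by_basename), " unique basenames\n";
```

Keying the hash the other way around (basename on the left) is exactly where the collisions come from, which is why the test matters.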

Doing it like this:

    $h1{$File::Find::name} = $_

So far, has agreed with the count I see from `wc -l'.  I'd like to
know for sure though if that is a reliable way to do it?


That is the reliable way to generate a hash of all files in a directory tree. Since full paths must be unique on a system (else how could the operating system find the file?), a full path specification makes a safe hash key. The converse is not true: because of links and aliases, two different full path strings can refer to the same file.
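Putting it together with File::Find, a self-contained sketch (it builds a small throwaway tree with File::Temp so it can actually run; in your script you'd pass your own starting directory):

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a tiny throwaway tree so the example is self-contained
my $dir = tempdir(CLEANUP => 1);
mkdir "$dir/sub" or die "mkdir: $!";
for my $f ("$dir/a.txt", "$dir/sub/b.txt") {
    open my $fh, '>', $f or die "open $f: $!";
    close $fh;
}

my %files;                             # key: unique full path, value: basename
find(sub {
    return unless -f;                  # plain files only (your -f test)
    $files{$File::Find::name} = $_;    # inside wanted(), $_ is the basename
}, $dir);

printf "%d files found\n", scalar keys %files;
```

Inside the `wanted` callback, `$File::Find::name` is the full path and `$_` is the file name relative to the current directory, which is why this key/value orientation is the safe one.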

And is there some kind of handy way to turn a hash into a scalar like
can be done to arrays with File::Slurp

Arrays can be transformed to scalars by the join function. File::Slurp can either return the contents of a file as a single scalar or as an array, one line per array element. It doesn't really turn an array into a scalar.
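For completeness, the join step looks like this (the array contents are just an example):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @lines  = ("first line\n", "second line\n");
my $scalar = join '', @lines;    # array -> single scalar, no separator

print $scalar;
```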


What I'm after is a way to grep the list of full names using the
endnames of a similar but not identical list, in order to discover
which names are in the longer list, but not the shorter list.


Hashes are the best data structure to use for this purpose.


Writing it to file is one way.  And it seem likely to be the better
way really since the lists can be pretty long.

I wondered if this can all be done in the script with hashes somehow.


I suggest you try implementing an algorithm using hashes. Your method (looking for substrings in a string containing all file names) is needlessly inefficient and error-prone: a short file name can match as a substring of an unrelated longer path.
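A sketch of the hash-based comparison, under the assumption that the longer list holds full paths and the shorter list holds bare file names (both lists here are invented for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @long  = qw(/a/one.txt /a/two.txt /b/three.txt);   # longer list: full paths
my @short = qw(one.txt three.txt);                    # shorter list: basenames

my %seen = map { $_ => 1 } @short;     # constant-time lookup table

my @only_in_long = grep {
    (my $base = $_) =~ s{.*/}{};       # reduce full path to its basename
    !$seen{$base};                     # keep paths the short list lacks
} @long;

print "@only_in_long\n";
```

Each lookup is a single hash probe instead of a scan over one big string, so it stays fast even when the lists are long, and an exact key match can never accidentally hit a substring.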

--
Jim Gibson
[email protected]

--
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
http://learn.perl.org/

