Frank Steinmetzger wrote:
> On Fri, Sep 06, 2024 at 01:21:20PM +0100, Michael wrote:
>
>>>> find path-to-directory/ -type f | xargs md5sum > digest.log
>>>>
>>>> then to compare with a backup of the same directory you could run:
>>>>
>>>> md5sum -c digest.log | grep FAILED
> I had a quick look at the manpage: with md5sum --quiet you can omit the grep 
> part.
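> For example, assuming GNU coreutils md5sum, the check step should reduce
> to just:
>
> md5sum --quiet -c digest.log
>
> which only reports files that fail verification (or cannot be read).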
>
>>>> Someone more knowledgeable should be able to knock out some clever python
>>>> script to do the same at speed.
> And that is exactly what I have written for myself over the last 11 years. I 
> call it dh (short for dirhash). As I described in the previous mail, I use 
> it to create one hash file per directory. But it also supports one hash 
> file per data file and – a rather new feature – one hash file at the root of 
> a tree. Have a look here: https://github.com/felf/dh
> Clone the repo or simply download the one file and put it into your path.
>
>>> I'll be honest here, on two points.  I'd really like to be able to do
>>> this, but I have no idea where or how to even start.  My setup is for
>>> series-type videos.  In the parent directory, where I'd like a tool to
>>> start, there are about 600 directories.  On a few occasions, there is
>>> another directory inside that one.  The directory under the parent is
>>> named after the series.
> By default, my tool ignores directories which have subdirectories. It 
> only hashes files in dirs that have no subdirs (leaves in the tree). But 
> this can be overridden with the -f option.
>
> My tool also has an option to skip a number of directories and to process 
> only a certain number of directories.
>
>>> Sometimes I have a subdirectory that has temp files:
>>> new files I have yet to rename, files I'm considering using to replace
>>> ones in the main series directory, etc.  I wouldn't mind having a file
>>> with a checksum for each video in the top directory, and even one in
>>> the subdirectory.  As an example:
>>>
>>> TV_Series/
>>>
>>> ├── 77 Sunset Strip (1958)
>>> │   └── torrent
>>> ├── Adam-12 (1968)
>>> ├── Airwolf (1984)
> So with my tool you would do
> $ dh -f -F all TV_Series
> `-F all` causes a checksum file to be created for each data file.
>
>>> What
>>> I'd like is a program that would generate checksums for each file under,
>>> say, 77 Sunset, and that could skip or include the directory under it.
> Unfortunately I don’t have a skip feature yet that skips specific 
> directories. I could add a feature that looks for a marker file and then 
> skips that directory (and its subdirs).
>

I was running the command again, and when I checked on it, it had
stopped with this error.



  File "/root/dh", line 1209, in <module>
    main()
  File "/root/dh", line 1184, in main
    directory_hash(dir_path, '', dir_files, checksums)
  File "/root/dh", line 1007, in directory_hash
    os.path.basename(old_sums[filename][1])
                     ~~~~~~~~^^^^^^^^^^
KeyError: 'Some Video.mp4'



I was doing a second run because I updated some files.  So it was
skipping some files and creating new checksums for the new ones.  This
is the command I was running, which may not be the best way:


/root/dh -c -f -F 1Checksums.md5 -v


Does that make any sense to you?  That's all it spit out.
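
My guess, and I could easily be wrong since I only have the traceback to
go on, is that old_sums holds the checksums read from the existing
1Checksums.md5 file, and the files I added or replaced since the last run
have no entry in it yet, hence the KeyError.  Something along these lines
might avoid the crash; it's only a sketch, since I don't know the rest of
the script, and old_name is just a name I made up for the result:

  # hypothetical guard; old_sums is assumed to map filename -> (hash, path),
  # as read back from the existing checksum file
  entry = old_sums.get(filename)
  if entry is not None:
      old_name = os.path.basename(entry[1])
  else:
      old_name = None  # new or replaced file, no previous checksum recorded

But you know the code, so you'd know better than me whether that's right.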

Also, what is the best way to handle this type of situation?  Let's say
I have a set of videos.  Later on I get a better set of videos, higher
resolution or something.  I copy those to a temporary directory, then use
your dmv script from a while back to replace the old files with new
files that have identical names.  Thing is, the files are different,
sometimes a lot different.  What is the best way to get it to update the
checksums for the changed files?  Is the command above correct?

I'm sometimes pretty good at finding software bugs.  But hey, it just
makes your software better.  ;-) 

Dale

:-)  :-) 
