To follow up and leave some pointers for people stumbling over this: I 
found a library that covers almost all of the requirements above, based on 
mature technologies (Python and SQLite), at 
https://github.com/Vafilor/file-ops.
One remaining question is how best to derive the hash of a directory from 
the hashes of its contents (that's the Merkle tree idea, right?). Is something 
naive like XORing the content hashes valid, or does one need a more complex 
combining operation, such as hashing the concatenated hashes?
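
To make the question concrete, here is a minimal Python sketch of both options. The function names `xor_combine` and `dir_hash` are made up for illustration, and SHA-256 with a length-prefixed entry name is my own assumption, not something file-ops or Perkeep prescribes. As far as I can tell, XOR has at least one obvious weakness for this use case: any two identical child hashes cancel out, so a directory holding two copies of the same file gets the same hash as an empty one. Hashing the sorted (name, hash) entries does not have that problem.

```python
import hashlib

DIGEST_SIZE = hashlib.sha256().digest_size  # 32 bytes


def xor_combine(child_digests):
    """Naive option: XOR all child digests together (order-independent)."""
    out = bytes(DIGEST_SIZE)
    for digest in child_digests:
        out = bytes(a ^ b for a, b in zip(out, digest))
    return out


def dir_hash(entries):
    """Merkle-style option: hash the sorted (name, child digest) entries.

    `entries` maps an entry name to the 32-byte digest of that file or
    subdirectory. Sorting by name makes the result independent of listing
    order; the length prefix keeps name and digest boundaries unambiguous.
    """
    h = hashlib.sha256()
    for name, digest in sorted(entries.items()):
        encoded = name.encode("utf-8")
        h.update(len(encoded).to_bytes(8, "big"))
        h.update(encoded)
        h.update(digest)
    return h.digest()


# Two files with identical content, i.e. identical content hashes.
same = hashlib.sha256(b"same bytes").digest()

# XOR: the duplicates cancel, giving the all-zero "empty directory" value.
xor_of_duplicates = xor_combine([same, same])

# Merkle-style: the directory with duplicates differs from the empty one.
merkle_of_duplicates = dir_hash({"a.jpg": same, "b.jpg": same})
merkle_of_empty = dir_hash({})
```

Including the names means a rename changes the directory hash. If only content should count, one could instead hash the sorted list of child digests, duplicates included.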
Best and good luck with the future of the project.
Diemo
On Monday, May 10, 2021 at 1:49:20 AM UTC+2 Diemo Schwarz wrote:

> Hi, please excuse this very basic beginner's question:
>
> *Could Perkeep be of use as a local media file indexing system (i.e. the 
> file data itself is not copied into perkeep but stays in the files on 
> disk)?*
> Use Case 
>    
>    - indexing of a (local) library of images, audio files, videos and 
>    their backups on several hard disks
>    - discovery of duplicate files
>    - verification that each file is present on at least 3 different 
>    backup hard disks
>    - hard disks can be unmounted, but their index should stay queryable
>    - media-specific metadata (image resolution, encoding, bit rate, etc.) 
>    should be queryable directly from the index
>
> Extra Points 
>    
>    1. if a media file could be represented as a block tree which nicely 
>    separates sub-blocks containing only metadata (e.g. EXIF for images) and 
>    raw content data. (This way, duplicate images with different metadata 
>    could be detected)
>    2. if relationships between media files could be represented, i.e. if 
>    a photo is an edited version or a thumbnail from an original (of course, 
>    the info itself would come from the outside)
>    3. the immediate need is that data is (and stays securely) local, but 
>    later it could be nice to easily share some media files with specific users
>
> If this has been asked or treated before, please excuse my ignorance and 
> point me towards relevant sources.
>
> Best, Diemo
>

-- 
You received this message because you are subscribed to the Google Groups 
"Perkeep" group.