https://bugs.kde.org/show_bug.cgi?id=432717
Bug ID: 432717
Summary: Baloo scans content from too many files
Product: frameworks-baloo
Version: unspecified
Platform: Other
OS: Linux
Status: REPORTED
Severity: normal
Priority: NOR
Component: Baloo File Daemon
Assignee: stefan.bru...@rwth-aachen.de
Reporter: c...@palacio.io
Target Milestone: ---

I have disabled file content indexing because it not only takes a heavy toll on disk I/O on my system, but also scans and indexes the content of useless program data files.

I have a few Wine prefixes in plain view in unhidden folders in my home directory, so quite a lot of data files are accessible to Baloo with the default configuration.

I have caught Baloo scanning and indexing keywords from a Daz Studio data file. For example:

$ balooshow -x "/home/user/Wine/Daz/drive_c/users/Public/Documents/My DAZ 3D Library/data/DAZ 3D/Genesis 8/Male/Morphs/DAZ 3D/Base Pose Head/alias_head_eCTRLEyelidsUpperUp-DownL.dsf"
425600051801229316 2052 99092734 /home/user/Wine/Daz/drive_c/users/Public/Documents/My DAZ 3D Library/data/DAZ 3D/Genesis 8/Male/Morphs/DAZ 3D/Base Pose Head/alias_head_eCTRLEyelidsUpperUp-DownL.dsf
    Mtime: 1503348208 2017-08-21T15:43:28
    Ctime: 1567044300 2019-08-28T21:05:00
    Cached properties:
        Line Count: 44

    Internal Info
    Terms: 0.2784314 0.3254902 0.3764706 0.6.0.0 06 1 1.0 2017 203d 208 20head 20pose 21t23 27 34z 3d Mplain Mtext T5 T8 X20-44 alias asset author base channel colors com contributor controls data daz daz3d description down downl dsf ectrleyelidsupperup email eyelids eyes file genesis genesis8male group head http icon id info label large left library male modified modifier modifiers morphs name parent pose presentation revision scene support target type up upper url value version website www
    File Name Terms: Falias Fdownl Fdsf Fectrleyelidsupperup Fhead
    XAttr Terms:
    lineCount: 44

I can't imagine the amount of program data it might have indexed from my home folder. In my opinion, Baloo should restrict itself to a very limited selection of files to extract keywords from.
Bug #358098 is related to this issue, and I strongly disagree with it. Sure, scanning more files might interest a few people, but it is a potentially harmful default for most users. Unknown data should be skipped, and source code should be skipped. There should be a simpler default. An extension blacklist isn't the appropriate solution; a whitelist is.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: Debian unstable
KDE Plasma Version: 5.78.0
KDE Frameworks Version: 5.20.5
Qt Version: 5.15.2
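
To illustrate the blacklist versus whitelist point above, here is a minimal Python sketch. It is not Baloo's code and does not reflect how Baloo actually selects files. The two extension sets and both function names are hypothetical, chosen only to show why an unknown format such as .dsf passes a blacklist but is skipped by a whitelist.

```python
from pathlib import PurePosixPath

# Hypothetical whitelist: the only extensions whose content would be indexed.
CONTENT_WHITELIST = {".txt", ".md", ".odt", ".pdf"}

# Hypothetical blacklist: extensions explicitly excluded from content indexing.
CONTENT_BLACKLIST = {".o", ".pyc", ".tmp"}


def indexed_with_whitelist(path: str) -> bool:
    """Index content only if the extension is explicitly allowed."""
    return PurePosixPath(path).suffix.lower() in CONTENT_WHITELIST


def indexed_with_blacklist(path: str) -> bool:
    """Index content unless the extension is explicitly excluded."""
    return PurePosixPath(path).suffix.lower() not in CONTENT_BLACKLIST


# The Daz Studio data file from the report above.
dsf = ("/home/user/Wine/Daz/drive_c/users/Public/Documents/"
       "My DAZ 3D Library/data/DAZ 3D/Genesis 8/Male/Morphs/DAZ 3D/"
       "Base Pose Head/alias_head_eCTRLEyelidsUpperUp-DownL.dsf")

print(indexed_with_blacklist(dsf))  # the blacklist does not know .dsf
print(indexed_with_whitelist(dsf))  # the whitelist does not list .dsf
```

With a blacklist, every format nobody thought to exclude is indexed by default; with a whitelist, unknown formats are skipped until someone adds them.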