https://bugs.kde.org/show_bug.cgi?id=438382

            Bug ID: 438382
           Summary: after baloo indexes some files more than once you
                    can't clean this up
           Product: frameworks-baloo
           Version: 5.82.0
          Platform: Fedora RPMs
                OS: Linux
            Status: REPORTED
          Severity: normal
          Priority: NOR
         Component: general
          Assignee: stefan.bru...@rwth-aachen.de
          Reporter: skierp...@gmail.com
                CC: baloo-bugs-n...@kde.org, n...@kde.org
  Target Milestone: ---

SUMMARY
Some baloo searches return the same file numerous times. Once this happens it
seems impossible to clean up.

STEPS TO REPRODUCE
1. Run `balooctl monitor`
2. Find a file that's in baloo's index multiple times. I came across some by
chance, then searched for a common term ("the") and counted repeated results to
find more. In a terminal, enter
`baloosearch the | sort | uniq -c | sort -nr | head -10`
3. Clear the file from Baloo with `balooctl clear /path/to/file`
4. Repeat the baloosearch
5. Make a backup of the file somewhere not indexed (e.g. /tmp) and delete the
file on-disk with `rm`
6. Repeat the baloosearch
7. Copy the backup back to the file location.
8. Repeat the baloosearch
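
The duplicate check from step 2 can also be scripted, which makes it easier to
repeat after steps 3, 5, and 7. This is only a minimal Python sketch: it assumes
`baloosearch` prints one matching path per line (as the `sort | uniq -c`
pipeline implies), and the function names `duplicate_counts` and `search_lines`
are made up for illustration.

```python
import subprocess
from collections import Counter


def duplicate_counts(lines):
    """Return (path, count) pairs for lines seen more than once, most frequent first."""
    counts = Counter(line.strip() for line in lines if line.strip())
    dupes = [(path, n) for path, n in counts.items() if n > 1]
    # Sort by count descending, then by path so the order is stable.
    return sorted(dupes, key=lambda item: (-item[1], item[0]))


def search_lines(term):
    """Run `baloosearch TERM` and return its output lines.

    Assumption: each result is printed as one path per line.
    """
    result = subprocess.run(
        ["baloosearch", term], capture_output=True, text=True, check=False
    )
    return result.stdout.splitlines()
```

Calling `duplicate_counts(search_lines("the"))[:10]` should roughly mirror the
shell pipeline in step 2.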

OBSERVED RESULT
Two files that I edit a lot in vim appear in baloosearch results 6 and 7 times
respectively. I also found a few other text files indexed twice, plus a
.xlsx spreadsheet that appears twice.
Running `balooctl clear /path/to/file` either does nothing or seems to remove
one instance of the file in baloosearch results. Baloo doesn't realize the file
is in its DB multiple times.
Deleting the file does not remove any results from baloosearch, and `balooctl
monitor` doesn't output anything.
Restoring the deleted file (I copied the backup back to it) adds another copy
of it to Baloo's index.

You can't run `balooctl clear /path/to/file.txt` if the file doesn't exist.

EXPECTED RESULT
Baloo should never return the same file multiple times.
Deleting a file on-disk should clear it from Baloo's index.
`balooctl clear` should remove every entry for the file in Baloo's index.
Maybe `balooctl clear` should work even if the file does not exist on-disk.


SOFTWARE/OS VERSIONS

KDE Plasma Version: 5.21.5
KDE Frameworks Version: 5.82.0
Qt Version: 5.15.2 on Wayland

ADDITIONAL INFORMATION
The files that appear in Baloo's index multiple times are all on a mounted NTFS
volume that I told Baloo to index, but the behavior that deleting a file
doesn't remove it from Baloo's index happens on an ext4 volume as well.

Once a file appears in Baloo search results more than once, I can make it
appear N+1 times by deleting it on-disk and copying a backup back into place,
but this doesn't work if the file only appears once.
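
One way to check the N+1 behavior is to compare the search output captured
before and after the delete-and-restore cycle. This is a rough Python sketch,
assuming each snapshot is simply the list of lines printed by `baloosearch`;
the helper name `entries_gained` is hypothetical.

```python
from collections import Counter


def entries_gained(before, after):
    """Map each path to how many more times it appears in `after` than in `before`."""
    before_counts = Counter(before)
    after_counts = Counter(after)
    # Counter subtraction keeps only positive differences.
    return dict(after_counts - before_counts)
```

For a file that was already indexed more than once, the result would be
expected to contain that path with a value of 1 after each cycle.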

The files that appear in Baloo's index multiple times for common words appear
fewer times for other words that I added to them more recently.

https://community.kde.org/Baloo mentions a `balooctl checkDb` command that
seems useful (and then cautions against running it), but balooctl no longer
offers this subcommand.

I didn't try rebuilding baloo's index. Despite these glitches baloo has been
working well for me 👍❤️
