https://bugs.kde.org/show_bug.cgi?id=410680

--- Comment #8 from skierpage <skierp...@gmail.com> ---
(In reply to tagwerk19 from comment #6)

> I feel I don't always see all of baloo's error messages but the trick of
> running "balooctl purge" means that they start appearing on screen. I then
> get:
> 
>     Invalid encoding. Ignoring "/home/test/stadyn_largepagewithimages.html"

I think that depends on whether you see `qDebug()` output, not on restarting
balooctl.

I filed bug 440537 requesting that KFileMetadata's plaintext extractor handle
other character encodings.

> Ideally this file would be flagged "failed to index"
Great idea, please file a bug. What actually happens is that every line of the
text file up to the one with the invalid character is indexed, so it's more
"Incompletely indexed" (which is even more frustrating!).
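
To illustrate the behavior I mean, here is a rough Python sketch. It is not
KFileMetadata's actual code; the function name `index_until_invalid` and the
line-by-line UTF-8 decoding are assumptions made purely for illustration.

```python
def index_until_invalid(data: bytes) -> list[str]:
    """Return the lines that decode as UTF-8, stopping at the first bad one."""
    indexed = []
    for raw_line in data.split(b"\n"):
        try:
            indexed.append(raw_line.decode("utf-8"))
        except UnicodeDecodeError:
            # Hypothetical stand-in for the "Invalid encoding. Ignoring ..."
            # case: lines before this one stay indexed, the rest are dropped.
            break
    return indexed


# 0xFF can never appear in valid UTF-8, so the third line fails to decode.
sample = b"alpha\nbeta\n\xff broken\ngamma\n"
print(index_until_invalid(sample))
```

In this sketch "alpha" and "beta" end up indexed while "gamma" is silently
lost, which is why the file looks indexed but later terms can't be found.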

(In reply to tagwerk19 from comment #7)
> No problems up to about 10MB.
> Above that, and it seems something of a "rough" number, the files are not
> indexed.

Yup, that is an undocumented size limit for text files in Baloo's file
processing. I added a new section,
https://community.kde.org/Baloo#Indexing_limitations, with what I've learned.
I still haven't figured out what goes wrong when indexing terms far down in
large (but under 10MB) UTF-8 files.
