https://bugs.kde.org/show_bug.cgi?id=420939

--- Comment #48 from Scott <shagooser...@gmail.com> ---
To try and understand better what is going on I decided to delete the index
file and start from scratch. I have also piped the terminal output to a
file to better assist me to do this. I will list the issues I come across
as I come across them.

1/ Baloo indexer terminated again but I was not able to regain the prompt
by pressing enter.
2/ I think I may have confused the issue of indexing with displaying
duration by using them interchangeably. I have now discovered that some
files are indexed but do not display a duration. The following file appears
in the indexed list but there is no duration displayed:
    Indexing: /mnt/pool/Entertainment/Documentaries/Movies/Battle of Jutland the Navys Bloodiest Day 2016.ts: Ok
    xdg-mime query filetype "Battle of Jutland the Navys Bloodiest Day 2016".ts = video/mp2t
    balooshow -x "Battle of Jutland the Navys Bloodiest Day 2016".ts = f8c500000031
    49 63685 Battle of Jutland the Navys Bloodiest Day 2016.ts: No index information found
3/ I then ran balooctl disable followed by balooctl enable and compared what
was indexed this time:
    First index: 5491 files indexed
    Second index: 5345 files indexed
    The two lists of indexed files overlap significantly but also differ
significantly, with no obvious pattern. File size seems unrelated: some
large files are indexed and many small files are not.
4/ The Jutland film was indexed on both occasions.
    stat "Battle of Jutland the Navys Bloodiest Day 2016".ts
    File: Battle of Jutland the Navys Bloodiest Day 2016.ts
    Size: 1299907200      Blocks: 2538896    IO Block: 4096   regular file
    Device: 31h/49d Inode: 70217       Links: 1
    Access: (0775/-rwxrwxr-x)  Uid: ( 1001/  shagoo)   Gid: ( 1002/shagadmin)
    Access: 2021-08-07 08:50:24.837378054 +0800
    Modify: 2016-06-03 11:55:55.180896791 +0800
    Change: 2021-08-02 20:01:39.015217906 +0800
5/ This seems to be your "red flag"
    baloosearch -i "Battle of Jutland the Navys Bloodiest Day 2016".ts
    f3f600000031 /mnt/pool/Entertainment/Documentaries/Movies/Battle of Jutland the Navys Bloodiest Day 2016.ts
    f8c500000031 /mnt/pool/Entertainment/Documentaries/Movies/Battle of Jutland the Navys Bloodiest Day 2016.ts
    Elapsed: 0.997359 msecs
6/ No duration was displayed for this file on the first indexing or the
second.
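
To make the IDs above easier to compare with the stat output, here is a
minimal Python sketch that splits a document ID into a device number and an
inode. It assumes that the low 32 bits of the ID hold the device number and
the high 32 bits hold the inode. That layout is only inferred from the
balooshow line above, where f8c500000031 is followed by "49 63685". The
function name decode_baloo_id is made up for illustration.

```python
def decode_baloo_id(hex_id: str) -> tuple[int, int]:
    """Split a Baloo document ID (hex string) into (device, inode).

    Assumption: low 32 bits = device number, high 32 bits = inode,
    as suggested by the balooshow -x output quoted in this message.
    """
    value = int(hex_id, 16)
    device = value & 0xFFFFFFFF  # low 32 bits
    inode = value >> 32          # remaining high bits
    return device, inode


if __name__ == "__main__":
    # The two IDs that baloosearch -i returned for the same path.
    for doc_id in ("f3f600000031", "f8c500000031"):
        device, inode = decode_baloo_id(doc_id)
        print(f"{doc_id}: device={device} inode={inode}")
```

If that layout assumption holds, both IDs decode to device 49 (31h in the
stat output), with inodes 62454 and 63685, while stat reports inode 70217
for the same file.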

These results were obtained consecutively without any other programs being
run or re-booting the computer.
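
For anyone wanting to repeat the comparison of the two runs, this is a rough
Python sketch of how the captured output could be compared. It assumes every
indexed file appears on one unwrapped line of the form "Indexing: <path>: Ok",
which is based only on the single example line shown above. The function
names and the log file names passed on the command line are hypothetical.

```python
import re
import sys

# Assumed log line shape, taken from the one example in this message:
#   "Indexing: <path>: Ok"
LINE_RE = re.compile(r"^\s*Indexing: (?P<path>.+): (?P<status>\w+)\s*$")


def indexed_paths(log_text):
    """Return the set of paths reported with status 'Ok' in a captured log."""
    paths = set()
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match and match.group("status") == "Ok":
            paths.add(match.group("path"))
    return paths


def compare_runs(first_log, second_log):
    """Split the indexed paths into common and run-specific sets."""
    first = indexed_paths(first_log)
    second = indexed_paths(second_log)
    return {
        "both": first & second,
        "only_first": first - second,
        "only_second": second - first,
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage (file names are placeholders): compare.py first.log second.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        first_text = fh.read()
    with open(sys.argv[2], encoding="utf-8", errors="replace") as fh:
        second_text = fh.read()
    for name, paths in compare_runs(first_text, second_text).items():
        print(f"{name}: {len(paths)} files")
```

Sets are used so that a path reported more than once within a run is counted
only once.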

On Fri, Aug 6, 2021 at 4:11 PM <bugzilla_nore...@kde.org> wrote:

> https://bugs.kde.org/show_bug.cgi?id=420939
>
> --- Comment #47 from tagwer...@innerjoin.org ---
> (In reply to Scott from comment #46)
>
> No problem, we carry on troubleshooting.
>
> > I think the problem is more than just misidentifying mime types.
> Finding out about the mimetypes and that baloo would never attempt to index
> some files was one step along the way. Good to find out but there's more
> to do.
>
> > 3/ Further it reports files waiting to be indexed and files failed to
> > index both being zero when in fact approximately 1,000 of the 6,000
> > files in the dataset have not been indexed. I have restarted baloo
> > repeatedly and they never get indexed, it re-indexes what it had before.
> It's possible that we've got another mimetype issue with these files, or
> they are your 1000 biggest files, or something else. I'd suggest copying
> one of them to your home directory and checking with
>
>     xdg-mime query filetype ...newstrangefile...
>
> Check that the mimetype is sensible, then see what
>
>     balooshow -x ...newstrangefile...
>
> says.
>
> > 1/ baloo terminates during indexing for unknown reasons (not
> > hanging/freezing as I erroneously stated previously) without providing a
> > reason code.
> I'll ask a bit more about this. Your "balooctl status" output says
>
> > Baloo File Indexer is running
> > Indexer state: Idle
> That's what baloo says when it's alive and thinks it has nothing more to
> do. There is the content indexer process "baloo_file_extractor" that is
> run when there is indexing necessary, does its job, saves the results,
> stops and is run again when there is more to do. This would/should happen
> in the background and you wouldn't see exit codes.
>
> > 2/ On restarting the indexing baloo re-indexes the same files with an
> > erroneous message that the files have changed (see my last email) or
> > added with baloo being turned off. Baloo is not checking that these
> > index entries already exist or there is some problem with the index file
> > itself and so just duplicates them which is why baloo reports over
> > 21,000 files indexed from a dataset only containing 6,000 entries.
> The error message is:
>
> > ... id seems to have changed. Perhaps baloo was not running, and this
> > file was deleted + re-created
> We need to check the Id and see if it is really changing. Ask with "stat",
> you'll get something like:
>
>     $ stat 1.ts
>       File: 1.ts
>       Size: 41416704        Blocks: 80896      IO Block: 4096   regular file
>     Device: fc01h/64513d    Inode: 794964      Links: 1
>     Access: (0664/-rw-rw-r--)  Uid: ( 1000/    test)   Gid: ( 1000/    test)
>     Access: 2021-07-24 22:50:57.838161084 +0200
>     Modify: 2021-07-24 22:50:57.838161084 +0200
>     Change: 2021-07-24 22:51:42.686181710 +0200
>     Birth: -
>
> It's the "Device" and "Inode" numbers that you need to keep your eye on.
> The:
>
>     Device: fc01h/64513d    Inode: 794964
>
> If you reboot and these change, baloo will think it's got a new file and
> try to index it again. Keep a note of the numbers, check again after a
> reboot and compare.
>
> You could also try a baloosearch for one of the files that always seems to
> be reindexed
>
>     $ baloosearch -i ...oneofyoursavedfiles...
>
> If you are OK, baloosearch will give a single result; if the id has been
> changing, "baloosearch -i" would show several lines, with different ID
> numbers and the same file/pathname. Something like:
>
>     $ baloosearch -i testfile
>     9ca00000028 /home/test/testfile
>     9ca0000002a /home/test/testfile
>     9ca0000002c /home/test/testfile
>
> That would be a red flag...
>
> > I had to disable baloo because it somehow seriously interferes with my
> > ability to move files from the admin PC to the server. With baloo running
> > on the server any attempt to transfer files to it results in very slow
> > transfer speeds and on occasion failure to complete the move and this is
> > occurring while the indexer is reporting idle.
> I can only guess where - but you are indexing *really* large files, and
> there were a couple of fixes two months ago to stop a MIME lookup reading
> the whole file into memory. Bug 398908, fixed according to
>    https://bugs.kde.org/show_bug.cgi?id=398908#c97
> with 5.83. If you don't have this version, maybe the best thing to do is
> wait until it gets to you with an update.
>
> --
> You are receiving this mail because:
> You reported the bug.
