https://bugs.kde.org/show_bug.cgi?id=416630

            Bug ID: 416630
           Summary: Use N nearest neighbor search
           Product: digikam
           Version: 7.0.0
          Platform: Other
                OS: Linux
            Status: REPORTED
          Severity: normal
          Priority: NOR
         Component: Faces-Recognition
          Assignee: digikam-bugs-n...@kde.org
          Reporter: v...@tym.im
  Target Milestone: ---

Currently, a recognition match is computed as the average over all examples of
a given person. This approach does not work well for people with a large number
of examples taken at different ages, hair colors, or angles. Worse, adding more
examples usually degrades matching, because the distant examples always
outnumber the near ones.

I tried plain nearest neighbor, and it is pretty noisy. What works best for me
is to take the average of the N nearest examples (I tried 5 and 10). It
eliminates the noise while still finding great matches.
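
A minimal sketch of the N-nearest averaging described above, in plain Python.
It is not digiKam code: the function names, the (person, vector) layout of the
example database, and the use of cosine similarity as the score are
assumptions made for illustration only.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_n_mean_scores(query, examples, n=5):
    """Score each person by the mean of their n most similar examples.

    examples is an iterable of (person, vector) pairs (hypothetical layout).
    """
    per_person = defaultdict(list)
    for person, vector in examples:
        per_person[person].append(cosine_similarity(query, vector))
    scores = {}
    for person, sims in per_person.items():
        best = sorted(sims, reverse=True)[:n]  # keep only the n nearest
        scores[person] = sum(best) / len(best)
    return scores


def recognize(query, examples, n=5, threshold=0.7):
    """Return the best-scoring person, or None if the score is below threshold."""
    scores = top_n_mean_scores(query, examples, n)
    if not scores:
        return None
    person = max(scores, key=scores.get)
    return person if scores[person] >= threshold else None
```

With n at least as large as a person's example count, this degenerates to the
current all-examples average, so the two approaches can be compared by varying
n alone.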

Another change that gave me good results is using adjusted cosine distance
instead of the regular one: each feature is centered by subtracting its mean
across the whole example database. E.g. if a feature's mean is high (e.g. 0.6),
it has little effect on the cosine, as almost all vectors point in the
"positive" direction, while with the adjusted value (n-0.6) the vectors point
in different directions and provide meaningful input. It reduces overall
similarity (I have to use a threshold of 0.7 instead of 0.8-0.9 to find
examples), but general quality seems better.
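
The adjustment can be sketched as follows, again in plain Python with
hypothetical function names: the per-feature mean is computed once over the
whole example database and subtracted from both vectors before the usual
cosine is taken.

```python
import math


def cosine(a, b):
    """Regular cosine similarity (0.0 if either vector is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def feature_means(vectors):
    """Mean of every feature (column) across all example vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def adjusted_cosine(a, b, means):
    """Cosine similarity after centering each feature on its database mean."""
    centered_a = [x - m for x, m in zip(a, means)]
    centered_b = [y - m for y, m in zip(b, means)]
    return cosine(centered_a, centered_b)


# Toy database: the first feature is always positive with mean 0.6.
database = [[0.6, 0.5], [0.7, 0.1], [0.5, 0.9]]
means = feature_means(database)
plain = cosine(database[1], database[2])
adjusted = adjusted_cosine(database[1], database[2], means)
```

In this toy data the regular cosine of the last two vectors is clearly
positive, while the adjusted one is close to -1, which is the loss of overall
similarity (and gain in contrast) described above.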

Note that I have not done a formal accuracy check for either change yet. I
tried to do a quick one, but the "accuracy" that libraries report is low and
rarely applicable, as it does not take the acceptance threshold into account
(drop every match scoring less than T).
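
One possible way to make the check threshold-aware, sketched under the
assumption that each test face yields a (true person, predicted person, score)
triple: report precision over the accepted matches only, together with the
fraction of faces that were accepted. The function name and data layout are
hypothetical.

```python
def thresholded_metrics(predictions, threshold):
    """Return (precision, coverage) for matches scoring at least threshold.

    predictions is a list of (true_label, predicted_label, score) triples.
    precision: share of accepted matches whose predicted label is correct.
    coverage:  share of all predictions that were accepted.
    """
    accepted = [(true, pred) for true, pred, score in predictions
                if score >= threshold]
    coverage = len(accepted) / len(predictions) if predictions else 0.0
    correct = sum(1 for true, pred in accepted if true == pred)
    precision = correct / len(accepted) if accepted else 0.0
    return precision, coverage
```

Sweeping the threshold and comparing the resulting precision/coverage pairs
would allow the plain average, the N-nearest average, and the adjusted cosine
to be compared on equal terms.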

-- 
You are receiving this mail because:
You are watching all bug changes.
