[frameworks-baloo] [Bug 362647] Can't search with Chinese characters

Stefan Brüns Tue, 20 Mar 2018 13:35:05 -0700

https://bugs.kde.org/show_bug.cgi?id=362647


Stefan Brüns <stefan.bru...@rwth-aachen.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |stefan.bruens@rwth-aachen.de

--- Comment #7 from Stefan Brüns <stefan.bru...@rwth-aachen.de> ---
I think a good start would be to create a database of test cases, so that even
a developer who is not proficient in a specific script can test and improve
the coverage.

One possible format could be:

# description of the testcase
! filename_1234.png
+ filename ;; match filename
+ png ;; match png
+ 1234 ;; match 1234
- file ;; do not match file
+ file* ;; match with wildcard

# chinese testcase 1: match 測試 ("test")
! 測試testcase.txt
+ 測試
+ testcase
+ txt
;- 試測 ;; expected to fail, requires dictionary
+ 測試 txt
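
A minimal sketch of a reader for such a format, in Python. The interpretation
of the markers is an assumption drawn from the examples above: "#" starts a
test case, "!" names the file, "+" is a query that must match, "-" is a query
that must not match, ";;" starts a trailing comment, and a line starting with
";" is disabled. The names SearchCase and parse_testcases are hypothetical and
not part of Baloo.

```python
from dataclasses import dataclass, field


@dataclass
class SearchCase:
    """One test case: a file name plus the queries checked against it."""
    description: str
    filename: str = ""
    must_match: list = field(default_factory=list)
    must_not_match: list = field(default_factory=list)


def parse_testcases(text):
    """Parse the proposed format into a list of SearchCase objects."""
    cases = []
    current = None
    for raw in text.splitlines():
        # Drop the trailing ";; comment" part, then surrounding whitespace.
        line = raw.split(";;", 1)[0].strip()
        # Skip blank lines and lines disabled with a leading ";".
        if not line or line.startswith(";"):
            continue
        marker, _, rest = line.partition(" ")
        rest = rest.strip()
        if marker == "#":
            current = SearchCase(description=rest)
            cases.append(current)
        elif current is None:
            raise ValueError("entry before first '#' line: " + raw)
        elif marker == "!":
            current.filename = rest
        elif marker == "+":
            current.must_match.append(rest)
        elif marker == "-":
            current.must_not_match.append(rest)
        else:
            raise ValueError("unknown marker in line: " + raw)
    return cases
```

A test runner would then index each filename and check that every must_match
query returns it and every must_not_match query does not.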

Test cases should probably be split into one file per language (or language
combination).

Correct handling of languages like Chinese, where words are not separated by
spaces, is hard and, as far as I understand, requires a dictionary. The best
one can do currently is to split at the grapheme level. This would likely
create a lot of false positives when searching, but false positives are IMHO
better than no results at all.
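
A rough sketch of what grapheme-level splitting could look like, in Python.
This is not Baloo's tokenizer. It assumes that one Han code point equals one
grapheme after NFC normalization, that only the CJK Unified Ideographs block
and Extension A count as Han, and that a query matches when all of its terms
occur in the file name in any order. The names is_han, split_terms and matches
are hypothetical. Proper grapheme segmentation is defined by Unicode UAX #29.

```python
import unicodedata


def is_han(ch):
    # Assumption: only CJK Unified Ideographs and Extension A are covered.
    cp = ord(ch)
    return 0x4E00 <= cp <= 0x9FFF or 0x3400 <= cp <= 0x4DBF


def split_terms(text):
    """Split text into terms: one term per Han character, else per word."""
    terms = []
    word = ""
    for ch in unicodedata.normalize("NFC", text):
        if is_han(ch):
            # Flush any pending word, then emit the character on its own.
            if word:
                terms.append(word)
                word = ""
            terms.append(ch)
        elif ch.isalnum() or (word and unicodedata.category(ch).startswith("M")):
            # Letters, digits, and combining marks following them stay together.
            word += ch.lower()
        else:
            # Anything else (space, "_", ".") ends the current word.
            if word:
                terms.append(word)
                word = ""
    if word:
        terms.append(word)
    return terms


def matches(query, filename):
    """True if every term of the query occurs among the file name's terms."""
    indexed = set(split_terms(filename))
    return all(term in indexed for term in split_terms(query))
```

With this sketch, "測試" and "測試 txt" both match 測試testcase.txt, but so does
the reversed "試測", because both queries reduce to the same two single
character terms. That is the kind of false positive that only a dictionary
based splitter could avoid.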

