Hi guys, I'm running into some problems with accented (UTF-8) languages. I'd love to hear some ideas about how to use Solr with those languages. Basically, I want to achieve what Google does with accented text.

My requirements include:

1) Accent-insensitive search and proper highlighting. For example, we have 2 documents:

   Doc A (title:Lập Trình Viên)
   Doc B (title:Lap Trinh Vien)

   If the user enters "Lập Trình Viên", then Doc B is also matched and "Lập Trình Viên" is highlighted. On the other hand, if the query is "Lap Trinh Vien", Doc A is also matched.

2) Assign proper scores to accented and non-accented searches. If the user enters "Lập Trình Viên", then Doc A should be given a higher score than Doc B. If the query is "Lap Trinh Vien", Doc B should be given a higher score.

Any ideas, guys? Thanks in advance!

--
Regards,
Cuong Hoang
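
P.S. To make the expected behaviour concrete, here is a rough Python sketch of the matching and ranking I have in mind. It is not Solr code, only a toy model: the names fold, score and rank are made up for illustration, it compares whole titles rather than individual terms, it ignores highlighting, and the 2.0/1.0 scores are arbitrary values that only encode "an exact accent match beats a folded match".

```python
import unicodedata


def fold(text):
    """Lowercase the text and strip combining accent marks.

    Vietnamese "đ" has no Unicode decomposition, so it is mapped to "d" by hand.
    """
    lowered = text.lower().replace("đ", "d")
    decomposed = unicodedata.normalize("NFD", lowered)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def score(query, title):
    """Return 0.0 for no match, 1.0 for an accent-folded match, 2.0 for an exact match."""
    if fold(query) != fold(title):
        return 0.0
    # Compare in NFC so precomposed and decomposed input are treated alike.
    exact = (unicodedata.normalize("NFC", query.lower())
             == unicodedata.normalize("NFC", title.lower()))
    return 2.0 if exact else 1.0


def rank(query, docs):
    """Return the ids of matching documents, best score first."""
    scored = [(doc_id, score(query, title)) for doc_id, title in docs.items()]
    matches = [(doc_id, s) for doc_id, s in scored if s > 0]
    return [doc_id for doc_id, _ in sorted(matches, key=lambda pair: -pair[1])]


docs = {"A": "Lập Trình Viên", "B": "Lap Trinh Vien"}
print(rank("Lập Trình Viên", docs))
print(rank("Lap Trinh Vien", docs))
```

With the two example documents, both queries should return both documents, with the one whose accents match the query exactly listed first. What I am looking for is the Solr way to get the same effect.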