On 3 Jul 2009, at 07:43, Michael Lackhoff wrote:
On 03.07.2009 00:49 Paul Libbrecht wrote:

[I'll try to address the other responses as well]

I believe the proper way is for the server to compute a list of accepted languages in order of preference: the web-platform language (e.g. the user setting), then the values in the Accept-Language HTTP header (which come from the browser or platform).

All this is not going to help much, because the main application is a scientific search portal for books and articles, with many users searching across languages. The most typical use case is a German user searching multilingually. So even the search itself may be multilingual, e.g. TITLE:cancer OR TITLE:krebs. There is no way here to watch for Accept headers or a language select field (which would be left on "any" in most cases). Other popular use cases are citations (in whatever language) cut and pasted into the search field.
The algorithm I described does take all this into account: the ambiguity of the query's language. You have no other way to offer any form of stemming in each language (e.g. removing -ing and removing -ung) than to actually do this. Is it because you use Solr directly that languages can't be passed around?
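To make the "list of accepted languages in order of preference" concrete, here is a minimal Python sketch. It assumes the user setting comes first, followed by the Accept-Language values sorted by their q weight. The function names `parse_accept_language` and `preferred_languages` are made up for illustration and are not taken from any existing code.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # a tag without a q parameter has full weight
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as lowest priority
        # Sort by descending quality, then by position in the header.
        weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]


def preferred_languages(user_setting, accept_language_header):
    """Ordered, de-duplicated primary languages: user setting first,
    then the browser's Accept-Language values (assumed ordering)."""
    candidates = [user_setting] if user_setting else []
    candidates += parse_accept_language(accept_language_header or "")
    ordered = []
    for tag in candidates:
        primary = tag.lower().split("-")[0]  # "en-US" -> "en"
        if primary != "*" and primary not in ordered:
            ordered.append(primary)
    return ordered
```

For example, `preferred_languages("de", "en-US,en;q=0.8,fr;q=0.5")` gives `["de", "en", "fr"]`; with no user setting and no header the list is simply empty, which corresponds to the "any" case.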
You need a server part to get the headers, indeed. Oh, and yes, by the way, you have to double everything I described in order to prefer matches in the title. We've implemented something that might be close to what you're searching for: i2geo search, which comes much closer to solving the cross-lingual problem by designating entities in the request.
It's under APL. Try searching for, say, Viereck in the search box. See a short description at:
http://i2geo.net/xwiki/bin/view/About/GeoSkills
I think the best would be to process the data according to its language but not make any assumptions about the query language, and I am totally lost as to how to get a clever schema.xml out of all this.
Just OR them properly. Storing different languages in different fields (title-de, title-en) is, I think, the right way to get schema.xml properly configured, with an analyzer for each.
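A rough Python sketch of what "OR them properly" could look like when building the query string. The field names title-de and title-en are the ones mentioned above. The helper `build_multilingual_query` is hypothetical, and whether hyphenated field names need special handling in your query parser is something to check against your own setup.

```python
def build_multilingual_query(terms, languages, base_field="title"):
    """OR every term across one field per language (hypothetical helper)."""
    clauses = []
    for term in terms:
        # Escape backslashes and double quotes so the term stays one phrase.
        safe = term.replace("\\", "\\\\").replace('"', '\\"')
        per_field = [f'{base_field}-{lang}:"{safe}"' for lang in languages]
        clauses.append("(" + " OR ".join(per_field) + ")")
    return " OR ".join(clauses)


if __name__ == "__main__":
    # No assumption is made about the language of either term: each one is
    # sent to every language field and analyzed by that field's analyzer.
    print(build_multilingual_query(["cancer", "krebs"], ["de", "en"]))
```

Each term is tried against every language field, so the German field stems it the German way and the English field the English way, without guessing the query language.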
paul