On 3 Jul 2009, at 07:43, Michael Lackhoff wrote:

On 03.07.2009 00:49 Paul Libbrecht wrote:

[I'll try to address the other responses as well]

I believe the proper way is for the server to compute a list of
accepted languages in order of preference, built from the web-platform
language (e.g. the user setting) and the values in the Accept-Language
HTTP header (which come from the browser or platform).
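
A minimal sketch of how such a preference list could be computed, in Python. The function name accepted_languages and its two parameters are hypothetical, as is the choice to put the user setting first and to reduce tags such as en-US to their primary subtag; nothing here comes from an existing code base.

```python
def accepted_languages(user_setting, accept_language_header):
    """Return language codes in order of preference.

    Assumption: the platform's user setting (if any) wins, followed by
    the Accept-Language entries sorted by their q-value.
    """
    weighted = []
    for position, part in enumerate(accept_language_header.split(",")):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        quality = 1.0  # a missing q parameter means q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat an unparsable q-value as "not accepted"
        primary = tag.strip().lower().split("-")[0]  # "en-US" -> "en"
        if primary and primary != "*" and quality > 0:
            # Negative quality sorts best first; position keeps header order on ties.
            weighted.append((-quality, position, primary))

    ordered = []
    if user_setting:
        ordered.append(user_setting.lower())
    for _, _, lang in sorted(weighted):
        if lang not in ordered:
            ordered.append(lang)
    return ordered


if __name__ == "__main__":
    print(accepted_languages("de", "en-US,en;q=0.9,fr;q=0.8,de;q=0.7"))
```

The resulting list can then decide which per-language fields are queried and in which order they are preferred.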

All this is not going to help much because the main application is a
scientific search portal for books and articles, with many users
searching across languages. The most typical use case is a German user
searching in several languages, so a single query may itself be
multilingual, e.g. TITLE:cancer OR TITLE:krebs. There is no way here to
rely on Accept headers or a language select field (it would be left on
"any" in most cases). Another popular use case is citations (in
whatever language) cut and pasted into the search field.

The algorithm I described does take all this into account: the ambiguity of the query's language. The only way to offer any form of stemming in each language (e.g. removing -ing and removing -ung) is to actually do this for each language. Is it because you use Solr directly that languages can't be passed around?
You do need a server part to get the headers, indeed.
Oh, and yes, you have to duplicate everything I described in order to prefer matches in the title, btw.

We've implemented something that might be close to what you're looking for: i2geo search, which gets much closer to the cross-lingual problem by having the request designate entities.
It's under APL.

Try searching for, say, Viereck in the search box. There is a short description at:
  http://i2geo.net/xwiki/bin/view/About/GeoSkills

I think the best approach would be to process the data according to its language but make no assumptions about the query language, and I am totally
lost as to how to get a clever schema.xml out of all this.

Just OR them properly.
Storing different languages in different fields (title-de, title-en) is, I think, the right way to get schema.xml properly configured, with each field given its own analyzer.
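
To make "just OR them" concrete, here is a rough Python sketch that builds a Lucene-style query string over per-language fields. The helper names, the text_<lang> body fields, the boost value and the choice to AND separate terms together are all assumptions for illustration. The field names use underscores (title_de rather than title-de) only because hyphens can be awkward in query syntax; adapt them to whatever the schema really declares.

```python
def escape_term(term):
    """Backslash-escape characters that are special in Lucene query syntax."""
    special = set('+-&|!(){}[]^"~*?:\\/')
    return "".join("\\" + ch if ch in special else ch for ch in term)


def multilingual_query(terms, languages, title_boost=2.0):
    """OR each term across every language field, boosting title matches.

    Assumed fields: title_<lang> and text_<lang>, each analyzed (stemmed)
    for its own language in schema.xml.
    """
    clauses = []
    for term in terms:
        safe = escape_term(term)
        per_lang = []
        for lang in languages:
            # The title clause doubles the body clause, with a boost.
            per_lang.append(f"title_{lang}:{safe}^{title_boost}")
            per_lang.append(f"text_{lang}:{safe}")
        clauses.append("(" + " OR ".join(per_lang) + ")")
    # Assumption: every term must match in at least one language field.
    return " AND ".join(clauses)


if __name__ == "__main__":
    print(multilingual_query(["krebs"], ["de", "en"]))
```

Because every language field sees the same raw term, each analyzer applies its own stemming and no guess about the query language is needed.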

paul
