There's approximately a 100% chance that you are going to go through a 
server-side language (PHP, Ruby, Perl, Java, VB/ASP/.NET [cough, cough]) before 
you get to Solr/Lucene. I'd recommend it anyway.

This code should look at the user's browser locale (en_US, pl_PL, es_CO, 
etc.). The server-side language would then choose which language to search by 
and display.
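
As a rough illustration of that locale check, here is a minimal sketch in 
Python (the same idea carries over to Rails or any other server-side stack). 
It assumes the browser locale arrives in the HTTP Accept-Language header, and 
it reuses the six languages and the question_xx / answer_xx field naming from 
the original post. The names pick_language, search_fields, SUPPORTED and 
DEFAULT are made up for this example and are not part of Solr or of any 
plugin.

```python
# Hypothetical helper names; nothing here is a Solr or Rails API.
SUPPORTED = ("en", "fr", "de", "ru", "uk", "pl")  # the six languages from the question
DEFAULT = "en"  # assumption: fall back to English when nothing matches


def pick_language(accept_language, supported=SUPPORTED, default=DEFAULT):
    """Return the best supported two-letter code for an Accept-Language value."""
    candidates = []
    for position, part in enumerate((accept_language or "").split(",")):
        pieces = part.strip().split(";")
        tag = pieces[0].strip().lower()
        if not tag or tag == "*":
            continue
        quality = 1.0  # a missing q parameter means full preference
        for param in pieces[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat an unparsable q as "not wanted"
        # "pl-PL" and "pl_PL" both reduce to "pl"
        language = tag.replace("_", "-").split("-")[0]
        # Sort key: highest quality first, then order of appearance
        candidates.append((-quality, position, language))
    for negative_quality, _, language in sorted(candidates):
        if negative_quality < 0 and language in supported:
            return language
    return default


def search_fields(language):
    """Map a language code to the per-language field names from the question."""
    return {"question": "question_" + language, "answer": "answer_" + language}


if __name__ == "__main__":
    lang = pick_language("pl-PL,pl;q=0.9,en;q=0.8")
    print(lang, search_fields(lang))
```

The server-side code would then query only the fields returned by 
search_fields for that request, and could still let the user override the 
detected language explicitly.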

NOW, that being said, are you going to have the exact same content for all 
languages, just translated? The temptation would be to translate to a common 
language like English, then do the search, then translate the results back. I 
wouldn't recommend it, but I'm no expert. Translation of single words can be 
OK, but multiword ideas and especially sentences don't work so well that way.

You will probably have separate content for that reason, AND another. Different 
cultures are interested in different things and only have common ground on 
certain things like international news (but with different opinions) and 
medical news. So: different content for different cultures speaking different 
languages.

Are you trying to address different languages in some place like the US or 
Great Britain, with LOTS of different languages spoken in minority cultures? 
Only then would you want a geographically centered server and information 
gathering organization. If you were going to have search for other countries, 
then I'd recommend those resources be geographically close to their source 
culture.

Dennis Gearon

Signature Warning
----------------
It is always a good idea to learn from your own mistakes. It is usually a 
better idea to learn from others’ mistakes, so you do not have to make them 
yourself. from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'

EARTH has a Right To Life,
  otherwise we all die.


--- On Wed, 10/20/10, Jakub Godawa <jakub.god...@gmail.com> wrote:

> From: Jakub Godawa <jakub.god...@gmail.com>
> Subject: Step by step tutorial for multi-language indexing and search
> To: solr-user@lucene.apache.org
> Date: Wednesday, October 20, 2010, 6:03 AM
> Hi everyone! (my first post)
> 
> I am new, but really curious about the usefulness of lucene/solr in document
> search from web applications. I use Ruby on Rails to create one, with the
> plugin "acts_as_solr_reloaded", which makes the connection between the web
> app and solr easy.
> 
> So I am at a point where I know that a good solution is to prepare
> multi-language documents with fields like:
> question_en, answer_en,
> question_fr, answer_fr,
> question_pl, answer_pl... etc.
> 
> I need to create an index that would work with 6 languages: English, French,
> German, Russian, Ukrainian and Polish.
> 
> My questions are:
> 1. Is it doable to have just one search field that behaves like Google's for
> all those documents? It can be an option to indicate a language to search.
> 2. How should I begin changing the solr/conf/schema.xml (or other) file to
> tailor it to my needs? As I am a real rookie here, I am still a bit confused
> about "fields", "fieldTypes" and their connection with a particular field
> (e.g. answer_fr) and the "tokenizers" and "analyzers". If someone can provide
> a basic step by step tutorial on how to make it work in two languages I would
> be more than happy.
> 3. Are all those languages supported (officially/unofficially) by
> lucene/solr?
> 
> Thank you for help,
> Jakub Godawa.
>