Re: Multi-language indexing and searching

2007-06-21 Thread Daniel Alheiros
Hi Hoss. I tried that yesterday using the same approach you just described (I've created the base fields for any language with basic analyzers) and it worked all right. Thanks again for your time. Regards, Daniel On 20/6/07 21:00, "Chris Hostetter" <[EMAIL PROTECTED]> wrote: > > : So far it sound

Re: Multi-language indexing and searching

2007-06-20 Thread Chris Hostetter
: So far it sounds good for my needs, now I'm going to try if my other : features still work (I'm worried about highlighting as I'm going to return a : different field)... i'm not really a highlighting guy so i'm not sure ... but if you're okay with *simple* highlighting you can probably just hig

Re: Multi-language indexing and searching

2007-06-20 Thread Daniel Alheiros
Hi Hoss Thanks again for your attention. Looks like after your last instructions I thought the same way as you :) What I did yesterday: 1. Created the schema with the fields with language variations (created as concrete fields anyway because in this case, using dynamic it wouldn't be better for

Re: Multi-language indexing and searching

2007-06-19 Thread Chris Hostetter
: range wouldn't be a problem in this case. The real issue I can see in this : approach, is related to Analyzers... How to make them deal with different : languages properly using one Solr instance with the same set of fields being : used by documents in different languages i would still use

Re: Multi-language indexing and searching

2007-06-19 Thread Daniel Alheiros
Hi Hoss. Yes, the idea is indexing each document independently (in my scenario they are not translations, they are just documents with the same structure but different languages). So the considerations you raised about queries in a range wouldn't be a problem in this case. The real issue I can see i

Re: Multi-language indexing and searching

2007-06-15 Thread Chris Hostetter
: One bad thing in having fields specific for your language (in my point of : view) is that you will have to re-index your content when you add a new : language (some will need to start with one language and in future will have : others added). But OK, let's say the indexing is done. i don't see

Re: Multi-language indexing and searching

2007-06-13 Thread Daniel Alheiros
Hi Hoss One bad thing in having fields specific for your language (in my point of view) is that you will have to re-index your content when you add a new language (some will need to start with one language and in future will have others added). But OK, let's say the indexing is done. So using dyn

RE: Multi-language indexing and searching

2007-06-12 Thread Chris Hostetter
: Due to the proliferation of the number of fields. Say, we want : to have the field "title" to have the title of the book in : its original language. But because Solr has this implicit : assumption of one language per field, we would have to have : the artificial fields title_fr, title_de, title_en

RE: Multi-language indexing and searching

2007-06-12 Thread Teruhiko Kurosaka
Hi Yonik, > On 6/12/07, Teruhiko Kurosaka <[EMAIL PROTECTED]> wrote: > > For bi-lingual > > or tri-lingual search, we can have parallel fields (title_en, > > title_fr, title_de, for example) but this wouldn't scale well. > > Due to search across multiple fields, or due to increased index size? D

RE: Multi-language indexing and searching

2007-06-12 Thread Ken Krugler
Daniel, I was reading your email and responses to it with great interest. I was aware that Solr has an implicit assumption that a field is mono-lingual per system. But your mail and its correspondence made me wonder if this limitation is practical for multi-lingual search applications. For bi-li

Re: Multi-language indexing and searching

2007-06-12 Thread Daniel Alheiros
Hi Yonik. About how to handle the index at query time: I think that if you don't specify a language, you can return any document matching the term, without considering different languages (if that's possible) or, if it suits your solution, you can define a default language to be used
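
The query-time behaviour described in this message can be sketched roughly as follows. This is only an illustration, assuming the per-language field naming (title_en, title_fr, title_de) mentioned elsewhere in the thread; the function name `build_query`, the `LANGUAGES` tuple and the `default_lang` parameter are hypothetical, and the term is assumed to be a single, already-escaped token.

```python
from typing import Optional, Sequence

# Hypothetical set of language codes; not taken from any schema in the thread.
LANGUAGES = ("en", "fr", "de")


def build_query(term: str,
                lang: Optional[str] = None,
                default_lang: Optional[str] = None,
                base_field: str = "title",
                languages: Sequence[str] = LANGUAGES) -> str:
    """Build a Lucene-style query string over per-language fields.

    If a language (or a default language) is given, only that language's
    field is searched; otherwise the term is searched in every language field.
    """
    chosen = lang or default_lang
    if chosen is not None:
        if chosen not in languages:
            raise ValueError(f"unsupported language: {chosen}")
        return f"{base_field}_{chosen}:{term}"
    # No language and no default: match the term in any language field.
    return " OR ".join(f"{base_field}_{code}:{term}" for code in languages)
```

For example, `build_query("paris", lang="fr")` targets only the French field, while `build_query("paris")` spans all three fields.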

Re: Multi-language indexing and searching

2007-06-12 Thread Yonik Seeley
On 6/12/07, Teruhiko Kurosaka <[EMAIL PROTECTED]> wrote: For bi-lingual or tri-lingual search, we can have parallel fields (title_en, title_fr, title_de, for example) but this wouldn't scale well. Due to search across multiple fields, or due to increased index size? Lucene and Solr requires t

RE: Multi-language indexing and searching

2007-06-12 Thread Teruhiko Kurosaka
Daniel, I was reading your email and responses to it with great interest. I was aware that Solr has an implicit assumption that a field is mono-lingual per system. But your mail and its correspondence made me wonder if this limitation is practical for multi-lingual search applications. For bi-

Re: Multi-language indexing and searching

2007-06-11 Thread Daniel Alheiros
Hi Henri, Thanks again, your considerations will surely help with my decision. Now I'll do my homework to check document volume / growth - expected index sizes and query load. Regards, Daniel Alheiros On 9/6/07 10:53, "Henrib" <[EMAIL PROTECTED]> wrote: > > Hi Daniel, > Trying to recap: you are i

Re: Multi-language indexing and searching

2007-06-11 Thread Daniel Alheiros
This sounds OK. I can create a field name mapping structure to change the requests / responses so that my client doesn't need to be aware of the different fields. Thanks for these directions, Daniel On 8/6/07 21:32, "Chris Hostetter" <[EMAIL PROTECTED]> wrote: > > : Can't I have the same index, u
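
A field name mapping of the kind mentioned in this message might look something like the sketch below. It is not Daniel's actual implementation: the set `LOCALIZED_FIELDS` and the two helper names are hypothetical, and the language suffix convention (title_fr and so on) is assumed from the other messages in the thread.

```python
from typing import Any, Dict

# Hypothetical list of logical fields that have one variant per language.
LOCALIZED_FIELDS = {"title", "body"}


def to_index_fields(doc: Dict[str, Any], lang: str) -> Dict[str, Any]:
    """Rename logical fields to their language-specific names before indexing."""
    return {
        (f"{name}_{lang}" if name in LOCALIZED_FIELDS else name): value
        for name, value in doc.items()
    }


def to_client_fields(doc: Dict[str, Any], lang: str) -> Dict[str, Any]:
    """Rename language-specific fields back to logical names in a response."""
    suffix = f"_{lang}"
    result: Dict[str, Any] = {}
    for name, value in doc.items():
        base = name[: -len(suffix)] if name.endswith(suffix) else name
        if base != name and base in LOCALIZED_FIELDS:
            result[base] = value  # e.g. title_fr -> title
        else:
            result[name] = value  # fields without a language variant pass through
    return result
```

With this in place the client only ever sees `title` and `body`, while the index holds `title_fr`, `body_fr` and so on.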

Re: Multi-language indexing and searching

2007-06-09 Thread Henrib
View this message in context: http://www.nabble.com/Multi-language-indexing-and-searching-tf3885324.html#a11038890 Sent from the Solr - User mailing list archive at Nabble.com.

Re: Multi-language indexing and searching

2007-06-08 Thread Chris Hostetter
: Can't I have the same index, using one single core, same field names being : processed by language specific components based on a field/parameter? yes, but you don't really need the complexity you describe below ... you don't need separate request handlers per language, just separate fields per

Re: Multi-language indexing and searching

2007-06-08 Thread Daniel Alheiros
Hi Henri. Thanks for your reply. I've just looked at the patch you referred to, but doing this I will lose the out of the box Solr installation... I'll have to create my own Solr application responsible for creating the multiple cores and I'll have to change my indexing process to something able to n

Re: Multi-language indexing and searching

2007-06-08 Thread Henrib
View this message in context: http://www.nabble.com/Multi-language-indexing-and-searching-tf3885324.html#a11027333 Sent from the Solr - User mailing list archive at Nabble.com.

Re: Multi-language indexing and searching

2007-06-08 Thread Daniel Alheiros
Thank you for your reply. Yes, I realize that running a query against the whole content would come with these problems, but what I'm trying to say is that I will always narrow by the language (from my users' point of view). I would like to know if it is possible (and appropriate) to have all my conte

Re: Multi-language indexing and searching

2007-06-07 Thread Walter Underwood
I'm not sure what sort of "field" you mean for defining the language. If you plan to use a single search UI regardless of language, we used to do this in Ultraseek, but it doesn't really work. Queries are too short for reliable language ID (is "die" in German, English, or Latin?), and language-spe

Multi-language indexing and searching

2007-06-07 Thread Daniel Alheiros
Hi, I'm just starting to use Solr and so far, it has been a very interesting learning process. I wasn't a Lucene user, so I'm learning a lot about both. My problem is: I have to index and search content in several languages. My scenario is a bit different from others that I've already read in th