You're right: that error should not be thrown, since you are not asking for a sort. I don't recognize that one. You could try starting over with the Solr 1.4.1 release binaries.
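On question 1 below (the *_t names and whether to predefine fields): here is a minimal sketch, not a tested configuration, of what explicit per-language fields could look like in schema.xml. The type names text_en and text_es and the particular filter chain are assumptions made for illustration; the field names are the ones listed in step 2 below, and the factories are stock Solr 1.4 classes.

```xml
<!-- Sketch only: these two elements would go inside the existing <types>
     section of schema.xml. The names text_en / text_es are hypothetical. -->
<fieldType name="text_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- English stemming -->
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>
<fieldType name="text_es" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Spanish stemming -->
    <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
  </analyzer>
</fieldType>

<!-- These would go inside the existing <fields> section. Each field is tied
     to the analyzer chain of its language through the type attribute. -->
<field name="name_en"   type="text_en" indexed="true" stored="true"/>
<field name="answer_en" type="text_en" indexed="true" stored="true"/>
<field name="name_es"   type="text_es" indexed="true" stored="true"/>
<field name="answer_es" type="text_es" indexed="true" stored="true"/>
```

The fieldType is where the "HOW" lives: it names the tokenizer and filters applied at index and query time, and each field picks one fieldType. Whether the acts_as_solr_reloaded plugin can be pointed at explicitly named fields instead of the dynamic *_t ones is not verified here.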
Jakub Godawa wrote:
Hi Erick,

thanks for your help! I need some technical help though... let me put it that way:

1. I deleted everything in the index with:

    curl http://localhost:8983/solr/update -F stream.body='<delete><query>*:*</query></delete>'
    curl http://localhost:8983/solr/update -F stream.body='<commit />'

2. I created 2 documents with fields: name_en, answer_en, name_es, answer_es

3. I made a query through the admin page, with response:

    <response>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">9</int>
        <lst name="params">
          <str name="indent">on</str>
          <str name="start">0</str>
          <str name="q">Jakub </str>
          <str name="version">2.2</str>
          <str name="rows">10</str>
        </lst>
      </lst>
      <result name="response" numFound="2" start="0">
        <doc>
          <arr name="answer_en_t"><str>My name is Jakub</str></arr>
          <arr name="answer_es_t"><str>Me llamo Jakub.</str></arr>
          <arr name="id"><str>Question:1</str></arr>
          <arr name="name_en_t"><str>What is your name?</str></arr>
          <arr name="name_es_t"><str>Como te llamas?</str></arr>
          <arr name="pk_s"><str>1</str></arr>
          <arr name="spell">
            <str>What is your name?</str>
            <str>My name is Jakub</str>
            <str>Como te llamas?</str>
            <str>Me llamo Jakub.</str>
          </arr>
        </doc>
        <doc>
          <arr name="answer_en_t"><str>I am in the kitchen Jakub!</str></arr>
          <arr name="answer_es_t"><str>Estoy en la cocina.</str></arr>
          <arr name="id"><str>Question:2</str></arr>
          <arr name="name_en_t"><str>Where are you?</str></arr>
          <arr name="name_es_t"><str>Donde estas?</str></arr>
          <arr name="pk_s"><str>2</str></arr>
          <arr name="spell">
            <str>Where are you?</str>
            <str>I am in the kitchen Jakub!</str>
            <str>Donde estas?</str>
            <str>Estoy en la cocina.</str>
          </arr>
        </doc>
      </result>
    </response>

4. Now I needed two dismaxes to make it work in two separate languages.
Let's say I just want to look up in the *_en fields, so I created a dismax handler:

    <requestHandler name="/English" class="solr.SearchHandler">
      <lst name="defaults">
        <str name="defType">dismax</str>
        <str name="echoParams">explicit</str>
        <float name="tie">0.01</float>
        <str name="qf">name_en_t^0.5 answer_en_t^1.0</str>
      </lst>
    </requestHandler>

5. Hitting the URL http://localhost:8982/solr/English/?q=Jakub gave me an error:

    there are more terms than documents in field "name_en_t", but it's impossible to sort on tokenized fields

6. I know that I should create a separate dismax for Spanish.

My questions:

1. Why are those fields named with *_t? I saw in schema.xml that they are made dynamically. Can/should I create my own predefined fields in schema.xml? Is this the place where you put "HOW" the field should be interpreted by the indexer?

2. Why is the error in no. 5 being thrown? I know that you cannot sort on tokenized fields, but I don't see myself trying to index or tokenize anything.

3. How should it be changed to work properly?

Thank you, and I ask for patience, as this can help many rookies like me to get started.

Jakub.

2010/10/21 Erick Erickson <erickerick...@gmail.com>

See below:

But also search the archives for multilanguage; this topic has been discussed many times before. Lucid Imagination maintains a Solr-powered (of course) searchable list at: http://www.lucidimagination.com/search/

On Wed, Oct 20, 2010 at 9:03 AM, Jakub Godawa <jakub.god...@gmail.com> wrote:

Hi everyone! (my first post)

I am new, but really curious about the usefulness of lucene/solr in document search from web applications. I use Ruby on Rails to create one, with the plugin "acts_as_solr_reloaded", which makes the connection between the web app and solr easy.

So I am at a point where I know that a good solution is to prepare multi-language documents with fields like: question_en, answer_en, question_fr, answer_fr, question_pl, answer_pl... etc.
I need to create an index that would work with 6 languages: English, French, German, Russian, Ukrainian and Polish.

My questions are:

1. Is it doable to have just one search field that behaves like Google's for all those documents? It can be an option to indicate a language to search.

This depends on what you mean by do-able. Are you going to allow a French user to search an English document (etc.)? But the real answer is "yes, you can if you .....". There'll be tradeoffs.

Take a look at the dismax handler. It's kind of hard to grok all at once, but you can cause it to search across multiple fields. That is, the user types "language", and you can turn it into a complex query under the covers like lang_en:language lang_fr:language lang_ru:language, etc. You can also apply boosts. Note that this has obvious problems with, say, Russian. Half your job will be figuring out what will satisfy the user.....

You could also have a #different# dismax handler defined for various languages. Say the user was coming from Spanish. Consider a browseES handler. See solrconfig.xml for the default dismax handler. The Solr book mentioned above describes this.

2. How should I begin changing the solr/conf/schema.xml (or other) file to tailor it to my needs? As I am a real rookie here, I am still a bit confused about "fields", "fieldTypes" and their connection with a particular field (e.g. answer_fr), and about the "tokenizers" and "analyzers". If someone can provide a basic step-by-step tutorial on how to make it work in two languages I would be more than happy.

You have several choices here:

The books "Lucene in Action" and "Solr 1.4 Enterprise Search Server" both have discussions here.

Spend some time on the solr/admin/analysis page. That page allows you to see pretty much exactly what each of the steps in an analyzer chain accomplishes.

3. Are all those languages supported (officially/unofficially) by lucene/solr?

See: http://lucene.apache.org/java/3_0_2/api/all/org/apache/lucene/analysis/Analyzer.html
Remember that Solr is built on Lucene, so these analyzers are available.

Thank you for help,
Jakub Godawa.

Best
Erick
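
For reference, the per-language handler idea discussed in this thread (a separate dismax for Spanish, or the "browseES" suggestion) could be sketched roughly as below. It simply mirrors the /English handler quoted earlier in the thread. The handler name "/Spanish" and the boost values are assumptions, and the sketch presumes the name_es_t and answer_es_t fields shown in the sample response; it has not been run against a live index.

```xml
<!-- Hypothetical Spanish counterpart of the /English handler, for solrconfig.xml.
     The name "/Spanish" and the boosts are illustrative assumptions. -->
<requestHandler name="/Spanish" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Use the dismax query parser so plain user input is searched
         across every field listed in qf. -->
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <!-- Tie-breaker weight for fields other than the best-scoring one. -->
    <float name="tie">0.01</float>
    <!-- Spanish fields only, with matches in the answer weighted higher. -->
    <str name="qf">name_es_t^0.5 answer_es_t^1.0</str>
  </lst>
</requestHandler>
```

A single cross-language handler, as in the lang_en/lang_fr/lang_ru example above, would instead list the fields of every language in one qf value, at the cost of mixing analyzers and relevance across languages.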