Re: Standard request with functional query
: Thanks for the response, but how would I make recency a factor in
: scoring documents with the standard request handler. The query
: (title:iphone OR bodytext:iphone OR title:firmware OR
: bodytext:firmware) AND _val_:"ord(dateCreated)"^0.1
: seems to do something very similar to just sorting by dateCreated
: rather than having dateCreated be a part of the score.

you have to look at the score explanations (debugQuery=true) and decide
what boost is appropriate. there are no magic numbers that work for
everyone.

: Thanks,
: Sammy
:
: On Thu, Dec 4, 2008 at 1:35 PM, Sammy Yu wrote:
: > Hi guys,
: > I have a standard query that searches across multiple text fields such as
: > q=title:iphone OR bodytext:iphone OR title:firmware OR bodytext:firmware
: >
: > This comes back with documents that have iphone and firmware (I know I
: > can use the dismax handler but it seems to be really slow), which is
: > great. Now I want to give some more weight to more recent documents
: > (there is a dateCreated field in each document).
: >
: > So I've modified the query as such:
: > (title:iphone OR bodytext:iphone OR title:firmware OR
: > bodytext:firmware) AND _val_:"ord(dateCreated)"^0.1
: > URL-encoded to q=(title%3Aiphone+OR+bodytext%3Aiphone+OR+title%3Afirmware+OR+bodytext%3Afirmware)+AND+_val_%3A"ord(dateCreated)"^0.1
: >
: > However, the results are not as one would expect. The first few
: > documents only come back with the word iphone and appear to be sorted
: > by dateCreated. It seems to completely ignore the score and use the
: > dateCreated field for the score.
: >
: > On a not directly related issue, it seems like if you put the weight
: > within the double quotes:
: > (title:iphone OR bodytext:iphone OR title:firmware OR
: > bodytext:firmware) AND _val_:"ord(dateCreated)^0.1"
: >
: > the parser complains:
: > org.apache.lucene.queryParser.ParseException: Cannot parse
: > '(title:iphone OR bodytext:iphone OR title:firmware OR
: > bodytext:firmware) AND _val_:"ord(dateCreated)^0.1"': Expected ',' at
: > position 16 in 'ord(dateCreated)^0.1'
: >
: > Thanks,
: > Sammy

-Hoss
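As a worked illustration of Hoss's advice, the usual loop is to run the boosted query with explain output turned on and then adjust the boost factor by hand (the 0.1 value below is the one from the thread, not a recommendation; whether it is too large for a given corpus is exactly what debugQuery reveals):

```text
q=(title:iphone OR bodytext:iphone OR title:firmware OR bodytext:firmware)
  AND _val_:"ord(dateCreated)"^0.1
&fl=*,score
&debugQuery=true
```

If the per-document explanation shows the function-query term dominating the text-match terms, the boost is too high; if it barely registers, it is too low. Note also that the boost must stay outside the quoted function, as the parse exception in this thread shows.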
Re: [ANNOUNCE] Solr Logo Contest Results
Congratulations Michiel.

Lukas

On Thu, Dec 18, 2008 at 3:44 AM, Matt Mitchell wrote:
> Love it! Congratulations Michiel.
>
> Matt
>
> On Wed, Dec 17, 2008 at 9:15 PM, Chris Hostetter wrote:
> >
> > (replies to solr-user please)
> >
> > On behalf of the Solr Committers, I'm happy to announce that the Solr
> > Logo Contest is officially concluded. (Woot!)
> >
> > And the Winner Is...
> >
> > https://issues.apache.org/jira/secure/attachment/12394264/apache_solr_a_red.jpg
> > ...by Michiel
> >
> > We ran into a few hiccups during the contest making it take longer than
> > intended, but the result was a thorough process in which everyone went
> > above and beyond to ensure that the final choice best reflected the
> > wishes of the community.
> >
> > You can expect to see the new logo appear on the site (and in the Solr
> > app) in the next few weeks.
> >
> > Congrats Michiel!
> >
> > -Hoss

--
http://blog.lukas-vlcek.com/
Solrj - Exception in thread "main" java.lang.ClassCastException: java.lang.Long cannot be cast to org.apache.solr.common.util.NamedList
Hi all,

I used the sample code given below and tried to run it with all the
relevant jars. I receive the exception written below.

package test.general;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.response.UpdateResponse;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.apache.solr.common.params.SolrParams;

import java.io.IOException;
import java.util.Collection;
import java.util.HashSet;
import java.util.Random;
import java.util.List;

/**
 * Connect to Solr and issue a query
 */
public class SolrJExample {

  public static final String[] CATEGORIES = {"a", "b", "c", "d"};

  public static void main(String[] args) throws IOException, SolrServerException {
    SolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr/update");
    Random rand = new Random();

    // Index some documents
    Collection docs = new HashSet();
    for (int i = 0; i < 10; i++) {
      SolrInputDocument doc = new SolrInputDocument();
      doc.addField("link", "http://non-existent-url.foo/" + i + ".html");
      doc.addField("source", "Blog #" + i);
      doc.addField("source-link", "http://non-existent-url.foo/index.html");
      doc.addField("subject", "Subject: " + i);
      doc.addField("title", "Title: " + i);
      doc.addField("content", "This is the " + i + "(th|nd|rd) piece of content.");
      doc.addField("category", CATEGORIES[rand.nextInt(CATEGORIES.length)]);
      doc.addField("rating", i);
      System.out.println("Doc[" + i + "] is " + doc);
      docs.add(doc);
    }

    UpdateResponse response = server.add(docs);
    System.out.println("Response: " + response);
    // Make the documents available for search
    server.commit();
    // Create the query
    SolrQuery query = new SolrQuery("content:piece");
    // Indicate we want facets
    query.setFacet(true);
    // Indicate what field to facet on
    query.addFacetField("category");
    // We only want facets that have at least one entry
    query.setFacetMinCount(1);
    // Run the query
    QueryResponse results = server.query(query);
    System.out.println("Query Results: " + results);
    // Print out the facets
    List facets = results.getFacetFields();
    for (FacetField facet : facets) {
      System.out.println("Facet:" + facet);
    }
  }
}

The exception:

Exception in thread "main" java.lang.ClassCastException: java.lang.Long
cannot be cast to org.apache.solr.common.util.NamedList
  at org.apache.solr.common.util.NamedListCodec.unmarshal(NamedListCodec.java:89)
  at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:39)
  at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:385)
  at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:183)
  at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:217)
  at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)
  at test.general.SolrJExample.main(SolrJExample.java:48)

Can someone help me out.

Regards,
Sajith Vimukthi Weerakoon
Associate Software Engineer | ZONE24X7
| Tel: +94 11 2882390 ext 101 | Fax: +94 11 2878261 |
http://www.zone24x7.com
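Two likely culprits can be sketched against the SolrJ 1.3 API (both are assumptions about this setup, not confirmed in the thread): the server URL should point at the Solr root rather than the /update handler, and if the server itself is Solr 1.2 it cannot produce the binary response format that SolrJ 1.3 expects by default, so an XML response parser has to be set explicitly:

```java
// Sketch, assuming SolrJ 1.3 jars on the classpath; adjust host/port to
// your deployment. XMLResponseParser lives in
// org.apache.solr.client.solrj.impl.
CommonsHttpSolrServer server =
    new CommonsHttpSolrServer("http://localhost:8080/solr"); // not .../solr/update
server.setParser(new XMLResponseParser()); // needed when the server is older than 1.3
```

SolrJ appends the handler path (/update, /select) itself, which is why pointing the constructor at /update confuses request routing.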
Re: date facets doubt
has anyone experienced this problem? Can't find an explanation...
Thanks in advance

Marc Sturlese wrote:
>
> Hey there,
>
> 1.- I am trying to use date facets but I am facing a problem. I want to
> use the same field to do 2 facet classifications: I want to show the
> count of the docs of the last week and the count of the docs of the
> last month. What I am doing is:
>
> source_date
> NOW/DAY-1MONTH
> NOW/DAY
> +1MONTH
>
> source_date
> NOW/DAY-7DAY
> NOW/DAY
> +7DAY
>
> What I am getting as a result is 2 facet results that are exactly the
> same (the result is just the first facet shown two times):
>
> 45
> +1MONTH
> 2008-12-17T00:00:00Z
>
> 45
> +1MONTH
> 2008-12-17T00:00:00Z
>
> I suppose I am doing something wrong in the syntax... any advice?
> Thanks in advance

--
View this message in context: http://www.nabble.com/date-facets-doubt-tp21050107p21069438.html
Sent from the Solr - User mailing list archive at Nabble.com.
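For reference, the standard Solr 1.3 date-facet request parameters for one such range look like this (a sketch; source_date is the field from the post, the rest is the documented parameter syntax, with the gap URL-encoded because of the leading +):

```text
facet=true
&facet.date=source_date
&facet.date.start=NOW/DAY-1MONTH
&facet.date.end=NOW/DAY
&facet.date.gap=%2B1MONTH
```

These parameters are keyed per field (facet.date.start, or f.source_date.facet.date.start), so two different start/end/gap sets on the same field collide, which is consistent with the duplicated facet block the post describes.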
Re: Solrj - Exception in thread "main" java.lang.ClassCastException: java.lang.Long cannot be cast to org.apache.solr.common.util.NamedList
which version of the server are you using? The SolrJ documentation says
that the binary format works only with Solr 1.3.

On Thu, Dec 18, 2008 at 2:49 PM, Sajith Vimukthi wrote:
>
> Hi all,
>
> I used the sample code given below and tried to run it with all the
> relevant jars. I receive the exception written below.
>
> package test.general;
>
> import org.apache.solr.client.solrj.SolrServer;
> import org.apache.solr.client.solrj.SolrServerException;
> import org.apache.solr.client.solrj.SolrQuery;
> import org.apache.solr.client.solrj.response.UpdateResponse;
> import org.apache.solr.client.solrj.response.QueryResponse;
> import org.apache.solr.client.solrj.response.FacetField;
> import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
> import org.apache.solr.common.SolrInputDocument;
> import org.apache.solr.common.params.SolrParams;
>
> import java.io.IOException;
> import java.util.Collection;
> import java.util.HashSet;
> import java.util.Random;
> import java.util.List;
>
> /**
>  * Connect to Solr and issue a query
>  */
> public class SolrJExample {
>
>   public static final String[] CATEGORIES = {"a", "b", "c", "d"};
>
>   public static void main(String[] args) throws IOException, SolrServerException {
>     SolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr/update");
>     Random rand = new Random();
>
>     // Index some documents
>     Collection docs = new HashSet();
>     for (int i = 0; i < 10; i++) {
>       SolrInputDocument doc = new SolrInputDocument();
>       doc.addField("link", "http://non-existent-url.foo/" + i + ".html");
>       doc.addField("source", "Blog #" + i);
>       doc.addField("source-link", "http://non-existent-url.foo/index.html");
>       doc.addField("subject", "Subject: " + i);
>       doc.addField("title", "Title: " + i);
>       doc.addField("content", "This is the " + i + "(th|nd|rd) piece of content.");
>       doc.addField("category", CATEGORIES[rand.nextInt(CATEGORIES.length)]);
>       doc.addField("rating", i);
>       System.out.println("Doc[" + i + "] is " + doc);
>       docs.add(doc);
>     }
>
>     UpdateResponse response = server.add(docs);
>     System.out.println("Response: " + response);
>     // Make the documents available for search
>     server.commit();
>     // Create the query
>     SolrQuery query = new SolrQuery("content:piece");
>     // Indicate we want facets
>     query.setFacet(true);
>     // Indicate what field to facet on
>     query.addFacetField("category");
>     // We only want facets that have at least one entry
>     query.setFacetMinCount(1);
>     // Run the query
>     QueryResponse results = server.query(query);
>     System.out.println("Query Results: " + results);
>     // Print out the facets
>     List facets = results.getFacetFields();
>     for (FacetField facet : facets) {
>       System.out.println("Facet:" + facet);
>     }
>   }
> }
>
> The exception:
>
> Exception in thread "main" java.lang.ClassCastException: java.lang.Long
> cannot be cast to org.apache.solr.common.util.NamedList
>   at org.apache.solr.common.util.NamedListCodec.unmarshal(NamedListCodec.java:89)
>   at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:39)
>   at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:385)
>   at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:183)
>   at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:217)
>   at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)
>   at test.general.SolrJExample.main(SolrJExample.java:48)
>
> Can someone help me out.
>
> Regards,
> Sajith Vimukthi Weerakoon
> Associate Software Engineer | ZONE24X7
> | Tel: +94 11 2882390 ext 101 | Fax: +94 11 2878261 |
> http://www.zone24x7.com

--
--Noble Paul
[SolrJ] SolrException: missing content stream
Hi,

I'm using SolrJ to index a couple of documents. I do this in batches of
50 docs to save some machine memory. I call SolrServer#add(Collection)
for each batch. For some reason, I get the following exception:

org.apache.solr.common.SolrException: missing content stream
  at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:114)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
  at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:147)
  at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:217)
  at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)

Any ideas what could be the issue? It actually worked fine when I added
only one doc at a time.

-Gunnar

--
Gunnar Wagenknecht
gun...@wagenknecht.org
http://wagenknecht.org/
Multi language search help
Hi,

I am prototyping language search using Solr 1.3. I have 3 fields in the
schema - id, content and language.

I am indexing 3 pdf files; the languages are foroyo, chinese and
japanese.

I use xpdf to convert the content of the pdf to text and push the text
to Solr in the content field.

What is the analyzer that I need to use for the above?

By using the default text analyzer and posting this content to Solr, I
am not getting any results.

Does Solr support stemming for the above languages?

Regards
Sujatha
Re: [SolrJ] SolrException: missing content stream
are you sure the Collection is not empty? what version are you running?
what do the server logs say when you get this error on the client?

On Dec 18, 2008, at 6:42 AM, Gunnar Wagenknecht wrote:

> Hi,
>
> I'm using SolrJ to index a couple of documents. I do this in batches
> of 50 docs to save some machine memory. I call
> SolrServer#add(Collection) for each batch. For some reason, I get the
> following exception:
>
> org.apache.solr.common.SolrException: missing content stream
>   at org.apache.solr.handler.XmlUpdateRequestHandler.handleRequestBody(XmlUpdateRequestHandler.java:114)
>   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204)
>   at org.apache.solr.client.solrj.embedded.EmbeddedSolrServer.request(EmbeddedSolrServer.java:147)
>   at org.apache.solr.client.solrj.request.UpdateRequest.process(UpdateRequest.java:217)
>   at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:48)
>
> Any ideas what could be the issue? It actually worked fine when I added
> only one doc at a time.
>
> -Gunnar
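A defensive batch flush along the lines this reply suggests might look like the sketch below (against the SolrJ 1.3 API; flushBatch is an illustrative name, not from the original code, and the empty-collection guard addresses the cause the reply suspects):

```java
// Sketch, assuming SolrJ 1.3 jars on the classpath.
static void flushBatch(SolrServer server, Collection<SolrInputDocument> batch)
    throws IOException, SolrServerException {
  // An add() with zero documents can reach the server with no content
  // stream at all, which is one way to trigger "missing content stream".
  if (!batch.isEmpty()) {
    server.add(batch);
    batch.clear();
  }
}
```

Calling this once after the loop as well as every 50 documents inside it avoids both the empty-batch case and the classic off-by-one that drops the final partial batch.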
Change in config file (synonym.txt) requires container restart?
Hi,

I am using the SolrJ client to connect to the Solr 1.3 server, and the
whole POC (doing a feasibility study) resides in a Tomcat web server. If
I make any change in the synonym.txt file to add a synonym, I have to
restart the Tomcat server for it to take effect. The synonym filter
factory that I am using is in the analyzers for both the index and query
types in schema.xml. Please tell me whether this approach is good, or
whether there is another way to make the change take effect for
searching without restarting the Tomcat server.

Thanks and Regards,
Sagar Khetkade

_
Chose your Life Partner? Join MSN Matrimony FREE http://in.msn.com/matrimony
Re: Change in config file (synonym.txt) requires container restart?
Sagar Khetkade wrote:
> Hi,
>
> I am using the SolrJ client to connect to the Solr 1.3 server, and the
> whole POC (doing a feasibility study) resides in a Tomcat web server.
> If I make any change in the synonym.txt file to add a synonym, I have
> to restart the Tomcat server for it to take effect. The synonym filter
> factory that I am using is in the analyzers for both the index and
> query types in schema.xml. Please tell me whether this approach is
> good, or whether there is another way to make the change take effect
> for searching without restarting the Tomcat server.
>
> Thanks and Regards,
> Sagar Khetkade

You can also reload the core.

- Mark
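In a multicore setup, the reload Mark mentions can be issued over HTTP through the CoreAdmin handler without touching Tomcat (a sketch; the host, port and core name core0 are placeholder values for this deployment):

```text
http://localhost:8080/solr/admin/cores?action=RELOAD&core=core0
```

Reloading a core re-reads schema.xml and its resource files such as synonym.txt. Note that index-time synonyms already applied to stored documents are unaffected until those documents are re-indexed; only query-time synonym changes take effect immediately.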
Re: Get All terms from all documents
I think I'd pin the user down and have him give me the real-world
use-cases that require this, then see if there's a more reasonable way
to satisfy that use-case. Do they want type-ahead? What is the user of
the system going to see? Because, for instance, a drop-down of 10,000
terms is totally useless.

Best
Erick

On Wed, Dec 17, 2008 at 10:02 PM, roberto wrote:
> Grant
>
> It is completely crazy to do something like this, I know, but the
> customer wants it. I'm really trying to figure out how to do it in a
> better way, maybe using the (auto suggest) filter from Solr 1.3 to get
> all the words starting with some letter and cache the letter on the
> client side. Our client is going to be written in Swing. What do you
> guys think?
>
> Thanks,
>
> On Wed, Dec 17, 2008 at 8:05 PM, Grant Ingersoll wrote:
> >
> > All terms from all docs? Really?
> >
> > At any rate, see http://wiki.apache.org/solr/TermsComponent  May need
> > a mod to not require any field, but for now you can enter all fields
> > (which you can get from LukeRequestHandler)
> >
> > -Grant
> >
> > On Dec 17, 2008, at 2:17 PM, roberto wrote:
> >
> >> Hello,
> >>
> >> I need to get all terms from all documents to be placed in my
> >> interface almost like the facets, how can i do it?
> >>
> >> thanks
> >>
> >> --
> >> "Without love, we are birds with broken wings."
> >> Morrie
> >
> > --
> > Grant Ingersoll
> >
> > Lucene Helpful Hints:
> > http://wiki.apache.org/lucene-java/BasicsOfPerformance
> > http://wiki.apache.org/lucene-java/LuceneFAQ
>
> --
> "Without love, we are birds with broken wings."
> Morrie
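For reference, a TermsComponent request of the kind Grant links takes roughly this form (a sketch based on the wiki page he cites; availability depends on the Solr version in use and on the component being registered in solrconfig.xml, and the field name content is a placeholder):

```text
http://localhost:8983/solr/terms?terms=true&terms.fl=content&terms.lower=a&terms.limit=10
```

terms.lower plus terms.limit is what makes the prefix-per-letter caching idea from the thread feasible: the client asks for a bounded slice of the term dictionary rather than all terms at once.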
Re: looking for multilanguage indexing best practice/hint
See the CJKAnalyzer for a start, StandardAnalyzer won't help you much.

Also, tell us a little more about your requirements. For instance, if a
user submits a query in Japanese, do you want to search across documents
in the other languages too? And will you want to associate different
analyzers with the content from different languages?

You really have two options: if you want different analyzers used with
the different languages, you probably have to index the content in
different fields. That is, a Chinese document would have a
chinese_content field, a Japanese document would have a japanese_content
field, etc. Now you can associate a different analyzer with each
*_content field.

If the same analyzer would work for all three languages, you can just
index all the content in a "content" field, and if you need to restrict
searching to the language in which the query was submitted, you could
always add a clause on the language, e.g. AND language:chinese

Hope this helps
Erick

On Wed, Dec 17, 2008 at 11:15 PM, Sujatha Arun wrote:
> Hi,
>
> I am prototyping language search using Solr 1.3. I have 3 fields in the
> schema - id, content and language.
>
> I am indexing 3 pdf files; the languages are foroyo, chinese and
> japanese.
>
> I use xpdf to convert the content of the pdf to text and push the text
> to Solr in the content field.
>
> What is the analyzer that I need to use for the above?
>
> By using the default text analyzer and posting this content to Solr, I
> am not getting any results.
>
> Does Solr support stemming for the above languages?
>
> Regards
> Sujatha
>
> On 12/18/08, Feak, Todd wrote:
> >
> > Don't forget to consider scaling concerns (if there are any). There
> > are strong differences in the number of searches we receive for each
> > language. We chose to create separate schema and config per language
> > so that we can throw servers at a particular language (or set of
> > languages) if we needed to. We see 2 orders of magnitude difference
> > between our most popular language and our least popular.
> >
> > -Todd Feak
> >
> > -----Original Message-----
> > From: Julian Davchev [mailto:j...@drun.net]
> > Sent: Wednesday, December 17, 2008 11:31 AM
> > To: solr-user@lucene.apache.org
> > Subject: looking for multilanguage indexing best practice/hint
> >
> > Hi,
> > From my study of Solr and Lucene so far it seems that I will use a
> > single schema; at least I don't see a scenario where I'd need more
> > than that. So the question is how do I approach multilanguage
> > indexing and multilanguage searching. Will it really make sense to
> > just search by word, or should I rather supply a lang param to the
> > search as well.
> >
> > I see there are those filters and was already advised on them, but I
> > guess the question is more one of best practice.
> > solr.ISOLatin1AccentFilterFactory, solr.SnowballPorterFilterFactory
> >
> > So the solution I see is, using copyField, I have the same field in
> > different langs, or something using a distinct filter.
> > Cheers
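The per-language-field option Erick describes could be sketched in schema.xml like this (an illustrative fragment, not from the thread: the field and type names are made up, and the analyzer class is the Lucene contrib CJKAnalyzer he recommends as a starting point):

```xml
<!-- One analyzer per language family; Solr's TextField accepts a raw
     Lucene Analyzer class directly. -->
<fieldType name="text_cjk" class="solr.TextField">
  <analyzer class="org.apache.lucene.analysis.cjk.CJKAnalyzer"/>
</fieldType>

<!-- Route each document's text into the field matching its language. -->
<field name="chinese_content"  type="text_cjk" indexed="true" stored="true"/>
<field name="japanese_content" type="text_cjk" indexed="true" stored="true"/>
<field name="language" type="string" indexed="true" stored="true"/>
```

At query time the application then picks the *_content field (or adds the AND language:... clause from the single-field variant) based on the detected query language.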
Highlighting broken? String index out of range: 35
Hi everyone,

it seems that I've run into another problem with my Solr setup. :/ The
highlighter just won't highlight anything, no matter which fragmenter or
config params I use. Here's an example, taken straight out of the
example solrconfig.xml:

<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">
       text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
    </str>
    <str name="pf">
       text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9
    </str>
    <str name="bf">
       ord(popularity)^0.5 recip(rord(price),1,1000,1000)^0.3
    </str>
    <str name="fl">
       id,name,price,score
    </str>
    <str name="mm">
       2&lt;-1 5&lt;-2 6&lt;90%
    </str>
    <int name="ps">100</int>
    <str name="q.alt">*:*</str>
    <str name="hl.fl">text features name</str>
    <str name="f.name.hl.fragsize">0</str>
    <str name="f.name.hl.alternateField">name</str>
    <str name="f.text.hl.fragmenter">regex</str>
  </lst>
</requestHandler>

Whenever I try to activate the highlighter, it produces an error:
http://localhost:8983/solr/select/?q=ipod&version=2.2&start=0&rows=10&indent=on&qt=dismax&hl=true

HTTP ERROR: 500

String index out of range: 35

java.lang.StringIndexOutOfBoundsException: String index out of range: 35
  at java.lang.String.substring(Unknown Source)
  at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:239)
  at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:310)
  at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:83)
  at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:171)
  at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
  at org.apache.solr.core.SolrCore.execute(SolrCore.java:1313)
  at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
  at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
  at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
  at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
  at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
  at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
  at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
  at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
  at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
  at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
  at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
  at org.mortbay.jetty.Server.handle(Server.java:285)
  at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
  at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
  at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
  at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
  at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
  at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
  at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

That's what happens with the example setup - on my project it simply
won't highlight anything at all, no matter what I try. :| Can anyone
shed some light on this?

--
View this message in context: http://www.nabble.com/Highlighting-broken--String-index-out-of-range%3A-35-tp21073102p21073102.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Highlighting broken? String index out of range: 35
Alright, I pinned it down, I think... The cause of the error seems to be
the "features" field, which has termVectors="true", termPositions="true"
and termOffsets="true". The other 2 fields ("name" and "text") work;
they have the same type but lack the term*-attributes. When you
overwrite the default hl.fl with something like "name text" it works,
but add "features" to it and you get the error.

Steffen B. wrote:
>
> Hi everyone,
>
> it seems that I've run into another problem with my Solr setup. :/ The
> highlighter just won't highlight anything, no matter which fragmenter
> or config params I use. Here's an example, taken straight out of the
> example solrconfig.xml:
>
> <requestHandler name="dismax" class="solr.SearchHandler">
>   <lst name="defaults">
>     <str name="defType">dismax</str>
>     <str name="echoParams">explicit</str>
>     <float name="tie">0.01</float>
>     <str name="qf">
>        text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
>     </str>
>     <str name="pf">
>        text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9
>     </str>
>     <str name="bf">
>        ord(popularity)^0.5 recip(rord(price),1,1000,1000)^0.3
>     </str>
>     <str name="fl">
>        id,name,price,score
>     </str>
>     <str name="mm">
>        2&lt;-1 5&lt;-2 6&lt;90%
>     </str>
>     <int name="ps">100</int>
>     <str name="q.alt">*:*</str>
>     <str name="hl.fl">text features name</str>
>     <str name="f.name.hl.fragsize">0</str>
>     <str name="f.name.hl.alternateField">name</str>
>     <str name="f.text.hl.fragmenter">regex</str>
>   </lst>
> </requestHandler>
>
> Whenever I try to activate the highlighter, it produces an error:
> http://localhost:8983/solr/select/?q=ipod&version=2.2&start=0&rows=10&indent=on&qt=dismax&hl=true
>
> HTTP ERROR: 500
>
> String index out of range: 35
>
> java.lang.StringIndexOutOfBoundsException: String index out of range: 35
>   at java.lang.String.substring(Unknown Source)
>   at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:239)
>   at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:310)
>   at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:83)
>   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:171)
>   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1313)
>   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
>   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
>   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
>   at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
>   at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>   at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>   at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
>   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
>   at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
>   at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>   at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
>   at org.mortbay.jetty.Server.handle(Server.java:285)
>   at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
>   at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
>   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
>   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
>   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
>   at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
>   at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
>
> That's what happens with the example setup - on my project it simply
> won't highlight anything at all, no matter what I try. :| Can anyone
> shed some light on this?

--
View this message in context: http://www.nabble.com/Highlighting-broken--String-index-out-of-range%3A-35-tp21073102p21073356.html
Sent from the Solr - User mailing list archive at Nabble.com.
Solr opening many threads
Hello,

I can see from a thread dump that Solr opens a lot of threads. How does
Solr use these threads? Is there more than one thread for search in
Solr? Does Solr use any type of workManager, or are the threads simple
java.lang.Thread instances? How many concurrent threads does Solr
create? How does it manage them?

--
Alexander Ramos Jardim
Re: Highlighting broken? String index out of range: 35
I think you are facing this problem:
https://issues.apache.org/jira/browse/SOLR-925

I'm just looking into the issue to solve it. I'm not sure that I can fix
it in my time, though...

Koji

Steffen B. wrote:
>
> Hi everyone,
>
> it seems that I've run into another problem with my Solr setup. :/ The
> highlighter just won't highlight anything, no matter which fragmenter
> or config params I use. Here's an example, taken straight out of the
> example solrconfig.xml:
>
> <requestHandler name="dismax" class="solr.SearchHandler">
>   <lst name="defaults">
>     <str name="defType">dismax</str>
>     <str name="echoParams">explicit</str>
>     <float name="tie">0.01</float>
>     <str name="qf">
>        text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
>     </str>
>     <str name="pf">
>        text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9
>     </str>
>     <str name="bf">
>        ord(popularity)^0.5 recip(rord(price),1,1000,1000)^0.3
>     </str>
>     <str name="fl">
>        id,name,price,score
>     </str>
>     <str name="mm">
>        2&lt;-1 5&lt;-2 6&lt;90%
>     </str>
>     <int name="ps">100</int>
>     <str name="q.alt">*:*</str>
>     <str name="hl.fl">text features name</str>
>     <str name="f.name.hl.fragsize">0</str>
>     <str name="f.name.hl.alternateField">name</str>
>     <str name="f.text.hl.fragmenter">regex</str>
>   </lst>
> </requestHandler>
>
> Whenever I try to activate the highlighter, it produces an error:
> http://localhost:8983/solr/select/?q=ipod&version=2.2&start=0&rows=10&indent=on&qt=dismax&hl=true
>
> HTTP ERROR: 500
>
> String index out of range: 35
>
> java.lang.StringIndexOutOfBoundsException: String index out of range: 35
>   at java.lang.String.substring(Unknown Source)
>   at org.apache.lucene.search.highlight.Highlighter.getBestTextFragments(Highlighter.java:239)
>   at org.apache.solr.highlight.DefaultSolrHighlighter.doHighlighting(DefaultSolrHighlighter.java:310)
>   at org.apache.solr.handler.component.HighlightComponent.process(HighlightComponent.java:83)
>   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:171)
>   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1313)
>   at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
>   at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
>   at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
>   at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
>   at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
>   at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
>   at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
>   at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
>   at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
>   at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>   at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
>   at org.mortbay.jetty.Server.handle(Server.java:285)
>   at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
>   at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:821)
>   at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
>   at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
>   at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
>   at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
>   at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)
>
> That's what happens with the example setup - on my project it simply
> won't highlight anything at all, no matter what I try. :| Can anyone
> shed some light on this?
Problem in Date Format in Solr 1.3
Hi, I have upgraded from Solr/Lucene 1.2 to Solr/Lucene 1.3. I have copied all the "" tags of schema.xml from Solr 1.2 to Solr 1.3, and it gives an error: SEVERE: org.apache.solr.common.SolrException: Invalid Date in Date Math String:'2006-Oct-10T10:06:13Z' Can you help me with this problem? with regards Rohit Arora
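Solr's date fields expect the strict ISO 8601 "Zulu" form with a numeric month (e.g. 2006-10-10T10:06:13Z); the '2006-Oct-10T10:06:13Z' value in the error uses a month name, which the date math parser rejects. A minimal sketch of the conversion the feeding code would need (the helper name is illustrative, not part of Solr):

```python
from datetime import datetime

def to_solr_date(raw: str) -> str:
    """Convert a date like '2006-Oct-10T10:06:13Z' (month name) to the
    ISO 8601 form Solr expects, e.g. '2006-10-10T10:06:13Z'."""
    dt = datetime.strptime(raw, "%Y-%b-%dT%H:%M:%SZ")
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")

print(to_solr_date("2006-Oct-10T10:06:13Z"))  # 2006-10-10T10:06:13Z
```

Wherever the documents are built, running the date through such a normalizer before posting to Solr should avoid the SolrException.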
TermVectorComponent and SolrJ
Hello everyone, I've started to look at TermVectorComponent and I'm experimenting with using the component in a sort of "top terms" setting for a given query... I was also looking at MLT and the interestingTerms, but I would like to do a query, get say 10k results, and from those results return a list of the "top 10 terms" or something similar... I haven't really thought too much about it yet, but I was wondering if anyone has done any work on making the term vector response available in a simple manner with SolrJ yet? Or if this is planned? (In the same sense as it is today with facets: response.getFacetFields() etc.) Not that I can't manage to write it myself, but I would reckon that more people than me would be interested in this. I'd be more than happy to contribute if it is wanted; I just wanted to check whether anyone has started on this already or not. Cheers, Aleks -- Aleksander M. Stensby Senior software developer Integrasco A/S
Re: Solr openning many threads
On Thu, Dec 18, 2008 at 9:03 AM, Alexander Ramos Jardim wrote: > I can see from a thread dump that Solr opens a lot of threads. > > How does Solr use these threads? Does more than one thread exist for search > in Solr? Does Solr use any type of workManager, or are the threads simple > java.lang.Thread? How many concurrent threads does Solr create? How does it > manage them? Unless distributed search is being used, Solr currently has one single-thread executor for background warming. There is a thread per request, but that's just the way servlet containers work (Jetty, Tomcat, etc.). You can control the max number of threads that are created in the servlet container config. -Yonik
Solr and Autocompletion
Hi, One of the things we are looking for is to autofill keywords when people start typing (e.g. Google autofill). Currently we are using RangeQuery. I read about PrefixQuery and feel that it might be appropriate for this kind of implementation. Has anyone implemented the autofill feature? If so, what do you recommend? Thanks, Raghu
RE: looking for multilanguage indexing best practice/hint
Hi Sujatha. I've developed a search system for 6 different languages, and as it was implemented on Solr 1.2, all those languages are part of the same index, using different fields for each so I can have different analyzers for each one. Like: content_chinese content_english content_russian content_arabic I've also defined a language field that I use to be able to separate those at query time. As you are going to implement it using Solr 1.3, I would rather create one core per language and keep my schema simpler, without the _language suffix. Each schema (one per language) would have only, say, content, which depending on its language will use a proper analyzer and filters. Having a separate core per language is also good, as the scores for a language won't be affected by the indexing of documents in other languages. Do you have any requirement for searching in any language, say q=test, where this term should be found in any language? If so, you may think of distributed search to combine your results, or even take the same approach I've taken, as I couldn't use multi-core. I'm also using the Dismax request handler; that's worth a look so you can pre-define some base query parts and also do score boosting behind the scenes. I hope it helps. Regards, Daniel -Original Message- From: Sujatha Arun [mailto:suja.a...@gmail.com] Sent: 18 December 2008 04:15 To: solr-user@lucene.apache.org Subject: Re: looking for multilanguage indexing best practice/hint Hi, I am prototyping language search using Solr 1.3. I have 3 fields in the schema: id, content and language. I am indexing 3 pdf files; the languages are foroyo, chinese and japanese. I use xpdf to convert the content of the pdf to text and push the text to Solr in the content field. What is the analyzer that I need to use for the above? By using the default text analyzer and posting this content to Solr, I am not getting any results. Does Solr support stemming for the above languages?
Regards Sujatha On 12/18/08, Feak, Todd wrote: > > Don't forget to consider scaling concerns (if there are any). There > are strong differences in the number of searches we receive for each > language. We chose to create separate schema and config per language > so that we can throw servers at a particular language (or set of > languages) if we needed to. We see 2 orders of magnitude difference > between our most popular language and our least popular. > > -Todd Feak > > -Original Message- > From: Julian Davchev [mailto:j...@drun.net] > Sent: Wednesday, December 17, 2008 11:31 AM > To: solr-user@lucene.apache.org > Subject: looking for multilanguage indexing best practice/hint > > Hi, > From my study of Solr and Lucene so far, it seems that I will use a > single schema; at least I don't see a scenario where I'd need more than that. > So the question is how to approach multilanguage indexing and multilanguage > searching. Will it really make sense to just search the word, or should > I rather supply a lang param to the search as well? > > I see there are those filters and was already advised on them, but I guess > the question is more one of best practice. > solr.ISOLatin1AccentFilterFactory, solr.SnowballPorterFilterFactory > > So the solution I see is using copyField so I have the same field in different > langs, or something using a distinct filter. > Cheers
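Daniel's field-per-language layout might be sketched in schema.xml like this (the field and type names here are assumptions for illustration only; the actual analyzer types depend on what is configured in the schema):

```xml
<!-- Illustrative sketch: one content field per language, each bound to a
     field type with a language-appropriate analyzer, plus a language field
     to filter on at query time. Type names are assumptions. -->
<field name="content_english" type="text_en"  indexed="true" stored="true"/>
<field name="content_chinese" type="text_cjk" indexed="true" stored="true"/>
<field name="content_russian" type="text_ru"  indexed="true" stored="true"/>
<field name="language"        type="string"   indexed="true" stored="true"/>
```

With the one-core-per-language approach Daniel suggests for Solr 1.3, each core's schema would instead have a single content field whose type carries that language's analyzer.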
Re: Solr and Autocompletion
Lots of options out there. Rather than doing a slow query like PrefixQuery, I think it's best to index the n-grams so the autocomplete is a fast query. http://www.mail-archive.com/solr-user@lucene.apache.org/msg06776.html On Dec 18, 2008, at 11:56 AM, Kashyap, Raghu wrote: Hi, One of things we are looking for is to Autofill the keywords when people start typing. (e.g. Google autofill) Currently we are using the RangeQuery. I read about the PrefixQuery and feel that it might be appropriate for this kind of implementation. Has anyone implemented the autofill feature? If so what do you recommend? Thanks, Raghu
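The index-time edge n-gram trick can be sketched as follows. This is a hypothetical Python stand-in for what an edge n-gram token filter emits at index time (in Solr this lives in the field's analysis chain, not in client code); with the prefixes indexed as terms, autocomplete becomes an exact-term lookup instead of a prefix scan:

```python
def edge_ngrams(term: str, min_len: int = 1, max_len: int = 10) -> list[str]:
    """Emit the leading prefixes (edge n-grams) of a term, from min_len
    up to max_len characters, capped at the term's own length."""
    return [term[:i] for i in range(min_len, min(len(term), max_len) + 1)]

print(edge_ngrams("iphone"))  # ['i', 'ip', 'iph', 'ipho', 'iphon', 'iphone']
```

At query time, the user's partial input ("iph") is matched as a whole term against these indexed prefixes, which is why it is fast.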
Re: looking for multilanguage indexing best practice/hint
: Subject: looking for multilanguage indexing best practice/hint : References: <49483388.8030...@drun.net> : <502b8706-828b-4eaa-886d-af0dccf37...@stylesight.com> : <8c0c601f0812170825j766cf005i9546b2604a19f...@mail.gmail.com> : In-Reply-To: <8c0c601f0812170825j766cf005i9546b2604a19f...@mail.gmail.com> http://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if you change the subject line of your email, other mail headers still track which thread you replied to and your question is "hidden" in that thread and gets less attention. It makes following discussions in the mailing list archives particularly difficult. See Also: http://en.wikipedia.org/wiki/Thread_hijacking -Hoss
Re: Solr and Autocompletion
: Subject: Solr and Autocompletion : References: <49483388.8030...@drun.net> : <502b8706-828b-4eaa-886d-af0dccf37...@stylesight.com> : <8c0c601f0812170825j766cf005i9546b2604a19f...@mail.gmail.com> : <4949537a.3050...@drun.net> : <8599f2e4e80ecc44aee81fa2974ce2bd0c31d...@mail-sd1.ad.soe.sony.com> : <414cb3700812172015y2c0481c3hc6345392d514a...@mail.gmail.com> : <359a92830812180538q424a0744j3be8a109cec81...@mail.gmail.com> : In-Reply-To: <359a92830812180538q424a0744j3be8a109cec81...@mail.gmail.com> http://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if you change the subject line of your email, other mail headers still track which thread you replied to and your question is "hidden" in that thread and gets less attention. It makes following discussions in the mailing list archives particularly difficult. See Also: http://en.wikipedia.org/wiki/Thread_hijacking -Hoss
Re: [ANNOUNCE] Solr Logo Contest Results
Good choice! Mathijs Homminga Chris Hostetter wrote: (replies to solr-user please) On behalf of the Solr Committers, I'm happy to announce that the Solr Logo Contest is officially concluded. (Woot!) And the Winner Is... https://issues.apache.org/jira/secure/attachment/12394264/apache_solr_a_red.jpg ...by Michiel We ran into a few hiccups during the contest, making it take longer than intended, but the result was a thorough process in which everyone went above and beyond to ensure that the final choice best reflected the wishes of the community. You can expect to see the new logo appear on the site (and in the Solr app) in the next few weeks. Congrats Michiel! -Hoss -- Knowlogy Helperpark 290 C 9723 ZA Groningen +31 (0)50 2103567 http://www.knowlogy.nl mathijs.hommi...@knowlogy.nl +31 (0)6 15312977
Re: Get All terms from all documents
Erick, Thanks for the answer; let me clarify. We would like to have a combobox with terms to guide the user in the search. I mean, if I have thousands of documents and want to tell the user how many documents in the base contain a particular word, how can I do that? thanks On Thu, Dec 18, 2008 at 11:25 AM, Erick Erickson wrote: > I think I'd pin the user down and have him give me the real-world > use-cases that require this, then see if there's a more reasonable > way to satisfy that use-case. Do they want type-ahead? What > is the user of the system going to see? Because, for instance, > a drop-down of 10,000 terms is totally useless. > > Best > Erick > > On Wed, Dec 17, 2008 at 10:02 PM, roberto wrote: > > > Grant > > > > It's completely crazy to do something like this, I know, but the customer > wants it; > > I'm really trying to figure out how to do it in a better way, maybe using > > the (auto suggest) filter from solr 1.3 to get all the words starting > with > > some letter and cache the letter on the client side. Our client is going > to > > be written in Swing. What do you guys think? > > > > Thanks, > > > > On Wed, Dec 17, 2008 at 8:05 PM, Grant Ingersoll > >wrote: > > > > > All terms from all docs? Really? > > > > > > At any rate, see http://wiki.apache.org/solr/TermsComponent May need > a > > > mod to not require any field, but for now you can enter all fields > (which > > > you can get from LukeRequestHandler) > > > > > > -Grant > > > > > > > > > > > > On Dec 17, 2008, at 2:17 PM, roberto wrote: > > > > > > Hello, > > >> > > >> I need to get all terms from all documents to be placed in my > interface > > >> almost like the facets, how can i do it? > > >> > > >> thanks > > >> > > >> -- > > >> "Without love, we are birds with broken wings."
> > >> Morrie > > >> > > > > > > -- > > > Grant Ingersoll > > > > > > Lucene Helpful Hints: > > > http://wiki.apache.org/lucene-java/BasicsOfPerformance > > > http://wiki.apache.org/lucene-java/LuceneFAQ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > "Without love, we are birds with broken wings." > > Morrie > > > -- "Without love, we are birds with broken wings." Morrie
Approximate release date for 1.4
Just curious: do we have an approximate target release date for 1.4, a list of milestones, or feature sets for the same?
Re: [ANNOUNCE] Solr Logo Contest Results
Looks cool :) How about a talking mascot? Jeryl Cook twoenc...@gmail.com On Thu, Dec 18, 2008 at 1:38 PM, Mathijs Homminga wrote: > Good choice! > > Mathijs Homminga > > Chris Hostetter wrote: >> >> (replies to solr-user please) >> >> On behalf of the Solr Committers, I'm happy to announce that the Solr >> Logo Contest is officially concluded. (Woot!) >> >> And the Winner Is... >> >> https://issues.apache.org/jira/secure/attachment/12394264/apache_solr_a_red.jpg >> ...by Michiel >> >> We ran into a few hiccups during the contest making it take longer than >> intended, but the result was a thorough process in which everyone went above >> and beyond to ensure that the final choice best reflected the wishes of the >> community. >> >> You can expect to see the new logo appear on the site (and in the Solr >> app) in the next few weeks. >> >> Congrats Michiel! >> >> >> -Hoss >> > > -- > Knowlogy > Helperpark 290 C > 9723 ZA Groningen > +31 (0)50 2103567 > http://www.knowlogy.nl > > mathijs.hommi...@knowlogy.nl > +31 (0)6 15312977 > > > -- Jeryl Cook /^\ Pharaoh /^\ http://pharaohofkush.blogspot.com/ "Whether we bring our enemies to justice, or bring justice to our enemies, justice will be done." --George W. Bush, Address to a Joint Session of Congress and the American People, September 20, 2001
Re: Approximate release date for 1.4
On Thu, Dec 18, 2008 at 2:43 PM, Kay Kay wrote: > Just curious - if we have an approximate target release date for 1.4 / list > of milestones / feature sets for the same. Mid January. Issues included: a case-by-case analysis of how ready they are (and obviously affected by committers "scratching their own itch"). -Yonik
Re: looking for multilanguage indexing best practice/hint
Thanks Erick, I think I will go with different language fields, as I want to use different stop words, analyzers etc. I might also consider a schema per language so scaling is more flexible, as I was already advised, but this will really make sense only if I have more than one server, I guess; otherwise all the other data is duplicated for no reason. We have already decided that the language will be passed each time in search, so it won't make sense to search the query in any language. As for CJKAnalyzer, at first look it doesn't seem to be in Solr (haven't tried yet), and since I am a noob in Java I will check how it's done. Will definitely give it a try. Thanks a lot for the help. Erick Erickson wrote: > See the CJKAnalyzer for a start, StandardAnalyzer won't > help you much. > > Also, tell us a little more about your requirements. For instance, > if a user submits a query in Japanese, do you want to search > across documents in the other languages too? And will you want > to associate different analyzers with the content from different > languages? You really have two options: > > if you want different analyzers used with the different languages, > you probably have to index the content in different fields. That is > a Chinese document would have a chinese_content field, a Japanese > document would have a japanese_content field etc. Now you can > associate a different analyzer with each *_content field. > > If the same analyzer would work for all three languages, you > can just index all the content in a "content" field, and if you > need to restrict searching to the language in which the query > was submitted, you could always add a clause on the > language, e.g. AND language:chinese > > Hope this helps > Erick > > On Wed, Dec 17, 2008 at 11:15 PM, Sujatha Arun wrote: > > >> Hi, >> >> I am prototyping lanuage search using solr 1.3 .I have 3 fields in the >> schema -id,content and language. >> >> I am indexing 3 pdf files ,the languages are foroyo,chinese and japanese.
>> >> I use xpdf to convert the content of pdf to text and push the text to solr >> in the content field. >> >> What is the analyzer that i need to use for the above. >> >> By using the default text analyzer and posting this content to solr, i am >> not getting any results. >> >> Does solr support stemmin for the above languages. >> >> Regards >> Sujatha >> >> >> >> >> On 12/18/08, Feak, Todd wrote: >> >>> Don't forget to consider scaling concerns (if there are any). There are >>> strong differences in the number of searches we receive for each >>> language. We chose to create separate schema and config per language so >>> that we can throw servers at a particular language (or set of languages) >>> if we needed to. We see 2 orders of magnitude difference between our >>> most popular language and our least popular. >>> >>> -Todd Feak >>> >>> -Original Message- >>> From: Julian Davchev [mailto:j...@drun.net] >>> Sent: Wednesday, December 17, 2008 11:31 AM >>> To: solr-user@lucene.apache.org >>> Subject: looking for multilanguage indexing best practice/hint >>> >>> Hi, >>> From my study on solr and lucene so far it seems that I will use single >>> scheme.at least don't see scenario where I'd need more than that. >>> So question is how do I approach multilanguage indexing and multilang >>> searching. Will it really make sense for just searching word..or rather >>> I should supply lang param to search as well. >>> >>> I see there are those filters and already advised on them but I guess >>> question is more of a best practice. >>> solr.ISOLatin1AccentFilterFactory, solr.SnowballPorterFilterFactory >>> >>> So solution I see is using copyField I have same field in different >>> langs or something using distinct filter. >>> Cheers >>> >>> >>> >>> >>> > >
does this break Solr? dynamicField name="*" type="ignored"
I'm seeing a weird effect with a '*' field. In the example schema.xml, there is a commented out sample: We have this un-commented, and in the schema browser via the admin interface I see that all non-dynamic fields get a type of "ignored". I see this in the Solr admin interface: Field: uid Dynamically Created From Pattern: * Field Type: ignored though the field definition is: Is this a bug in the admin interface, or a problem with using this '*' in the schema? Thanks, Peter -- -- Peter M. Wolanin, Ph.D. Momentum Specialist, Acquia. Inc. peter.wola...@acquia.com
Re: does this break Solr? dynamicField name="*" type="ignored"
Looks like it's a bug in the schema browser (i.e. just this display, not the inner workings of Solr). Could you open a JIRA issue for this? -Yonik On Thu, Dec 18, 2008 at 3:20 PM, Peter Wolanin wrote: > I'm seeing a weird effect with a '*' field. In the example > schema.xml, there is a commented out sample: > > > > > We have this un-commented, and in the schema browser via the admin > interface I see that all non-dynamic fields get a type of "ignored". > > I see this in the Solr admin interface: > > Field: uid > Dynamically Created From Pattern: * > Field Type: ignored > > though the field definition is: > > > > Is this a bug in the admin interface, or a problem with using this '*' > in the schema? > > Thanks, > > Peter > > -- > -- > Peter M. Wolanin, Ph.D. > Momentum Specialist, Acquia. Inc. > peter.wola...@acquia.com >
Re: Partitioning the index
It's more related to how much memory you have on your boxes, how resource-intensive your queries are, how many fields you are trying to facet on, what acceptable response times are, etc. Anyway... a single box is normally good for between 5M and 50M docs, but can fall out of that range (both up and down) depending on the specifics. -Yonik On Wed, Dec 17, 2008 at 9:34 PM, s d wrote: > Hi, Is there a recommended index size (on disk, number of documents) for when > to start partitioning it to ensure good response time? > Thanks, > S >
Re: Get All terms from all documents
How do you get the word in the first place? If the combobox is for all words in your index, it's probably completely useless to provide this information because there is too much data to guide the user at all. I mean a list of 10,000 words with some sort of document frequency seems to me to require significant developer work without adding to the user experience at all... If that's the case, I'd really work with your customer and try to persuade them that this is a feature that adds little value, and that there are higher-value features you should do first. But if you really, really require the information, here's what I would recommend: Use TermDocs/TermEnum to traverse your index gathering this data *at index time*. Then create a *very special* document that you also put in your index (stored, but not indexed in this case) that contains an unique field (say frequencies). Upon startup of your searcher, read in this very special document, parse it and create a map of words and frequencies that you use to find the number of documents containing that word. Hope this helps Erick On Thu, Dec 18, 2008 at 1:53 PM, roberto wrote: > Erick, > > Thanks for the answer, let me clarify the thing, we would like to have a > combobox with the terms to guide the user in the search i mean, if a have > thousands of documents and want to tell them how many documents in the base > have the particular word, how can i do that? > > thanks > > On Thu, Dec 18, 2008 at 11:25 AM, Erick Erickson >wrote: > > > I think I'd pin the user down and have him give me the real-world > > use-cases that require this, then see if there's a more reasonable > > way to satisfy that use-case. Do they want type-ahead? What > > is the user of the system going to see? Because, for instance, > > a drop-down of 10,000 terms is totally useless. 
> > > > Best > > Erick > > > > On Wed, Dec 17, 2008 at 10:02 PM, roberto wrote: > > > > > Grant > > > > > > It completely crazy do something like this i know, but the customer > > want´s, > > > i´m really trying to figure out how to do it in a better way, maybe > using > > > the (auto suggest) filter from solr 1.3 to get all the words starting > > with > > > some letter and cache the letter in the client side, out client is > going > > to > > > be write in swing, what do you guys think? > > > > > > Thanks, > > > > > > On Wed, Dec 17, 2008 at 8:05 PM, Grant Ingersoll > > >wrote: > > > > > > > All terms from all docs? Really? > > > > > > > > At any rate, see http://wiki.apache.org/solr/TermsComponent May > need > > a > > > > mod to not require any field, but for now you can enter all fields > > (which > > > > you can get from LukeRequestHandler) > > > > > > > > -Grant > > > > > > > > > > > > > > > > On Dec 17, 2008, at 2:17 PM, roberto wrote: > > > > > > > > Hello, > > > >> > > > >> I need to get all terms from all documents to be placed in my > > interface > > > >> almost like the facets, how can i do it? > > > >> > > > >> thanks > > > >> > > > >> -- > > > >> "Without love, we are birds with broken wings." > > > >> Morrie > > > >> > > > > > > > > -- > > > > Grant Ingersoll > > > > > > > > Lucene Helpful Hints: > > > > http://wiki.apache.org/lucene-java/BasicsOfPerformance > > > > http://wiki.apache.org/lucene-java/LuceneFAQ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > "Without love, we are birds with broken wings." > > > Morrie > > > > > > > > > -- > "Without love, we are birds with broken wings." > Morrie >
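Erick's index-time precomputation amounts to building a term-to-document-frequency map. A minimal Python sketch of the idea (the whitespace tokenization and sample data are illustrative only; real code would walk the Lucene index with TermDocs/TermEnum as Erick describes):

```python
from collections import defaultdict

def term_doc_frequencies(docs: dict[str, str]) -> dict[str, int]:
    """For each term, count how many documents contain it at least once
    (the per-term document frequency to precompute at index time)."""
    df = defaultdict(int)
    for text in docs.values():
        # set() so a term repeated within one document is counted once
        for term in set(text.lower().split()):
            df[term] += 1
    return dict(df)

docs = {"d1": "iphone firmware update",
        "d2": "iphone case",
        "d3": "android firmware"}
print(term_doc_frequencies(docs)["iphone"])  # 2
```

The resulting map is what would be serialized into the "very special" document and reloaded at searcher startup.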
Data Import Request Handler problem: Odd performance behaviour for large number of records
Hello, I am using Solr 1.4 (solr-2008-11-19) with Lucene 2.4 dropped in instead of 2.9. I am indexing 500k records using the JDBC Data Import Request Handler. Config: Linux openSUSE 10.2 (X86-64) Dual dual-core 64-bit Xeon 3GHz Dell blade 8GB RAM java version "1.6.0_07" Java(TM) SE Runtime Environment (build 1.6.0_07-b06) Java HotSpot(TM) 64-Bit Server VM (build 10.0-b23, mixed mode) 1GB heap for Tomcat DB: MySQL on a separate but similar server I am finding that when I do a Full-Import, followed by another Full-Import, the import takes much longer the second and subsequent times: Run1 = 0:27:31.491 Run2 = 1:14:44.821 Run3 = 1:14:48.316 Run4 = 2:15:12.296 Run5 = 1:37:6.847 (I have run this ~10 times and got roughly the same results). I have also monitored the load on the Solr machine and the database machine for any other activity that might have an impact. The final Lucene index size is 923MB. The default clean = 'true', so the index is cleared (emptied) each time, so I am concerned that the second run takes 4 times the time of the first run. Am I doing something wrong here? Any help would be appreciated. I have appended my data-config.xml. thanks, Glen
Re: does this break Solr? dynamicField name="*" type="ignored"
created issue: https://issues.apache.org/jira/browse/SOLR-929 -Peter On Thu, Dec 18, 2008 at 3:32 PM, Yonik Seeley wrote: > Looks like it's a bug in the schema browser (i.e. just this display, > no the inner workings of Solr). > Could you open a JIRA issue for this? > > -Yonik > > > On Thu, Dec 18, 2008 at 3:20 PM, Peter Wolanin > wrote: >> I'm seeing a weird effect with a '*' field. In the example >> schema.xml, there is a commented out sample: >> >> >> >> >> We have this un-commented, and in the schema browser via the admin >> interface I see that all non-dynamic fields get a type of "ignored". >> >> I see this in the Solr admin interface: >> >> Field: uid >> Dynamically Created From Pattern: * >> Field Type: ignored >> >> though the field definition is: >> >> >> >> Is this a bug in the admin interface, or a problem with using this '*' >> in the schema? >> >> Thanks, >> >> Peter >> >> -- >> -- >> Peter M. Wolanin, Ph.D. >> Momentum Specialist, Acquia. Inc. >> peter.wola...@acquia.com >> > -- -- Peter M. Wolanin, Ph.D. Momentum Specialist, Acquia. Inc. peter.wola...@acquia.com
Full reindex needed if termVectors added to fields in schema?
hi, I've successfully added fields to my schema.xml before, and been able to incrementally keep indexing documents with just the new ones picking up the fields. This appears to be similar to the case of not including certain fields in certain documents, as the other documents simply don't have them until they're added. I'm looking into testing a MoreLikeThis implementation, and have read on here that termVectors are needed to make it run acceptably. I'd like to rebuild my index, but that will take some time given the number of documents involved, and I'd like to keep incremental updates running at the same time. The constraint is on the database side not the SOLR indexing side, so improvements to indexing performance aren't my main concern here. So, my question is whether adding termVectors="true" to a couple of schema fields will work similarly to adding new fields, where the updated documents will get the vectors added and the others won't get them but will continue to work, allowing me to rebuild "in the background" while not breaking anything in my existing incremental update/release cycle. I appreciate your help. Eric Kilby -- View this message in context: http://www.nabble.com/Full-reindex-needed-if-termVectors-added-to-fields-in-schema--tp21081315p21081315.html Sent from the Solr - User mailing list archive at Nabble.com.
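Enabling term vectors is a per-field attribute in schema.xml, so the change itself is small; a sketch of what the changed field might look like (the field and type names are illustrative, not taken from your schema):

```xml
<!-- Illustrative: enabling term vectors on an existing text field.
     As with a newly added field, only documents (re)indexed after this
     change will carry the vectors; older documents simply lack them. -->
<field name="bodytext" type="text" indexed="true" stored="true"
       termVectors="true" termPositions="true" termOffsets="true"/>
```

termPositions and termOffsets are optional refinements on top of termVectors; whether MoreLikeThis needs them depends on how it is configured.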
Re: Get All terms from all documents
On 18-Dec-08, at 10:53 AM, roberto wrote: Erick, Thanks for the answer, let me clarify the thing, we would like to have a combobox with the terms to guide the user in the search i mean, if a have thousands of documents and want to tell them how many documents in the base have the particular word, how can i do that? Sounds like you want query autocomplete. The best way to do this (including if you want the box filled with some queries), is to use the query logs, not the documents. -Mike
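Mike's query-log approach can be sketched in a few lines of Python (the function shape and ranking are illustrative assumptions; a real deployment would precompute and cache the counts rather than rescan the log per keystroke):

```python
from collections import Counter

def suggest(query_log: list[str], prefix: str, n: int = 5) -> list[str]:
    """Autocomplete from past queries rather than document terms:
    rank log entries matching the typed prefix by how often users ran them."""
    counts = Counter(q.lower() for q in query_log)
    matches = [(q, c) for q, c in counts.items() if q.startswith(prefix.lower())]
    return [q for q, _ in sorted(matches, key=lambda qc: -qc[1])[:n]]

log = ["iphone firmware", "iphone case", "iphone firmware", "ipod nano"]
print(suggest(log, "iph"))  # ['iphone firmware', 'iphone case']
```

Suggesting whole past queries like this also gives users multi-word completions, which a term-based combobox cannot.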
RE: Change in config file (synonym.txt) requires container restart?
But I am using CommonsHttpSolrServer for the Solr server configuration, as it accepts the URL. So how can I reload the core here?

-Sagar

> Date: Thu, 18 Dec 2008 07:55:02 -0500
> From: markrmil...@gmail.com
> To: solr-user@lucene.apache.org
> Subject: Re: Change in config file (synonym.txt) requires container restart?
>
> Sagar Khetkade wrote:
> > Hi,
> >
> > I am using the SolrJ client to connect to the Solr 1.3 server, and the whole POC (doing a feasibility study) resides in a Tomcat web server. If I make any change to the synonym.txt file to add a synonym, I have to restart the Tomcat server for it to take effect. The SynonymFilterFactory I am using is in both the index and query analyzers in schema.xml. Please tell me whether this approach is good, or whether there is another way to make the change take effect while searching, without restarting the Tomcat server.
> >
> > Thanks and Regards,
> > Sagar Khetkade
>
> You can also reload the core.
>
> - Mark
Re: TermVectorComponent and SolrJ
On Dec 18, 2008, at 10:06 AM, Aleksander M. Stensby wrote: Hello everyone, I've started to look at TermVectorComponent and I'm experimenting with the use of the component in a sort of "top terms" setting for a given query... Was also looking at mlt and the interestingTerms, but I would like to do a query, get say 10k results, and from those results return a list of "top 10 terms" or something similar... Haven't really thought too much about it yet, but I was wondering if anyone have done any work on making the term vector response available in a simple manner with solrj yet? Or if this is planned? (In the same sense as it is today with facets (response.getFacetFields() etc..). Not that I cant manage to write it myself, but I would recon that more people than me would be interessted in this. I'd be more than happy to contribute if it is wanted, just wanted to check if anyone have started on this already or not. I think this would be a welcome contribution. -Grant
Re: Multi language search help
On Dec 18, 2008, at 6:25 AM, Sujatha Arun wrote: Hi, I am prototyping language search using solr 1.3. I have 3 fields in the schema: id, content and language. I am indexing 3 pdf files; the languages are foroyo, chinese and japanese. I use xpdf to convert the content of the pdf to text and push the text to solr in the content field. What is the analyzer that I need to use for the above? By using the default text analyzer and posting this content to solr, I am not getting any results. Does solr support stemming for the above languages? I'm not familiar with Foroyo, but there should be tokenizers/analysis available for Chinese and Japanese. Are you putting all three languages into the same field? If that is the case, you will need some type of language detection piece that can choose the correct analyzer. How are your users searching? That is, do you know the language they want to search in? If so, then you can have a field for each language. -Grant
Re: Data Import Request Handler problem: Odd performance behaviour for large number of records
DIH does not maintain any state between two runs. So if there is a perf degradation it could be because:
- Solr indexing is taking longer after you do a delete *:*
- Your RAM is insufficient (your machine is swapping)

On Fri, Dec 19, 2008 at 2:51 AM, Glen Newton wrote:
> Hello,
>
> I am using Solr 1.4 (solr-2008-11-19) with Lucene 2.4 dropped in instead of 2.9.
>
> I am indexing 500k records using the JDBC Data Import Request Handler.
>
> Config:
> Linux openSUSE 10.2 (x86-64)
> Dual dual-core 64-bit Xeon 3GHz Dell blade, 8GB RAM
> java version "1.6.0_07"
> Java(TM) SE Runtime Environment (build 1.6.0_07-b06)
> Java HotSpot(TM) 64-Bit Server VM (build 10.0-b23, mixed mode)
> 1GB heap for Tomcat
> DB: MySQL on a separate but similar server
>
> I am finding that when I do a full-import, followed by another
> full-import, the import takes much longer the second and subsequent
> times:
> Run1 = 0:27:31.491
> Run2 = 1:14:44.821
> Run3 = 1:14:48.316
> Run4 = 2:15:12.296
> Run5 = 1:37:6.847
>
> (I have run this ~10 times and got roughly the same results.) I have
> also monitored the load on the Solr machine and the database machine
> for any other activity that might have an impact.
>
> The final Lucene index size is 923MB. The default is clean = 'true', so
> the index is cleared (emptied) each time; I am concerned that the second
> run takes 4 times as long as the first run.
>
> Am I doing something wrong here? Any help would be appreciated.
>
> I have appended my data-config.xml:
>
> url="jdbc:mysql://blue01/dartejos" user="USER" password="PASSWD"/>
>
> thanks,
>
> Glen

--Noble Paul
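For comparable timing runs, it helps to issue the full-import with explicit commit/optimize settings, since clean=true (the default) deletes the whole index before re-importing. A small sketch of building the DataImportHandler request — the host is a placeholder, and the parameter names are the DIH command parameters as documented at the time:

```python
from urllib.parse import urlencode

def dih_full_import_url(base="http://localhost:8983/solr",
                        clean=True, optimize=True):
    """Build a DataImportHandler full-import URL.  clean=True wipes
    the index first (the default); optimizing after each run keeps
    segment counts comparable between timed runs."""
    params = urlencode({
        "command": "full-import",
        "clean": str(clean).lower(),
        "commit": "true",
        "optimize": str(optimize).lower(),
    })
    return f"{base}/dataimport?{params}"
```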
Re: Change in config file (synonym.txt) requires container restart?
Please note that a core reload will also stop Solr from serving search requests while it reloads.

On Fri, Dec 19, 2008 at 8:24 AM, Sagar Khetkade wrote:
> But I am using CommonsHttpSolrServer for the Solr server configuration, as it accepts the URL. So how can I reload the core in this case?
>
> -Sagar
>
>> Date: Thu, 18 Dec 2008 07:55:02 -0500
>> From: markrmil...@gmail.com
>> To: solr-user@lucene.apache.org
>> Subject: Re: Change in config file (synonym.txt) requires container restart?
>>
>> Sagar Khetkade wrote:
>>> Hi,
>>>
>>> I am using the SolrJ client to connect to the Solr 1.3 server, and the whole POC (doing a feasibility study) resides in a Tomcat web server. If I make any change in the synonym.txt file to add a synonym, I have to restart the Tomcat server for the change to take effect. The SynonymFilterFactory I am using is in the analyzers for both the index and query types in schema.xml. Please tell me whether this approach is good, or whether there is another way to make the change take effect during searching without restarting Tomcat.
>>>
>>> Thanks and Regards,
>>> Sagar Khetkade
>>
>> You can also reload the core.
>>
>> - Mark

--
Regards,
Shalin Shekhar Mangar.
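Since CommonsHttpSolrServer already talks to Solr over HTTP, one way to trigger the reload is to hit the CoreAdmin handler directly (this requires cores to be configured in solr.xml). A sketch of building that request — the host and core name below are placeholders:

```python
from urllib.parse import urlencode

def reload_core_url(base="http://localhost:8983/solr", core="core0"):
    """CoreAdmin RELOAD request for the named core.  Issuing this
    (e.g. with urllib or curl) picks up edited config files such as
    synonyms.txt without restarting the servlet container."""
    return f"{base}/admin/cores?{urlencode({'action': 'RELOAD', 'core': core})}"
```

From SolrJ itself, the CoreAdminRequest helper class offers the same operation programmatically.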
Re: Precisions on solr.xml about cross context forwarding.
: This bothers me too. I find it really strange that Solr's entry-point
: is a servlet filter instead of a servlet.

It traces back to the need for it to decide when to handle a request and when to let it pass through (to a later filter, a servlet, or a JSP). This is the only way legacy support for the /select and /update URLs works without forcing people to modify the web.xml; it's how a handler can be registered with the name /admin/foo even though /admin/ resolves to a JSP (and without forcing people to modify the web.xml); and it's what allows us to use the same core path prefixes for both handler requests and the Admin JSPs.

: "It is unnecessary, and potentially problematic, to have the SolrDispatchFilter
: configured to also filter on forwards. Do not configure
: this dispatcher as FORWARD."
:
: The problem is that if filters do not have this FORWARD thing, then
: cross context forwarding doesn't work.
:
: Is there a workaround to this problem?

You can try adding the FORWARD option, but the risk is that SolrRequestFilter could wind up forwarding to itself infinitely on some requests (depending on your configuration)...

http://www.nabble.com/Re%3A-svn-commit%3A-r640449lucene-solr-trunk-src-webapp-src-org-apache-solr-servlet-SolrDispatchFilter.java-p16262766.html

-Hoss
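For reference, adding the FORWARD option means editing the filter-mapping in Solr's web.xml. A hypothetical fragment sketching that change (the standard servlet `dispatcher` element; use with the caveat above about the filter forwarding to itself):

```xml
<!-- Hypothetical web.xml change: also run the dispatch filter on
     forwarded requests.  Risk: the filter may forward to itself. -->
<filter-mapping>
  <filter-name>SolrRequestFilter</filter-name>
  <url-pattern>/*</url-pattern>
  <dispatcher>REQUEST</dispatcher>
  <dispatcher>FORWARD</dispatcher>
</filter-mapping>
```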
Fwd: Distributed Searching - Limitations?
Hi,

I am planning to use Solr's distributed searching for my project. While going through http://wiki.apache.org/solr/DistributedSearch, I found a few limitations. Can anyone please explain the 2nd and 3rd points in the limitations section on that page? The points are:

- When duplicate doc IDs are received, Solr chooses the first doc and discards subsequent ones
- No distributed idf

Thanks.

Regards,
Pooja
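For context on where those limitations bite, a distributed query is just a normal request with a `shards` parameter listing the cores to fan out to. A sketch of building one (host names are placeholders) — the duplicate-ID limitation means that if the same uniqueKey exists in more than one of the listed shards, only the first document received is kept:

```python
from urllib.parse import urlencode

def sharded_query_url(q, shards, base="http://localhost:8983/solr"):
    """Distributed search request: the receiving node fans the query
    out to each shard in the comma-separated 'shards' list and merges
    the results."""
    params = urlencode({"q": q, "shards": ",".join(shards)})
    return f"{base}/select?{params}"
```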