Re: highlight exception

2007-02-20 Thread nick19701
Mike Klaas wrote: > nick, It looks as though there is a bug in the synonym filter. Since you are using Solr's example synonym list, perhaps it would be sufficient to remove that from your analyzer chain (schema.xml)? At least that would prevent crashes until the bug is fixed.

Re: highlight search keywords on html page

2007-02-20 Thread nick19701
Chris Hostetter wrote: > I'm not really sure that Solr can help you in this case ... it only knows about the data you give it -- if you want it to highlight the raw html of the entire page, then you're going to need to store the raw html of the entire page in the index. You can still

Re: solr performance

2007-02-20 Thread Erik Hatcher
You could build your index using Lucene directly and then point a Solr instance at it once it's built. My suspicion is that the overhead of forming a document as an XML string and posting to Solr via HTTP won't be that much different than indexing with Lucene directly. My largest Solr ind
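For anyone following along, here is a rough sketch of what "forming a document as an XML string and posting to Solr via HTTP" can look like from a client. The update URL is only the example default and the field names are made up for illustration; both are assumptions, not something stated in the thread. The request is built but not sent, so the sketch runs without a server.

```python
import urllib.request
from xml.sax.saxutils import escape, quoteattr

# Assumption: a Solr instance at the example default address.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_add_message(fields):
    """Form one document as a Solr <add> XML string from (name, value) pairs."""
    body = "".join(
        "<field name=%s>%s</field>" % (quoteattr(name), escape(value))
        for name, value in fields
    )
    return "<add><doc>%s</doc></add>" % body


def build_update_request(xml_message, url=SOLR_UPDATE_URL):
    """Wrap an XML update message in an HTTP POST request (not sent here)."""
    return urllib.request.Request(
        url,
        data=xml_message.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


# Hypothetical field names, for illustration only.
message = build_add_message([("id", "book-1"), ("title", "Cats & Dogs")])
request = build_update_request(message)

# Sending it requires a running Solr instance:
# urllib.request.urlopen(request).read()
```

The per-document cost on the client side is essentially string building plus one HTTP round trip, which is the overhead being compared against direct Lucene indexing.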

AW: solr performance

2007-02-20 Thread Burkamp, Christian
I do agree. There's probably no need to go to the index directly. My current solr test server has more than 5M documents and a size of about 60GB. I still index at 13 docs per second and this still includes filtering of the documents. (If you have your content ready in XML format performance will

Multiple entries in a field

2007-02-20 Thread Stefano Nicolai
Hi all. I'm trying Solr for the first time, and I find it simply amazing, cheers to the developers! While I was writing the XML creator file, I came up with a question: I work for an online shop, and we sell books. This particular product can have multiple authors / translators in its "a

Re: Multiple entries in a field

2007-02-20 Thread Yonik Seeley
On 2/20/07, Stefano Nicolai <[EMAIL PROTECTED]> wrote: I'm worried that simply pushing the information about the authors (e.g. "Stephen Rich" and "Karl King") into this field could give bad answers to my queries (e.g. returning "Stephen King" as a positive match), basically mixing up the words
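One way to keep the two authors apart on the indexing side is to send each author as its own field entry instead of one concatenated string. The sketch below only builds the update message; it assumes the schema declares a multi-valued field, and the name `author` is hypothetical. Whether a query for "Stephen King" still matches depends on the query type and on how the field type is analyzed, so treat this as a starting point rather than a complete answer.

```python
import xml.etree.ElementTree as ET


def build_book_doc(book_id, title, authors):
    """Build an <add> message with one <field name="author"> per author."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    ET.SubElement(doc, "field", name="id").text = book_id
    ET.SubElement(doc, "field", name="title").text = title
    # Assumption: "author" is declared as a multi-valued field in schema.xml.
    for author in authors:
        ET.SubElement(doc, "field", name="author").text = author
    return add


add = build_book_doc("book-1", "Example Title", ["Stephen Rich", "Karl King"])
xml_message = ET.tostring(add, encoding="unicode")
```

Each author value then arrives at Solr as a separate entry, rather than as a single string in which "Rich" and "Karl" sit next to each other.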

Re: AW: solr performance

2007-02-20 Thread Walter Underwood
Indexing rates depend heavily on document size (text) and pre-indexing processing. Other things probably matter, too, like number of fields. My application is indexing 20X faster than Christian's, because I have small documents (a few hundred bytes) that are extracted from an RDBMS and submitted i

Re[2]: solr performance

2007-02-20 Thread Jack L
Thanks to all who replied. It's encouraging :) The numbers vary quite a bit though, from 13 docs/s (Burkamp) to 250 docs/s (Walter) to 1000 docs/s. I understand the results also depend on the doc size and hardware. I have a question for Erik: you mentioned "single threaded indexer" (below). I'm no

Re[2]: solr performance

2007-02-20 Thread Chris Hostetter
: The numbers vary quite a bit though, from 13 docs/s (Burkamp) to 250 docs/s (Walter) to 1000 docs/s. I understand the results also depend on the doc size and hardware. It also depends a lot on how much analysis you do of each field ... and that doesn't even begin to get to the issue of what ki

retrieve document boost

2007-02-20 Thread Ryan McKinley
I'm trying to make sure the document boost I'm sending in is actually getting used. I don't see it showing up anywhere. To illustrate, I augmented the BasicFunctionalityTest.testDocBoost() with: public void testDocBoost() throws Exception { ... LocalSolrQueryRequest lqr = lrf.makeRequest(
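For reference, this is roughly how a document boost and a field boost are sent in an XML update message: as `boost` attributes on `<doc>` and `<field>`. The sketch is client-side only and is not the test code from the message above; the field names and boost values are invented for illustration.

```python
import xml.etree.ElementTree as ET


def build_boosted_doc(fields, doc_boost=None, field_boosts=None):
    """Build an <add> message, optionally setting boost attributes."""
    field_boosts = field_boosts or {}
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    if doc_boost is not None:
        doc.set("boost", str(doc_boost))  # index-time document boost
    for name, value in fields:
        field = ET.SubElement(doc, "field", name=name)
        if name in field_boosts:
            field.set("boost", str(field_boosts[name]))  # per-field boost
        field.text = value
    return ET.tostring(add, encoding="unicode")


# Hypothetical fields and boost values.
message = build_boosted_doc(
    [("id", "doc-1"), ("title", "hello")],
    doc_boost=2.5,
    field_boosts={"title": 2.0},
)
```

This only shows what goes over the wire; checking that the boost had an effect has to happen on the scoring side.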

Re: Re[2]: solr performance

2007-02-20 Thread Walter Underwood
Try running your submits while watching a CPU load meter. Do this on a multi-CPU machine. If all CPUs are busy, you are running as fast as possible. If one CPU is busy (around 50% usage on a dual-CPU system), parallel submits might help. If no CPU is 100% busy, the bottleneck is probably disk or
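If the load meter shows only one CPU busy, parallel submits can be tried with a small thread pool. This is a sketch under assumptions: `submit_batch` is a hypothetical stand-in that just counts documents, where a real client would post each batch to Solr, and the batch size and worker count are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split a list into consecutive batches of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def submit_batch(batch):
    """Hypothetical stand-in for posting one batch; returns documents sent."""
    return len(batch)


def index_in_parallel(docs, batch_size=100, workers=2, submit=submit_batch):
    """Submit batches from several threads and return the total submitted."""
    batches = chunked(docs, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(submit, batches))


docs = ["doc-%d" % i for i in range(250)]
indexed = index_in_parallel(docs)
```

Raising `workers` while watching the CPU meter is one way to see whether the submitting client or the server is the limit.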

Re: retrieve document boost

2007-02-20 Thread Chris Hostetter
: Can you get the boost of an indexed document? Am I missing something basic? Is the stored document boost lost once it is indexed? Bingo. In Lucene, Document boosts aren't stored in the docs for later recovery - the getBoost method is meaningless from a Document returned by a search (or re

Re: retrieve document boost

2007-02-20 Thread Brian Whitman
On Feb 20, 2007, at 2:59 PM, Chris Hostetter wrote: In Lucene, Document boosts aren't stored in the docs for later recovery - the getBoost method is meaningless from a Document returned by a search (or retrieved from an IndexReader). Boosts are folded into the fieldNorm - doc boosts are fold
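A simplified model of why the boost cannot be read back: the norm is a single number that multiplies the boosts with a length factor. The sketch assumes the default-style length normalization of 1/sqrt(number of terms) and ignores that Lucene additionally compresses the value into one byte, so it illustrates the idea and is not Lucene's actual code.

```python
import math


def length_norm(num_terms):
    """Assumed default-style length normalization: 1 / sqrt(term count)."""
    return 1.0 / math.sqrt(num_terms)


def field_norm(doc_boost, field_boost, num_terms):
    """Fold document boost, field boost and length norm into one value."""
    return doc_boost * field_boost * length_norm(num_terms)


# A boosted document with a longer field...
boosted_long = field_norm(doc_boost=2.0, field_boost=1.0, num_terms=16)
# ...and an unboosted document with a shorter field.
plain_short = field_norm(doc_boost=1.0, field_boost=1.0, num_terms=4)
```

Both calls give the same norm, so the original boost cannot be recovered from the stored value alone.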

Re: retrieve document boost

2007-02-20 Thread Ryan McKinley
On 2/20/07, Chris Hostetter <[EMAIL PROTECTED]> wrote: : Can you get the boost of an indexed document? Am I missing something basic? Is the stored document boost lost once it is indexed? Bingo. In Lucene, Document boosts aren't stored in the docs for later recovery - the getBoost method i

Re: Re[2]: solr performance

2007-02-20 Thread Erik Hatcher
On Feb 20, 2007, at 1:46 PM, Jack L wrote: The numbers vary quite a bit though, from 13 docs/s (Burkamp) to 250 docs/s (Walter) to 1000 docs/s. I understand the results also depend on the doc size and hardware. My number 1000 was per minute, not per second! However, I've done a few runs toda
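Putting the three quoted figures on the same scale, with the per-minute correction applied. This uses only the numbers mentioned in the thread; the newer runs referred to above are not included, and the figures are not comparable anyway since document size and hardware differ.

```python
SECONDS_PER_MINUTE = 60


def docs_per_second(docs_per_minute):
    """Convert an indexing rate from docs/minute to docs/second."""
    return docs_per_minute / SECONDS_PER_MINUTE


# Rates as quoted in the thread, all in docs/second after conversion.
rates = {
    "Burkamp": 13.0,
    "Walter": 250.0,
    "Erik": docs_per_second(1000),  # originally reported per minute
}
slowest = min(rates, key=rates.get)
```

After the correction the first and third figures are in the same range, rather than two orders of magnitude apart.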

Re: retrieve document boost

2007-02-20 Thread Mike Klaas
On 2/20/07, Brian Whitman <[EMAIL PROTECTED]> wrote: On Feb 20, 2007, at 2:59 PM, Chris Hostetter wrote: > In Lucene, Document boosts aren't stored in the docs for later recovery - the getBoost method is meaningless from a Document returned by a search (or retrieved from an IndexReader)

Re: retrieve document boost

2007-02-20 Thread Brian Whitman
If omitNorms=false on a field, the document boost and field boost are moot. Unfortunate, but true. Hm? The reason I have omitNorms=false on a field is that I want the boost. I did so because I read here: http://wiki.apache.org/solr/UpdateXmlMessages "NOTE: make sure norms are enabled

Re: retrieve document boost

2007-02-20 Thread Mike Klaas
On 2/20/07, Brian Whitman <[EMAIL PROTECTED]> wrote: > If omitNorms=false on a field, the document boost and field boost are moot. Unfortunate, but true. Hm? The reason I have omitNorms=false on a field is that I want the boost. I did so because I read here: http://wiki.apache.org/solr

Re: retrieve document boost

2007-02-20 Thread Brian Whitman
On Feb 20, 2007, at 4:09 PM, Mike Klaas wrote: Sorry, I inverted the logic in my head. You're right. You're not alone, I do it all the time too. I believe that docBoost does not translate into a fieldBoost display (which is only the query-time boost), but is factored into the fieldNorm. O

Re: retrieve document boost

2007-02-20 Thread Ken Krugler
: Can you get the boost of an indexed document? Am I missing something basic? Is the stored document boost lost once it is indexed? Bingo. In Lucene, Document boosts aren't stored in the docs for later recovery - the getBoost method is meaningless from a Document returned by a search (or re