Mike Klaas wrote:
>
> nick,
>
> It looks as though there is a bug in the synonym filter. Since you
> are using Solr's example synonym list, perhaps it would be sufficient
> to remove that from your analyzer chain (schema.xml)? At least that
> would prevent crashes until the bug is fixed.
>
>
Chris Hostetter wrote:
>
> i'm not really sure that Solr can help you in this case ... it only knows
> about the data you give it -- if you want it to highlight the raw html of
> the entire page, then you're going to need to store the raw html of the
> entire page in the index.
>
> you can still
You could build your index using Lucene directly and then point a
Solr instance at it once it's built. My suspicion is that the
overhead of forming a document as an XML string and posting to Solr
via HTTP won't be that much different than indexing with Lucene
directly.
My largest Solr ind
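For readers unfamiliar with the "document as an XML string" step mentioned above, here is a minimal sketch in Python of building a Solr `<add>` update message. The field names (`id`, `title`) and the helper name `build_add_message` are hypothetical; the optional `boost` attribute on `<doc>` follows the format described on the UpdateXmlMessages wiki page. Posting the resulting string to a Solr update URL over HTTP is left out, since the URL depends on the installation.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs, doc_boost=None):
    """Build a Solr <add> update message from a list of dicts.

    `docs` maps field names to values (hypothetical names, chosen by the
    caller). `doc_boost`, if given, is written as the boost attribute of
    every <doc> element.
    """
    add = ET.Element("add")
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        if doc_boost is not None:
            doc.set("boost", str(doc_boost))
        for name, value in fields.items():
            # A list value becomes repeated <field> elements,
            # which is how a multi-valued field is submitted.
            values = value if isinstance(value, (list, tuple)) else [value]
            for v in values:
                field = ET.SubElement(doc, "field", name=name)
                field.text = str(v)
    # ElementTree escapes characters such as & and < in the field text.
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(build_add_message([{"id": "1", "title": "Solr & Lucene"}], doc_boost=2.5))
```

Building the message with an XML library rather than string concatenation avoids escaping mistakes in field values.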
I do agree. There's probably no need to go to the index directly.
My current solr test server has more than 5M documents and a size of about 60GB.
I still index at 13 docs per second and this still includes filtering of the
documents.
(If you have your content ready in XML format performance will
Hi all.
I'm trying Solr for the first time, and i find it simply amazing, cheers
to the developers!
While i was writing down the XML creator file, i came up with a
question: i work for an online shop, and we sell books.
This particular product can have multiple authors / translators in its
"a
On 2/20/07, Stefano Nicolai <[EMAIL PROTECTED]> wrote:
I'm worried that simply pushing the authors' details into this field
(i.e. "Stephen Rich" and "Karl King") could give bad answers
to my queries (i.e. returning "Stephen King" as a positive match),
basically mixing up the words
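The concern above can be illustrated with a small simulation. This is a sketch in Python, not Solr's implementation: it assumes each author is submitted as a separate value of a multi-valued field and that a position gap is left between values (Solr field types expose a `positionIncrementGap` attribute for this purpose). The function names and the gap value of 100 are illustrative assumptions. Note that it only covers phrase queries; a query for the two separate terms Stephen and King would still match.

```python
def index_positions(values, gap=100):
    """Assign a token position to every word of a multi-valued field.

    A gap is added after each value so that tokens from different
    values never end up at adjacent positions.
    """
    positions = {}
    pos = 0
    for value in values:
        for token in value.lower().split():
            positions.setdefault(token, []).append(pos)
            pos += 1
        pos += gap
    return positions


def phrase_match(positions, phrase):
    """Return True if the words of `phrase` occur at consecutive positions."""
    tokens = phrase.lower().split()
    if not tokens:
        return False
    for start in positions.get(tokens[0], []):
        if all(start + i in positions.get(tok, []) for i, tok in enumerate(tokens)):
            return True
    return False


if __name__ == "__main__":
    authors = index_positions(["Stephen Rich", "Karl King"])
    print(phrase_match(authors, "Stephen Rich"))
    print(phrase_match(authors, "Stephen King"))
```

With a gap of 0 the phrase "Rich Karl" would match across the two values; with a gap larger than any expected phrase slop it does not.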
Indexing rates depend heavily on document size (text) and pre-indexing
processing. Other things probably matter, too, like number of fields.
My application is indexing 20X faster than Christian's, because I have
small documents (a few hundred bytes) that are extracted from an RDBMS
and submitted i
Thanks to all who replied. It's encouraging :)
The numbers vary quite a bit though, from 13 docs/s (Burkamp)
to 250 docs/s (Walter) to 1000 docs/s. I understand the results also depend
on the doc size and hardware.
I have a question for Erik: you mentioned "single threaded indexer"
(below). I'm no
: The numbers vary quite a bit though, from 13 docs/s (Burkamp)
: to 250 docs/s (Walter) to 1000 docs/s. I understand the results also depend
: on the doc size and hardware.
It also depends a lot on how much analysis you do of each field ... and
that doesn't even begin to get to the issue of what ki
I'm trying to make sure the document boost i'm sending in is actually
getting used. I don't see it showing up anywhere. To illustrate, I
augmented the BasicFunctionalityTest.testDocBoost() with:
public void testDocBoost() throws Exception {
...
LocalSolrQueryRequest lqr = lrf.makeRequest(
Try running your submits while watching a CPU load meter.
Do this on a multi-CPU machine.
If all CPUs are busy, you are running as fast as possible.
If one CPU is busy (around 50% usage on a dual-CPU system),
parallel submits might help.
If no CPU is 100% busy, the bottleneck is probably disk
or
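The "parallel submits" idea above can be sketched as follows in Python. This is an assumption-laden illustration, not a recommended client: `submit` is a hypothetical callable standing in for whatever posts one batch of documents to Solr, and the worker count of 4 is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def submit_all(batches, submit, workers=4):
    """Send each batch with `submit` using a pool of worker threads.

    `submit` is a caller-supplied function (hypothetical here) that
    posts one batch and returns a result. Results come back in the
    same order as `batches`.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(submit, batches))


if __name__ == "__main__":
    # Stand-in for a real HTTP post: just report the batch size.
    print(submit_all([["doc1", "doc2"], ["doc3"]], submit=len, workers=2))
```

Threads suit this case because each submit spends most of its time waiting on the HTTP round trip, so several requests can be in flight while the server works on multiple CPUs.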
: Can you get the boost of an indexed document? Am I missing something
: basic? Is the stored document boost lost once it is indexed?
Bingo. In Lucene, Document boosts aren't stored in the docs for later
recovery - the getBoost method is meaningless for a Document returned by
a search (or re
On Feb 20, 2007, at 2:59 PM, Chris Hostetter wrote:
In Lucene, Document boosts aren't stored in the docs for later
recovery - the getBoost method is meaningless for a Document
returned by a search (or retrieved from an IndexReader)
Boosts are folded into the fieldNorm - doc boosts are fold
On 2/20/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:
: Can you get the boost of an indexed document? Am I missing something
: basic? Is the stored document boost lost once it is indexed?
Bingo. In Lucene, Document boosts aren't stored in the docs for later
recovery - the getBoost method i
On Feb 20, 2007, at 1:46 PM, Jack L wrote:
The numbers vary quite a bit though, from 13 docs/s (Burkamp)
to 250 docs/s (Walter) to 1000 docs/s. I understand the results also
depend on the doc size and hardware.
my number 1000 was per minute, not second! however, i've done a few
runs toda
On 2/20/07, Brian Whitman <[EMAIL PROTECTED]> wrote:
On Feb 20, 2007, at 2:59 PM, Chris Hostetter wrote:
> In Lucene, Document boosts aren't stored in the docs for later
> recovery - the getBoost method is meaningless for a Document
> returned by a search (or retrieved from an IndexReader)
>
If omitNorms=false on a field, the document boost and field boost are
moot. Unfortunate, but true.
hm? The reason I have omitNorms=false for a field is because I
want the boost. I did so because I read here:
http://wiki.apache.org/solr/UpdateXmlMessages
"NOTE: make sure norms are enabled
On 2/20/07, Brian Whitman <[EMAIL PROTECTED]> wrote:
> If omitNorms=false on a field, the document boost and field boost are
> moot. Unfortunate, but true.
hm? The reason I have omitNorms=false for a field is because I
want the boost. I did so because I read here:
http://wiki.apache.org/solr
On Feb 20, 2007, at 4:09 PM, Mike Klaas wrote:
Sorry, I inverted the logic in my head. You're right.
You're not alone, I do it all the time too.
I believe that docBoost does not translate into a fieldBoost display
(which is only the query-time boost), but is factored into the
fieldNorm.
O
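To make the "factored into the fieldNorm" point concrete, here is a rough sketch in Python of how the index-time norm is commonly described for Lucene's default similarity: the document boost times the field boost times a length factor of 1/sqrt(number of terms in the field). The function name is hypothetical, and the sketch ignores the fact that Lucene stores the norm in a single byte, which loses precision. It is an illustration of the arithmetic, not Lucene's code.

```python
import math


def field_norm(doc_boost, field_boost, num_terms):
    """Approximate index-time norm for one field of one document.

    Combines the document boost, the field boost and a length
    normalization of 1/sqrt(num_terms). Real Lucene additionally
    encodes this value into one byte, so stored norms are coarser.
    """
    if num_terms <= 0:
        raise ValueError("num_terms must be positive")
    return doc_boost * field_boost * (1.0 / math.sqrt(num_terms))


if __name__ == "__main__":
    # Same 4-term field, with and without a document boost of 2.0.
    print(field_norm(1.0, 1.0, 4))
    print(field_norm(2.0, 1.0, 4))
```

Because the three factors are multiplied into one number, the original document boost cannot be read back from the index, which is why it shows up only through the fieldNorm in score explanations and why the field must have norms enabled for the boost to have any effect.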