On Mon, Sep 13, 2010 at 6:29 PM, Burton-West, Tom wrote:
> Thanks Robert and everyone!
>
> I'm working on changing our JVM settings today, since putting Solr 1.4.1 into
> production will take a bit more work and testing. Hopefully, I'll be able to
> test the setTermIndexDivisor on our test serv
o see if we can provide you with our tii/tis
data. I'll let you know as soon as I hear anything.
Tom
-----Original Message-----
From: Robert Muir [mailto:rcm...@gmail.com]
Sent: Sunday, September 12, 2010 10:48 AM
To: solr-user@lucene.apache.org; simon.willna...@gmail.com
Subject: Re: Solr
On Sun, Sep 12, 2010 at 9:57 AM, Simon Willnauer <
simon.willna...@googlemail.com> wrote:
> > To change the divisor in your solrconfig, for example to 4, it looks like
> > you need to do this.
> >
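> > <indexReaderFactory name="IndexReaderFactory"
> >     class="org.apache.solr.core.StandardIndexReaderFactory">
> >   <int name="setTermIndexDivisor">4</int>
> > </indexReaderFactory>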
>
> Ah, thanks Robert
On Sun, Sep 12, 2010 at 12:42 PM, Robert Muir wrote:
> On Sat, Sep 11, 2010 at 7:51 PM, Michael McCandless <
> luc...@mikemccandless.com> wrote:
>
>> On Sat, Sep 11, 2010 at 11:07 AM, Burton-West, Tom
>> wrote:
>> > Is there an example of how to set up the divisor parameter in
>> solrconfig.xml
On Sat, Sep 11, 2010 at 7:51 PM, Michael McCandless <
luc...@mikemccandless.com> wrote:
> On Sat, Sep 11, 2010 at 11:07 AM, Burton-West, Tom
> wrote:
> > Is there an example of how to set up the divisor parameter in
> solrconfig.xml somewhere?
>
> Alas I don't know how to configure terms index d
One thing that the Codec API makes possible ("in theory", anyway)...
is variable gap terms index.
I.e., Lucene today makes an indexed term at regular intervals (every N
terms; N is 128 in 3.x, 32 in 4.0).
But this is rather silly. Imagine the terms you are going through are
all singletons (happen only in o
On Sun, Sep 12, 2010 at 1:51 AM, Michael McCandless
wrote:
> On Sat, Sep 11, 2010 at 11:07 AM, Burton-West, Tom wrote:
>> Is there an example of how to set up the divisor parameter in
>> solrconfig.xml somewhere?
>
> Alas I don't know how to configure terms index divisor from Solr...
You can s
On Sat, Sep 11, 2010 at 11:07 AM, Burton-West, Tom wrote:
> Is there an example of how to set up the divisor parameter in solrconfig.xml
> somewhere?
Alas I don't know how to configure terms index divisor from Solr...
>>>In 4.0, w/ flex indexing, the RAM efficiency is much better -- we use lar
There is a trick: facets with only one occurrence tend to be misspellings
or dirt. You write a program to fetch the terms (Lucene's CheckIndex is
a great starting point) and create a stopwords file from them.
Here's a data mining project: which languages are more vulnerable to
dirty OCR?
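
A rough sketch of the singleton-term trick described above. It does not call Lucene at all: it assumes the terms and their document frequencies have already been dumped into a plain dict by some other tool, and the function names (singleton_terms, render_stopwords) are made up for illustration.

```python
def singleton_terms(doc_freqs):
    """Return, in sorted order, the terms that occur in exactly one document.

    doc_freqs is an assumed input: a dict mapping term -> document frequency,
    produced by whatever tool was used to dump the index's terms.
    """
    return sorted(term for term, df in doc_freqs.items() if df == 1)


def render_stopwords(terms):
    """Render the terms one per line, the usual layout of a stopwords file."""
    return "".join(term + "\n" for term in terms)


# Toy data: two likely OCR errors ("tbe", "lndex") next to the real words.
sample = {"the": 900, "tbe": 1, "index": 40, "lndex": 1}
print(render_stopwords(singleton_terms(sample)), end="")
```

Whether every singleton is really dirt is a judgment call; a rare but valid term would be filtered out too, so the resulting list is worth reviewing before use.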
Burton-West, Tom w
Thanks Mike,
>>Do you use a terms index divisor? Setting that to 2 would halve the
>>amount of RAM required but double (on average) the seek time to locate
>>a given term (but, depending on your queries, that seek time may still
>>be a negligible part of overall query time, ie the tradeoff could
Unfortunately, the terms index (before 4.0) is not RAM efficient -- I
wrote about this here:
http://chbits.blogspot.com/2010/07/lucenes-ram-usage-for-searching.html
Every indexed term that's loaded into RAM creates 4 objects (TermInfo,
Term, String, char[]), as you see in your profiler output
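
To make the divisor tradeoff discussed in this thread concrete, here is a minimal sketch of a sampled terms index. It is a toy model, not Lucene's implementation: the function names are hypothetical, the "on-disk" term list is just a sorted Python list, and the default interval of 128 is the 3.x value mentioned in the thread.

```python
import bisect


def build_terms_index(sorted_terms, interval=128, divisor=1):
    """Keep every (interval * divisor)-th term in RAM as (term, position)."""
    step = interval * divisor
    return [(sorted_terms[i], i) for i in range(0, len(sorted_terms), step)]


def lookup(sorted_terms, index, term, step):
    """Binary-search the in-RAM sample, then scan forward at most `step` terms.

    Returns (position or None, number of terms scanned).
    """
    keys = [t for t, _ in index]
    slot = bisect.bisect_right(keys, term) - 1
    if slot < 0:
        return None, 0  # term sorts before the first indexed term
    start = index[slot][1]
    scanned = 0
    for pos in range(start, min(start + step, len(sorted_terms))):
        scanned += 1
        if sorted_terms[pos] == term:
            return pos, scanned
    return None, scanned


terms = [f"term{i:07d}" for i in range(100_000)]
for divisor in (1, 2, 4):
    index = build_terms_index(terms, interval=128, divisor=divisor)
    _, scanned = lookup(terms, index, "term0000255", 128 * divisor)
    print(divisor, len(index), scanned)
```

In this model, going from divisor 1 to divisor 2 halves the number of terms held in RAM (782 versus 391 entries for 100,000 terms) while doubling the worst-case forward scan, which is the same shape of tradeoff described in the quoted message.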