y after an
OOM exception. I upped the Xmx to 2GB and commits are completing much faster
- in the 1-minute range.
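For reference, the fix was just the heap flags. A sketch of what that looks like for a Tomcat-hosted Solr (the `CATALINA_OPTS` variable name assumes Tomcat's standard startup scripts; the GC flag is optional and era-typical, not something from this thread):

```shell
# Illustrative JVM settings for a Tomcat-hosted Solr instance.
# Setting -Xms equal to -Xmx avoids heap-resize pauses; 2g matches
# the change described above. CMS GC choice is an assumption.
export CATALINA_OPTS="-Xms2g -Xmx2g -XX:+UseConcMarkSweepGC"
echo "$CATALINA_OPTS"
```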
Jim
Jim Murphy wrote:
>
> Thanks Jerome,
>
>
> 1. I have shut off autowarming by setting params to 0.
> 2. My JVM Settings: -Xmx1200m -Xms1200m -
est strategy if you've got lots of
> concurrent processes posting.
>
> Cheers.
>
> Jerome.
>
> 2009/10/28 Jim Murphy :
>>
>> Hi All,
>>
>> We have 8 solr shards, index is ~ 90M documents 190GB. :)
>>
>> 4 of the shards have acceptable c
Hi All,
We have 8 solr shards, index is ~ 90M documents 190GB. :)
4 of the shards have acceptable commit time - 30-60 seconds. The other 4
have drifted over the last couple of months to be up around 2-3 minutes. This
is killing our write throughput as you can imagine.
I've included a log dump
Yonik Seeley-2 wrote:
>
> ...your code snippit elided and edited below ...
>
Don't take this code as correct (or even compiling) but is this the essence?
I moved shared access to the writer inside the read lock and kept the other
non-commit bits to the write lock. I'd need to rethink the
> On Thu, May 7, 2009 at 8:37 PM, Jim Murphy wrote:
>> Interesting. So is there a JIRA ticket open for this already? Any chance
>> of
>> getting it into 1.4?
>
> No ticket currently open, but IMO it could make it for 1.4.
>
>> It's seriously kicking our butts r
Interesting. So is there a JIRA ticket open for this already? Any chance of
getting it into 1.4? It's seriously kicking our butts right now. We write
into our masters with ~50ms response times till we hit the autocommit; then
add/update response time is 10-30 seconds. Ouch.
I'd be willing to wo
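For anyone following along, the autocommit threshold being discussed lives in solrconfig.xml. A sketch, with placeholder values (these are not our actual settings):

```xml
<!-- DirectUpdateHandler2 autocommit: whichever limit trips first
     schedules a background commit. Values here are illustrative. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>10000</maxDocs>   <!-- commit after this many pending docs -->
    <maxTime>30000</maxTime>   <!-- or after this many milliseconds -->
  </autoCommit>
</updateHandler>
```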
Question 1: I see in DirectUpdateHandler2 that there is a read/Write lock
used between addDoc and commit.
My mental model of the process was this: clients can add/update documents
until the auto commit threshold was hit. At that point the commit tracker
would schedule a background commit. The
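The read/write lock arrangement described in that mental model can be sketched roughly like this. This is not Solr's actual code - class and member names are invented - just the pattern: adds share the read lock and run concurrently, while commit takes the write lock and excludes all adds for the duration of the flush.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Invented sketch of the locking pattern, not DirectUpdateHandler2 itself.
class UpdateHandlerSketch {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final AtomicInteger pendingDocs = new AtomicInteger();

    void addDoc() {
        lock.readLock().lock();   // many adds proceed concurrently
        try {
            pendingDocs.incrementAndGet();  // stand-in for writer.addDocument(...)
        } finally {
            lock.readLock().unlock();
        }
    }

    int commit() {
        lock.writeLock().lock();  // blocks every addDoc for the whole flush
        try {
            return pendingDocs.getAndSet(0);  // stand-in for the real commit work
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

The pain point in the thread falls out of this shape directly: while commit holds the write lock, every add/update call stalls, which is why response times jump from ~50ms to tens of seconds during long commits.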
Worker.run() @bci=41, line=447
(Interpreted frame)
- java.lang.Thread.run() @bci=11, line=619 (Interpreted frame)
Yonik Seeley-2 wrote:
>
> On Sun, Mar 1, 2009 at 10:32 AM, Jim Murphy wrote:
>> I should have said - tomcat is hosting 2 webapps a solr 1.3 master and
>> slave
>> -
I should have said - tomcat is hosting 2 webapps, a solr 1.3 master and slave
- as separate web apps.
Looking for anything to try.
Jim
Jim Murphy wrote:
>
> I have a 100 thread HTTP connector pool that for some reason ends up with
> all its threads blo
I have a 100 thread HTTP connector pool that for some reason ends up with all
its threads blocked here:
java.net.SocketOutputStream.socketWrite0(Native Method)
java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
java.net.SocketOutputStream.write(SocketOutputStream.java:136)
org.ap
Thanks for the clarification and for untangling my questions. :)
I'm in the process of finding out why our snapshot installs take so long to
commit and didn't feel so confident about my settings, thanks.
In terms of long snapshot commits - I've isolated it to long warming times.
But since the w
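Since the long commits come down to warming, the knobs involved are the cache `autowarmCount` values and any `newSearcher` listener queries in solrconfig.xml. A sketch with illustrative values (not our actual config):

```xml
<!-- Illustrative only. autowarmCount copies entries from the old caches
     into the new searcher, and newSearcher listener queries run before
     the searcher is registered - both add to post-snapshot commit time. -->
<filterCache class="solr.LRUCache" size="16384" initialSize="4096"
             autowarmCount="256"/>
<queryResultCache class="solr.LRUCache" size="16384" initialSize="4096"
                  autowarmCount="0"/>
<listener event="newSearcher" class="solr.QuerySenderListener">
  <arr name="queries">
    <lst><str name="q">a representative warming query</str></lst>
  </arr>
</listener>
```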
I have a cluster of Solr Master/Slaves. We write to the master and replicate
to the slaves via rsync.
Master:
1. Replication is every 5 minutes.
2. Inserting many 100's docs per minute
3. Index is: 23 million documents
4. commits are every 30 seconds
Slave:
1. Pre-warmed after rsync snaps
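For concreteness, the 5-minute pull on a slave in this era is typically just a cron entry driving the rsync-based distribution scripts that ship with Solr 1.x. Paths below are illustrative, not from our install:

```shell
# Illustrative crontab entry for a Solr 1.3 slave - script locations
# vary per installation. snappuller rsyncs the latest snapshot from the
# master; snapinstaller installs it and triggers a commit.
*/5 * * * * /opt/solr/bin/snappuller && /opt/solr/bin/snapinstaller
```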
> couple of decision points for using Solr as opposed
> to using Nutch, or even straight Lucene?
>
> -John
>
>
>
> On Oct 22, 2008, at 11:22 AM, Jim Murphy wrote:
>
>>
>> We index RSS content using our own home grown distributed spiders -
>> not usin
We index RSS content using our own home grown distributed spiders - not using
Nutch. We use ruby processes to do the feed fetching and XML shredding, and
Amazon SQS to queue up work packets to insert into our Solr cluster.
Sorry can't be of more help.
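The work packets described above ultimately become `<add>` XML posted to Solr's `/update` handler. A minimal, hypothetical builder for that payload - field names and helper are invented, not from our actual pipeline:

```java
// Hypothetical helper: turn a fetched RSS item into the <add> XML
// that gets POSTed to Solr's /update handler. Field names invented.
class SolrAddXml {
    static String escape(String s) {
        // order matters: escape '&' before introducing entity ampersands
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static String addDoc(String id, String title) {
        return "<add><doc>"
             + "<field name=\"id\">" + escape(id) + "</field>"
             + "<field name=\"title\">" + escape(title) + "</field>"
             + "</doc></add>";
    }
}
```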
t you expect to do.
Yonik Seeley wrote:
>
> On Mon, Oct 6, 2008 at 2:10 PM, Jim Murphy <[EMAIL PROTECTED]> wrote:
>> We have a farm of several Master-Slave pairs all managing a single very
>> large
>> "logical" index sharded across the master-slaves.
We're seeing strange behavior on one of our slave nodes after replication.
When the new searcher is created we see FileNotFoundExceptions in the log
and the index is strangely invalid/corrupted.
We may have identified the root cause but wanted to run it by the community.
We figure there is a bu
We have a farm of several Master-Slave pairs all managing a single very large
"logical" index sharded across the master-slaves. We notice on the slaves,
after an rsync update, as the index is being committed that all queries are
blocked sometimes resulting in unacceptable service times. I'm look
Thanks, Shalin.
Shalin Shekhar Mangar wrote:
>
> On Wed, Oct 1, 2008 at 12:08 AM, Jim Murphy <[EMAIL PROTECTED]> wrote:
>
>>
>> Question1: Is this the best place to do this?
>
>
> This sounds like a job for
> http://wiki.apache.org/solr/Updat
It may not be all that relevant but our Update handler extends from
DirectUpdateHandler2.
--
View this message in context:
http://www.nabble.com/Calculated-Unique-Key-Field-tp19747955p19748032.html
Sent from the Solr - User mailing list archive at Nabble.com.
My unique key field is an MD5 hash of several other fields that represent
identity of documents in my index. We've been calculating this externally
and setting the key value in documents but have found recurring bugs as the
number and variety of inserting consumers has grown...
So I wanted to mo
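Computing the key server-side, in one place, would look something like this in plain Java. The delimiter and field set are illustrative - the real requirement is just that every writer agrees on both, which is exactly what kept breaking when the hash was computed by each consumer independently:

```java
import java.math.BigInteger;
import java.security.MessageDigest;

// Sketch of the identity-hash idea: concatenate the identity fields
// with an unambiguous delimiter, then MD5 the result. Field choice
// and delimiter are illustrative.
class UniqueKey {
    static String md5Key(String... identityFields) throws Exception {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        for (String f : identityFields) {
            md5.update(f.getBytes("UTF-8"));
            md5.update((byte) 0);  // delimiter so ("ab","c") != ("a","bc")
        }
        // left-pad to the full 32 hex characters
        return String.format("%032x", new BigInteger(1, md5.digest()));
    }
}
```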
*Excellent* so a custom QueryComponent it is.
The Solr score doesn't factor in too much - our search needs are modest -
just does it contain the keyword (or variants, stems etc) or not. So the
query trims down from ~100M to 10-100. That way the more expensive
filtering operates at the smaller
I'm still trying to filter my search results with external data. I have ~100
million documents in the index. I want to use the power of lucene to knock
that index down to 10-100 with keyword searching and a few other regular
query terms. With that smaller subset I'd like to apply a filter based
I'm looking to incorporate an external calculation in Solr/Lucene search
results. I'd like to write queries that filter and sort on the value of
this "virtual field". The value of the field is actually calculated at
runtime based on a remote call to an external system. My Solr queries will
incl
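The two-stage plan above - let Lucene trim ~100M docs to a small candidate list, then apply the external calculation only to that handful - can be sketched like this. The external system is stubbed as a Map lookup standing in for the remote call; all names are invented:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Invented sketch: post-filter and order a small candidate set by a
// score fetched from an external system (stubbed as a Map here).
class ExternalRankFilter {
    static List<String> rankAndFilter(List<String> candidates,
                                      Map<String, Double> externalScore,
                                      double threshold) {
        List<String> kept = new ArrayList<String>();
        for (String id : candidates) {
            Double s = externalScore.get(id);   // stand-in for the remote call
            if (s != null && s >= threshold) kept.add(id);
        }
        // highest external score first
        kept.sort((a, b) -> Double.compare(externalScore.get(b), externalScore.get(a)));
        return kept;
    }
}
```

Because the candidate list is only 10-100 ids, even a per-id remote call stays cheap; doing this over the full index is the order-of-magnitude cost Yonik warns about below.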
> l
>
> wunder
>
> On 7/29/08 5:59 PM, "Jim Murphy" <[EMAIL PROTECTED]> wrote:
>
>>
>> If figured that it would be - but the rankings are dynamically
>> calculated.
>> I'd like to limit the number of calculations performed for this very
Thanks...
Jim
Yonik Seeley wrote:
>
> Calling out will be an order of magnitude (or two) slower compared to
> moving the rankings into Solr, but it is doable. See ValueSource
> (it's used by FunctionQuery).
>
> -Yonik
>
> On Tue, Jul 29, 2008 at 8:23 PM, Jim Murphy
-Yonik
>
> On Tue, Jul 29, 2008 at 7:08 PM, Jim Murphy <[EMAIL PROTECTED]> wrote:
>>
>> I need to store 100 million documents in our Solr instance and be able to
>> retrieve them with simple term queries - keyword matches. I'm NOT
>> implementing a sea
I need to store 100 million documents in our Solr instance and be able to
retrieve them with simple term queries - keyword matches. I'm NOT
implementing a search application where documents are scored and
ranked...they either match the keywords or not. Also, I have an external
ranking system tha