Re: overseer queue clogged

2013-03-22 Thread Gary Yngve
Thanks, Mark!

The core node names in solr.xml in Solr 4.2 are great!  Maybe in 4.3 they
can be set via the API?

Also, I'm glad you mentioned in another post the option to namespace
ZooKeeper by appending a chroot path to the comma-delimited zk host string.
That works out really well in our situation, where one zk ensemble serves
multiple Amazon environments that go up and down independently of each
other -- no issues with a shared clusterstate.json or overseers.

Regarding our original problem, we were able to restart all of our shards
but one, which wasn't getting past:
Mar 20, 2013 5:12:54 PM org.apache.solr.common.cloud.ZkStateReader$2 process
INFO: A cluster state change has occurred - updating...
Mar 20, 2013 5:12:54 PM org.apache.zookeeper.ClientCnxn$EventThread
processEvent
SEVERE: Error while calling watcher
java.lang.NullPointerException
at
org.apache.solr.common.cloud.ZkStateReader$2.process(ZkStateReader.java:201)
at
org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:526)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)

We ended up upgrading to Solr 4.2 and rebuilding the whole index from our
datastore.

-Gary


On Sat, Mar 16, 2013 at 9:51 AM, Mark Miller  wrote:

> Yeah, I don't know that I've ever tried with 4.0, but I've done this with
> 4.1 and 4.2.
>
> - Mark
>
> On Mar 16, 2013, at 12:19 PM, Gary Yngve  wrote:
>
> > Cool, I'll need to try this.  I could have sworn that it didn't work that
> > way in 4.0, but maybe my test was bunk.
> >
> > -g
> >
> >
> > On Fri, Mar 15, 2013 at 9:41 PM, Mark Miller 
> wrote:
> >>
> >> You can do this - just modify your starting Solr example to have no
> cores
> >> in solr.xml. You won't be able to make use of the admin UI until you
> create
> >> at least one core, but the core and collection apis will both work fine.
>
>


doc cache issues... query-time way to bypass cache?

2013-03-22 Thread Gary Yngve
I have a situation we just discovered in Solr 4.2: there are previously
cached results from a query with a limited field list, and when we query
again for the whole field list, the response differs depending on which
shard gets the query (no extra replicas).  It returns the document with
either the limited field list or the full field list.

We're releasing tonight, so is there a query param to selectively bypass
the cache that I can use as a temporary fix?

Thanks,
Gary


Re: doc cache issues... query-time way to bypass cache?

2013-03-23 Thread Gary Yngve
Sigh, user error.

I missed this in the 4.1 release notes:

Collections that do not specify numShards at collection creation time use
custom sharding and default to the "implicit" router. Document updates
received by a shard will be indexed to that shard, unless a "shard"
parameter or document field names a different shard.
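
For anyone else hitting this: presumably the fix is just to pass numShards
at creation time so the default hash-based router is used instead -- along
these lines (collection name and counts are illustrative):

    curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=production_things&numShards=4&replicationFactor=2'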


On Fri, Mar 22, 2013 at 3:39 PM, Gary Yngve  wrote:

> I have a situation we just discovered in solr4.2 where there are
> previously cached results from a limited field list, and when querying for
> the whole field list, it responds differently depending on which shard gets
> the query (no extra replicas).  It either returns the document on the
> limited field list or the full field list.
>
> We're releasing tonight, so is there a query param to selectively bypass
> the cache, which I can use as a temp fix?
>
> Thanks,
> Gary
>


Re: incorrect solr update behavior

2013-01-14 Thread Gary Yngve
Of course, as soon as I post this, I discover this:

https://issues.apache.org/jira/browse/SOLR-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13537900#comment-13538174

I'll give this patch a spin in the morning.

(this is not an example of how to use antecedents :))
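
In the meantime, for anyone else poking at this: the same atomic update
expressed via the JSON update handler looks roughly like the following
(field names reuse the ones from the XML below, the id value is assumed,
and I haven't verified whether the JSON path hits the same bug):

    curl 'http://localhost:8983/solr/update?commit=true' \
      -H 'Content-type:application/json' -d '
      [{"id":"foo", "tags_ss":{"add":["qux","quux"]}}]'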

-g


On Mon, Jan 14, 2013 at 6:27 PM, Gary Yngve  wrote:

> Posting this
>
>  <field name="..." update="set">blah</field>
>  <field name="tags_ss" update="add">qux</field>
>  <field name="tags_ss" update="add">quux</field>
>  <field name="...">foo</field>
>
> to an existing doc with foo and bar tags
> results in tags_ss containing
> 
> {add=qux}
> {add=quux}
> 
>
> whereas posting this
>
>  <field name="..." update="set">blah</field>
>  <field name="tags_ss" update="add">qux</field>
>  <field name="...">foo</field>
>
> results in the expected behavior:
> 
> foo
> bar
> qux
> 
>
> Any ideas?
>
> Thanks,
> Gary
>


solr4.1 createNodeSet requires ip addresses?

2013-02-15 Thread Gary Yngve
Hi all,

I've been unable to get the collections create API to work with
createNodeSet containing hostnames, both localhost and external hostnames.
 I've only been able to get it working when using explicit IP addresses.

It looks like zk stores the IP addresses in clusterstate.json and
live_nodes.  Is it possible that SolrCloud is not doing any hostname
resolution but just looking for an exact match with createNodeSet?  This is
kind of annoying, in that I am working with EC2 instances and consider it
pretty lame to need elastic IPs for internal use.  I'm hacking around it
for now (looking up the eth0 inet addr on each machine), but I'm not happy
about it.
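
For the record, the hack looks roughly like this (collection name is
illustrative; the node names in createNodeSet have to match what shows up
in live_nodes, i.e. ip:port_solr):

    # same as grepping eth0's inet addr, just via the EC2 metadata service
    IP=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
    curl "http://localhost:8883/solr/admin/collections?action=CREATE&name=mycoll&numShards=1&createNodeSet=${IP}:8883_solr"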

Has anyone else found a better solution?

The reason I want to specify explicit nodes for collections is so I can
have just one zk ensemble managing collections across different
environments that will go up and down independently of each other.

Thanks,
Gary


Re: How to use shardId

2013-02-20 Thread Gary Yngve
The param in solr.xml should be "shard", not "shardId".  I tripped over
this too.
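
i.e. in the old-style solr.xml, something along these lines (names made up):

    <cores adminPath="/admin/cores" hostPort="8983">
      <core name="mycoll_shard1_replica1" instanceDir="mycoll_shard1_replica1"
            collection="mycoll" shard="shard1"/>
    </cores>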

-g



On Mon, Jan 14, 2013 at 7:01 AM, starbuck wrote:

> Hi all,
>
> I am trying to set up a SolrCloud cluster with 2 collections, each with 4
> shards and 2 replicas, hosted by 4 Solr instances. If the numShards param
> is set to 4 and all Solr instances are started one after another, it seems
> to work fine.
>
> What I want to do now is remove numShards from JAVA_OPTS and define each
> core with a "shardId". Here is my current solr.xml for the first and second
> Solr instances (the second has a different instanceDir; the rest is the
> same):
>
>
>
> Here is solr.xml of the third and fourth solr instance:
>
>
>
> But it seems that Solr doesn't accept the shardId, or ignores it. What I
> actually get is 2 collections, each with 2 shards and 8 replicas (2 per
> Solr instance).
> Either the functionality is not really clear to me or there is a config
> error somewhere.
>
> It would be very helpful if anyone could give me a hint.
>
> Thanks.
> starbuck
>
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/How-to-use-shardId-tp4033186.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: overseer queue clogged

2013-03-15 Thread Gary Yngve
Sorry, should have specified.  4.1




On Fri, Mar 15, 2013 at 4:33 PM, Mark Miller  wrote:

> What Solr version? 4.0, 4.1 4.2?
>
> - Mark
>
> On Mar 15, 2013, at 7:19 PM, Gary Yngve  wrote:
>
> > my solr cloud has been running fine for weeks, but about a week ago, it
> > stopped dequeueing from the overseer queue, and now there are thousands
> of
> > tasks on the queue, most of which look like
> >
> > {
> >  "operation":"state",
> >  "numShards":null,
> >  "shard":"shard3",
> >  "roles":null,
> >  "state":"recovering",
> >  "core":"production_things_shard3_2",
> >  "collection":"production_things",
> >  "node_name":"10.31.41.59:8883_solr",
> >  "base_url":"http://10.31.41.59:8883/solr"}
> >
> > i'm trying to create a new collection through collection API, and
> > obviously, nothing is happening...
> >
> > any suggestion on how to fix this?  drop the queue in zk?
> >
> > how could it have gotten into this state in the first place?
> >
> > thanks,
> > gary
>
>


Re: overseer queue clogged

2013-03-15 Thread Gary Yngve
Also, looking at overseer_elect, everything looks fine.  node is valid and
live.


On Fri, Mar 15, 2013 at 4:47 PM, Gary Yngve  wrote:

> Sorry, should have specified.  4.1
>
>
>
>
> On Fri, Mar 15, 2013 at 4:33 PM, Mark Miller wrote:
>
>> What Solr version? 4.0, 4.1 4.2?
>>
>> - Mark
>>
>> On Mar 15, 2013, at 7:19 PM, Gary Yngve  wrote:
>>
>> > my solr cloud has been running fine for weeks, but about a week ago, it
>> > stopped dequeueing from the overseer queue, and now there are thousands
>> of
>> > tasks on the queue, most of which look like
>> >
>> > {
>> >  "operation":"state",
>> >  "numShards":null,
>> >  "shard":"shard3",
>> >  "roles":null,
>> >  "state":"recovering",
>> >  "core":"production_things_shard3_2",
>> >  "collection":"production_things",
>> >  "node_name":"10.31.41.59:8883_solr",
>> >  "base_url":"http://10.31.41.59:8883/solr"}
>> >
>> > i'm trying to create a new collection through collection API, and
>> > obviously, nothing is happening...
>> >
>> > any suggestion on how to fix this?  drop the queue in zk?
>> >
>> > how could it have gotten into this state in the first place?
>> >
>> > thanks,
>> > gary
>>
>>
>


Re: overseer queue clogged

2013-03-15 Thread Gary Yngve
I restarted the overseer node and another took over; the queues are empty now.

the server with core production_things_shard1_2
is having these errors:

shard update error RetryNode:
http://10.104.59.189:8883/solr/production_things_shard11_replica1/:org.apache.solr.client.solrj.SolrServerException:
Server refused connection at:
http://10.104.59.189:8883/solr/production_things_shard11_replica1

  for shard11!!!

I also got some strange errors on the restarted node.  Makes me wonder if
there is a string-matching bug for shard1 vs shard11?

SEVERE: :org.apache.solr.common.SolrException: Error getting leader from zk
  at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:771)
  at org.apache.solr.cloud.ZkController.register(ZkController.java:683)
  at org.apache.solr.cloud.ZkController.register(ZkController.java:634)
  at org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:890)
  at org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:874)
  at org.apache.solr.core.CoreContainer.register(CoreContainer.java:823)
  at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:633)
  at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
  at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.solr.common.SolrException: There is conflicting
information about the leader
of shard: shard1 our state says: http://10.104.59.189:8883/solr/collection1/
but zookeeper says: http://10.217.55.151:8883/solr/collection1/
  at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:756)

INFO: Releasing directory:
/vol/ubuntu/talemetry_match_solr/solr_server/solr/production_things_shard11_replica1/data/index
Mar 15, 2013 5:52:34 PM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Error opening new searcher
  at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1423)
  at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1535)

SEVERE: org.apache.solr.common.SolrException: I was asked to wait on state
recovering for 10.76.31.67:8883_solr but I still do not see the requested
state. I see state: active live:true
  at org.apache.solr.handler.admin.CoreAdminHandler.handleWaitForStateAction(CoreAdminHandler.java:948)




On Fri, Mar 15, 2013 at 5:05 PM, Mark Miller  wrote:

> Strange - we hardened that loop in 4.1 - so I'm not sure what happened
> here.
>
> Can you do a stack dump on the overseer and see if you see an Overseer
> thread running perhaps? Or just post the results?
>
> To recover, you should be able to just restart the Overseer node and have
> someone else take over - they should pick up processing the queue.
>
> Any logs you might be able to share could be useful too.
>
> - Mark
>
> On Mar 15, 2013, at 7:51 PM, Gary Yngve  wrote:
>
> > Also, looking at overseer_elect, everything looks fine.  node is valid
> and
> > live.
> >
> >
> > On Fri, Mar 15, 2013 at 4:47 PM, Gary Yngve 
> wrote:
> >
> >> Sorry, should have specified.  4.1
> >>
> >>
> >>
> >>
> >> On Fri, Mar 15, 2013 at 4:33 PM, Mark Miller  >wrote:
> >>
> >>> What Solr version? 4.0, 4.1 4.2?
> >>>
> >>> - Mark
> >>>
> >>> On Mar 15, 2013, at 7:19 PM, Gary Yngve  wrote:
> >>>
> >>>> my solr cloud has been running fine for weeks, but about a week ago,
> it
> >>>> stopped dequeueing from the overseer queue, and now there are
> thousands
> >>> of
> >>>> tasks on the queue, most which look like
> >>>>
> >>>> {
> >>>> "operation":"state",
> >>>> "numShards":null,
> >>>> "shard":"shard3",
> >>>> "roles":null,
> >>>> "state":"recovering",
> >>>> "core":"production_things_shard3_2",
> >>>> "collection":"production_things",
> >>>> "node_name":"10.31.41.59:8883_solr",
> >>>> "base_url":"http://10.31.41.59:8883/solr"}
> >>>>
> >>>> i'm trying to create a new collection through collection API, and
> >>>> obviously, nothing is happening...
> >>>>
> >>>> any suggestion on how to fix this?  drop the queue in zk?
> >>>>
> >>>> how could did it have gotten in this state in the first place?
> >>>>
> >>>> thanks,
> >>>> gary
> >>>
> >>>
> >>
>
>


Re: overseer queue clogged

2013-03-15 Thread Gary Yngve
It doesn't appear to be a shard1-vs-shard11 issue... 60% of my followers
are now red in the Solr Cloud graph... trying to figure out what that
means...


On Fri, Mar 15, 2013 at 6:48 PM, Gary Yngve  wrote:

> I restarted the overseer node and another took over, queues are empty now.
>
> the server with core production_things_shard1_2
> is having these errors:
>
> shard update error RetryNode:
> http://10.104.59.189:8883/solr/production_things_shard11_replica1/:org.apache.solr.client.solrj.SolrServerException:
> Server refused connection at:
> http://10.104.59.189:8883/solr/production_things_shard11_replica1
>
>   for shard11!!!
>
> I also got some strange errors on the restarted node.  Makes me wonder if
> there is a string-matching bug for shard1 vs shard11?
>
> SEVERE: :org.apache.solr.common.SolrException: Error getting leader from zk
>   at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:771)
>   at org.apache.solr.cloud.ZkController.register(ZkController.java:683)
>   at org.apache.solr.cloud.ZkController.register(ZkController.java:634)
>   at
> org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:890)
>   at
> org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:874)
>   at org.apache.solr.core.CoreContainer.register(CoreContainer.java:823)
>   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:633)
>   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>   at
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>   at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:722)
> Caused by: org.apache.solr.common.SolrException: There is conflicting
> information about the leader
> of shard: shard1 our state says:
> http://10.104.59.189:8883/solr/collection1/ but zookeeper says:http
> ://10.217.55.151:8883/solr/collection1/
>   at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:756)
>
> INFO: Releasing
> directory:/vol/ubuntu/talemetry_match_solr/solr_server/solr/production_things_shar
> d11_replica1/data/index
> Mar 15, 2013 5:52:34 PM org.apache.solr.common.SolrException log
> SEVERE: org.apache.solr.common.SolrException: Error opening new searcher
>   at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1423)
>   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1535)
>
> SEVERE: org.apache.solr.common.SolrException: I was asked to wait on state
> recovering for 10.76.31.
> 67:8883_solr but I still do not see the requested state. I see state:
> active live:true
>   at
> org.apache.solr.handler.admin.CoreAdminHandler.handleWaitForStateAction(CoreAdminHandler
> .java:948)
>
>
>
>
> On Fri, Mar 15, 2013 at 5:05 PM, Mark Miller wrote:
>
>> Strange - we hardened that loop in 4.1 - so I'm not sure what happened
>> here.
>>
>> Can you do a stack dump on the overseer and see if you see an Overseer
>> thread running perhaps? Or just post the results?
>>
>> To recover, you should be able to just restart the Overseer node and have
>> someone else take over - they should pick up processing the queue.
>>
>> Any logs you might be able to share could be useful too.
>>
>> - Mark
>>
>> On Mar 15, 2013, at 7:51 PM, Gary Yngve  wrote:
>>
>> > Also, looking at overseer_elect, everything looks fine.  node is valid
>> and
>> > live.
>> >
>> >
>> > On Fri, Mar 15, 2013 at 4:47 PM, Gary Yngve 
>> wrote:
>> >
>> >> Sorry, should have specified.  4.1
>> >>
>> >>
>> >>
>> >>
>> >> On Fri, Mar 15, 2013 at 4:33 PM, Mark Miller > >wrote:
>> >>
>> >>> What Solr version? 4.0, 4.1 4.2?
>> >>>
>> >>> - Mark
>> >>>
>> >>> On Mar 15, 2013, at 7:19 PM, Gary Yngve  wrote:
>> >>>
>> >>>> my solr cloud has been running fine for weeks, but about a week ago,
>> it
>> >>>> stopped dequeueing from the overseer queue, and now there are
>> thousands
>> >>> of
>> >>>> tasks on the queue, most which look like
>> >>>>
>> >>>> {
>> >>>> "operation":"state",
>> >>>> "numShards":null,
>> >>>> "shard":"shard3",
>> >>>> "roles":null,
>> >>>> "state":"recovering",
>> >>>> "core":"production_things_shard3_2",
>> >>>> "collection":"production_things",
>> >>>> "node_name":"10.31.41.59:8883_solr",
>> >>>> "base_url":"http://10.31.41.59:8883/solr"}
>> >>>>
>> >>>> i'm trying to create a new collection through collection API, and
>> >>>> obviously, nothing is happening...
>> >>>>
>> >>>> any suggestion on how to fix this?  drop the queue in zk?
>> >>>>
>> >>>> how could did it have gotten in this state in the first place?
>> >>>>
>> >>>> thanks,
>> >>>> gary
>> >>>
>> >>>
>> >>
>>
>>
>


Re: overseer queue clogged

2013-03-15 Thread Gary Yngve
I think those followers are red from trying to forward requests to the
overseer while it was being restarted.  I guess I'll see if they become
green over time, or I can restart them one at a time...


On Fri, Mar 15, 2013 at 6:53 PM, Gary Yngve  wrote:

> it doesn't appear to be a shard1 vs shard11 issue... 60% of my followers
> are red now in the solr cloud graph.. trying to figure out what that
> means...
>
>
> On Fri, Mar 15, 2013 at 6:48 PM, Gary Yngve  wrote:
>
>> I restarted the overseer node and another took over, queues are empty now.
>>
>> the server with core production_things_shard1_2
>> is having these errors:
>>
>> shard update error RetryNode:
>> http://10.104.59.189:8883/solr/production_things_shard11_replica1/:org.apache.solr.client.solrj.SolrServerException:
>> Server refused connection at:
>> http://10.104.59.189:8883/solr/production_things_shard11_replica1
>>
>>   for shard11!!!
>>
>> I also got some strange errors on the restarted node.  Makes me wonder if
>> there is a string-matching bug for shard1 vs shard11?
>>
>> SEVERE: :org.apache.solr.common.SolrException: Error getting leader from
>> zk
>>   at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:771)
>>   at org.apache.solr.cloud.ZkController.register(ZkController.java:683)
>>   at org.apache.solr.cloud.ZkController.register(ZkController.java:634)
>>   at
>> org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:890)
>>   at
>> org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:874)
>>   at org.apache.solr.core.CoreContainer.register(CoreContainer.java:823)
>>   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:633)
>>   at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at
>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>   at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>   at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>   at
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>>   at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>>   at java.lang.Thread.run(Thread.java:722)
>> Caused by: org.apache.solr.common.SolrException: There is conflicting
>> information about the leader
>> of shard: shard1 our state says:
>> http://10.104.59.189:8883/solr/collection1/ but zookeeper says:http
>> ://10.217.55.151:8883/solr/collection1/
>>   at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:756)
>>
>> INFO: Releasing
>> directory:/vol/ubuntu/talemetry_match_solr/solr_server/solr/production_things_shar
>> d11_replica1/data/index
>> Mar 15, 2013 5:52:34 PM org.apache.solr.common.SolrException log
>> SEVERE: org.apache.solr.common.SolrException: Error opening new searcher
>>   at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1423)
>>   at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1535)
>>
>> SEVERE: org.apache.solr.common.SolrException: I was asked to wait on
>> state recovering for 10.76.31.
>> 67:8883_solr but I still do not see the requested state. I see state:
>> active live:true
>>   at
>> org.apache.solr.handler.admin.CoreAdminHandler.handleWaitForStateAction(CoreAdminHandler
>> .java:948)
>>
>>
>>
>>
>> On Fri, Mar 15, 2013 at 5:05 PM, Mark Miller wrote:
>>
>>> Strange - we hardened that loop in 4.1 - so I'm not sure what happened
>>> here.
>>>
>>> Can you do a stack dump on the overseer and see if you see an Overseer
>>> thread running perhaps? Or just post the results?
>>>
>>> To recover, you should be able to just restart the Overseer node and
>>> have someone else take over - they should pick up processing the queue.
>>>
>>> Any logs you might be able to share could be useful too.
>>>
>>> - Mark
>>>
>>> On Mar 15, 2013, at 7:51 PM, Gary Yngve  wrote:
>>>
>>> > Also, looking at overseer_elect, everything looks fine.  node is valid
>>> and
>>> > live.
>>> >
>>> >
>>> > On Fri, Mar 15, 2013 at 4:47 PM, Gary Yngve 
>>> wrote:
>>> >
>>> >> Sorry, should have specified.  4.1
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>

Re: overseer queue clogged

2013-03-15 Thread Gary Yngve
I will upgrade to 4.2 this weekend and see what happens.  We are on EC2 and
have had a few issues with hostnames with both zk and Solr (though in this
case I haven't rebooted any instances either).

It's a relative pain to do the upgrade because we have a query/scorer fork
of Lucene along with supplemental jars, and zk cannot distribute binary
jars via the config.
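
(The text config at least is easy to push up with the zkcli.sh that ships
under example/cloud-scripts -- roughly like this, with illustrative paths --
it's only the jars that have to be copied to every node by hand:)

    cloud-scripts/zkcli.sh -zkhost zk1:2181,zk2:2181,zk3:2181/solr_prod \
      -cmd upconfig -confdir /path/to/our/conf -confname production_things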

We are also multi-collection per zk... I wish Solr didn't require a core to
be defined up front for the core admin; I would love to have an instance
with no cores and then just create the cores I need.

-g



On Fri, Mar 15, 2013 at 7:14 PM, Mark Miller  wrote:

>
> On Mar 15, 2013, at 10:04 PM, Gary Yngve  wrote:
>
> > i think those followers are red from trying to forward requests to the
> > overseer while it was being restarted.  i guess i'll see if they become
> > green over time.  or i guess i can restart them one at a time..
>
> Restarting the cluster clear things up. It shouldn't take too long for
> those nodes to recover though - they should have been up to date before.
> The couple exceptions you posted def indicate something is out of whack.
> It's something I'd like to get to the bottom of.
>
> - Mark
>
> >
> >
> > On Fri, Mar 15, 2013 at 6:53 PM, Gary Yngve 
> wrote:
> >
> >> it doesn't appear to be a shard1 vs shard11 issue... 60% of my followers
> >> are red now in the solr cloud graph.. trying to figure out what that
> >> means...
> >>
> >>
> >> On Fri, Mar 15, 2013 at 6:48 PM, Gary Yngve 
> wrote:
> >>
> >>> I restarted the overseer node and another took over, queues are empty
> now.
> >>>
> >>> the server with core production_things_shard1_2
> >>> is having these errors:
> >>>
> >>> shard update error RetryNode:
> >>>
> http://10.104.59.189:8883/solr/production_things_shard11_replica1/:org.apache.solr.client.solrj.SolrServerException
> :
> >>> Server refused connection at:
> >>> http://10.104.59.189:8883/solr/production_things_shard11_replica1
> >>>
> >>>  for shard11!!!
> >>>
> >>> I also got some strange errors on the restarted node.  Makes me wonder
> if
> >>> there is a string-matching bug for shard1 vs shard11?
> >>>
> >>> SEVERE: :org.apache.solr.common.SolrException: Error getting leader
> from
> >>> zk
> >>>  at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:771)
> >>>  at org.apache.solr.cloud.ZkController.register(ZkController.java:683)
> >>>  at org.apache.solr.cloud.ZkController.register(ZkController.java:634)
> >>>  at
> >>> org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:890)
> >>>  at
> >>> org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:874)
> >>>  at org.apache.solr.core.CoreContainer.register(CoreContainer.java:823)
> >>>  at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:633)
> >>>  at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:624)
> >>>  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> >>>  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> >>>  at
> >>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
> >>>  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
> >>>  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
> >>>  at
> >>>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >>>  at
> >>>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >>>  at java.lang.Thread.run(Thread.java:722)
> >>> Caused by: org.apache.solr.common.SolrException: There is conflicting
> >>> information about the leader
> >>> of shard: shard1 our state says:
> >>> http://10.104.59.189:8883/solr/collection1/ but zookeeper says:http
> >>> ://10.217.55.151:8883/solr/collection1/
> >>>  at org.apache.solr.cloud.ZkController.getLeader(ZkController.java:756)
> >>>
> >>> INFO: Releasing
> >>>
> directory:/vol/ubuntu/talemetry_match_solr/solr_server/solr/production_things_shar
> >>> d11_replica1/data/index
> >>> Mar 15, 2013 5:52:34 PM org.apache.solr.common.SolrException log
> >>> SEVERE: org.apache.solr.common.SolrException: Error opening new
> searcher
> >>>  at org.apache.solr.core.SolrCore.o

Re: overseer queue clogged

2013-03-16 Thread Gary Yngve
Cool, I'll need to try this.  I could have sworn that it didn't work that
way in 4.0, but maybe my test was bunk.

-g


On Fri, Mar 15, 2013 at 9:41 PM, Mark Miller  wrote:
>
> You can do this - just modify your starting Solr example to have no cores
> in solr.xml. You won't be able to make use of the admin UI until you create
> at least one core, but the core and collection apis will both work fine.
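
For reference, a no-cores solr.xml along the lines Mark describes would look
roughly like this (attribute values follow the stock example, so treat it as
a sketch):

    <?xml version="1.0" encoding="UTF-8" ?>
    <solr persistent="true">
      <cores adminPath="/admin/cores" host="${host:}" hostPort="${jetty.port:8983}">
        <!-- no <core/> entries; cores get created later via the core/collection APIs -->
      </cores>
    </solr>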


example schema in branch_3x returns SEVERE errors

2010-11-27 Thread Gary Yngve
logs> grep SEVERE solr.err.log
SEVERE: org.apache.solr.common.SolrException: Error loading class
'solr.KeywordMarkerFilterFactory'
SEVERE: org.apache.solr.common.SolrException: Error loading class
'solr.KeywordMarkerFilterFactory'
SEVERE: org.apache.solr.common.SolrException: Error loading class
'solr.KeywordMarkerFilterFactory'
SEVERE: org.apache.solr.common.SolrException: Error loading class
'solr.EnglishMinimalStemFilterFactory'
SEVERE: org.apache.solr.common.SolrException: Error loading class
'solr.PointType'
SEVERE: org.apache.solr.common.SolrException: Error loading class
'solr.LatLonType'
SEVERE: org.apache.solr.common.SolrException: Error loading class
'solr.GeoHashField'
SEVERE: java.lang.RuntimeException: schema fieldtype
text(org.apache.solr.schema.TextField) invalid
arguments:{autoGeneratePhraseQueries=true}
SEVERE: org.apache.solr.common.SolrException: Unknown fieldtype 'location'
specified on field store

It looks like it's loading the correct files...

2010-11-27 13:01:28.005:INFO::Logging to STDERR via org.mortbay.log.StdErrLog
2010-11-27 13:01:28.137:INFO::jetty-6.1.22
2010-11-27 13:01:28.204:INFO::Extract
file:/Users/gyngve/git/gems/solr_control/solr_server/webapps/apache-solr-3.1-SNAPSHOT.war
to
/Users/gyngve/git/gems/solr_control/solr_server/work/Jetty_0_0_0_0_8983_apache.solr.3.1.SNAPSHOT.war__apache.solr.3.1.SNAPSHOT__4jaonl/webapp

And on inspecting the war and the solr-core jar inside it, I can see the
supposedly missing classes, so I am pretty confused.
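
For reference, the inspection was roughly this (the jar name inside
WEB-INF/lib is from memory, so adjust as needed):

    # unpack the war and check that the supposedly missing class is in solr-core
    unzip -o apache-solr-3.1-SNAPSHOT.war -d /tmp/solrwar
    unzip -l /tmp/solrwar/WEB-INF/lib/apache-solr-core-3.1-SNAPSHOT.jar | grep KeywordMarkerFilterFactory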

Has anyone else seen this before or have an idea on how to surmount it?

I'm not quite ready to file a Jira issue on it yet, as I'm hoping it's user
error.

Thanks,
Gary


Re: example schema in branch_3x returns SEVERE errors

2010-11-27 Thread Gary Yngve
Sorry, false alarm.  I had a bad merge, with a stray library linking to an
older version of another library.  It works now.

-Gary


On Sat, Nov 27, 2010 at 4:17 PM, Gary Yngve  wrote:

> logs> grep SEVERE solr.err.log
> SEVERE: org.apache.solr.common.SolrException: Error loading class
> 'solr.KeywordMarkerFilterFactory'
> SEVERE: org.apache.solr.common.SolrException: Error loading class
> 'solr.KeywordMarkerFilterFactory'
> SEVERE: org.apache.solr.common.SolrException: Error loading class
> 'solr.KeywordMarkerFilterFactory'
> SEVERE: org.apache.solr.common.SolrException: Error loading class
> 'solr.EnglishMinimalStemFilterFactory'
> SEVERE: org.apache.solr.common.SolrException: Error loading class
> 'solr.PointType'
> SEVERE: org.apache.solr.common.SolrException: Error loading class
> 'solr.LatLonType'
> SEVERE: org.apache.solr.common.SolrException: Error loading class
> 'solr.GeoHashField'
> SEVERE: java.lang.RuntimeException: schema fieldtype
> text(org.apache.solr.schema.TextField) invalid
> arguments:{autoGeneratePhraseQueries=true}
> SEVERE: org.apache.solr.common.SolrException: Unknown fieldtype 'location'
> specified on field store
>
> It looks like it's loading the correct files...
>
> 010-11-27 13:01:28.005:INFO::Logging to STDERR via
> org.mortbay.log.StdErrLog
> 2010-11-27 13:01:28.137:INFO::jetty-6.1.22
> 2010-11-27 13:01:28.204:INFO::Extract
> file:/Users/gyngve/git/gems/solr_control/solr_server/webapps/apache-solr-3.1-SNAPSHOT.war
> to
> /Users/gyngve/git/gems/solr_control/solr_server/work/Jetty_0_0_0_0_8983_apache.solr.3.1.SNAPSHOT.war__apache.solr.3.1.SNAPSHOT__4jaonl/webapp
>
> And on inspection on the war and the solr-core jar inside, I can see the
> missing classes, so I am pretty confused.
>
> Has anyone else seen this before or have an idea on how to surmount it?
>
> I'm not quite ready to file a Jira issue on it yet, as I'm hoping it's user
> error.
>
> Thanks,
> Gary
>


Seattle Solr/Lucene User Group?

2011-04-13 Thread Gary Yngve
Hi all,

Does anyone know if there is a Solr/Lucene user group /
birds-of-a-feather that meets in Seattle?

If not, I'd like to start one up.  I'd love to learn and share tricks
pertaining to NRT, performance, distributed Solr, etc.

Also, I am planning on attending the Lucene Revolution!

Let's connect!

-Gary

http://www.linkedin.com/in/garyyngve