Re: SolrJ dependencies
Done, see: https://issues.apache.org/jira/browse/SOLR-3541

On 12-6-2012 18:39, Sami Siren wrote:
> On Tue, Jun 12, 2012 at 4:22 PM, Thijs wrote:
>> Hi
>>
>> I just checked out and built solr & lucene from branches/lucene_4x. I wanted
>> to upgrade my custom client to this new version (using solrj). So I copied
>> lucene/solr/dist/apache-solr-solrj-4.0-SNAPSHOT.jar &
>> lucene/solr/dist/apache-solr-core-4.0-SNAPSHOT.jar to my project, and I
>> updated the other libs from the libs in /solr/dist/solrj-lib.
>>
>> However, when I wanted to run my client I got exceptions indicating that I
>> was missing the HTTPClient jars (httpclient, httpcore, httpmime). Shouldn't
>> those go into lucene/solr/dist/solrj-lib as well?
>
> Yes they should.
>
>> Do I need to create a ticket for this?
>
> Please do so.
>
> --
>  Sami Siren
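A minimal SolrJ smoke test makes the missing jars obvious - a sketch assuming the 4.x SolrJ API and a local server (HttpSolrServer is backed by Apache HttpComponents, which is why httpclient, httpcore, and httpmime must be on the classpath at runtime):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class SolrJSmokeTest {
        public static void main(String[] args) throws Exception {
            // fails with NoClassDefFoundError for HttpClient classes if the
            // httpclient/httpcore/httpmime jars are missing
            SolrServer server = new HttpSolrServer("http://localhost:8983/solr");
            QueryResponse rsp = server.query(new SolrQuery("*:*"));
            System.out.println("numFound: " + rsp.getResults().getNumFound());
        }
    }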
Re: Solr PHP highload search
How much memory are you giving the JVM? Have you put a performance monitor on the running process to see what resources have been exhausted (i.e. are you I/O bound? CPU bound?)

Best
Erick

On Tue, Jun 12, 2012 at 3:40 AM, Alexandr Bocharov wrote:
> Hi, all.
>
> I need advice on configuring Solr search for use in high-load production.
>
> I've written a user search engine (a PHP class) that uses over 70 parameters
> for searching users.
> The user database is over 30 million records.
> Total index size is 6.4G when I use 1 node and 3.2G when I use 2 nodes.
> The previous search engine could handle 700,000 queries per day for searching
> users - that is ~8 queries/sec (4 MySQL servers with manual sharding via
> Gearman).
>
> An example query:
>
> [responseHeader] => SolrObject Object
>     (
>         [status] => 0
>         [QTime] => 517
>         [params] => SolrObject Object
>             (
>                 [bq] => Array
>                     (
>                         [0] => bool_field1:1^30
>                         [1] => str_field1:str_value1^15
>                         [2] => tint_field1:tint_field1^5
>                         [3] => bool_field2:1^6
>                         [4] => date_field1:[NOW-14DAYS TO NOW]^20
>                         [5] => date_field2:[NOW-14DAYS TO NOW]^5
>                     )
>
>                 [indent] => on
>                 [start] => 0
>                 [q.alt] => *:*
>                 [wt] => xml
>                 [fq] => Array
>                     (
>                         [0] => tint_field2:[tint_value2 TO tint_value22]
>                         [1] => str_field1:str_value1
>                         [2] => str_field2:str_value2
>                         [3] => tint_field3:(tint_value3 OR tint_value32 OR tint_value33 OR tint_value34 OR tint_value5)
>                         [4] => tint_field4:tint_value4
>                         [5] => -bool_field1:[* TO *]
>                     )
>
>                 [version] => 2.2
>                 [defType] => dismax
>                 [rows] => 10
>             )
>     )
>
> I tested my PHP search API and found that concurrent random queries - for
> example, 10 queries at one time - increase QTime from an average of 500 ms
> to 3000 ms on 2 nodes.
>
> 1. How can I tweak my queries, parameters, or Solr's config to decrease
> QTime?
> 2. What if I put my index data in an emulated RAM directory - can that
> greatly increase performance?
> 3. Sorting by boost queries has a great influence on QTime; how can I
> optimize boost queries?
> 4. If I split my 2 nodes on 2 machines into 6 nodes on 2 machines (3 nodes
> per machine), will it increase performance?
> 5. What is a "multi-core query", how can I configure it, and will it
> increase performance?
>
> Thank you!
Re: Solr PHP highload search
Thank you for the help :)

I'm giving the JVM 2048M for each node.
CPU load jumps between 70% and 90%.
Memory usage increases to the max during testing (probably the cache is filling).
I/O I didn't monitor.

I'd still like to see answers to my other questions.

2012/6/13 Erick Erickson
> How much memory are you giving the JVM? Have you put a performance
> monitor on the running process to see what resources have been
> exhausted (i.e. are you I/O bound? CPU bound?)
>
> Best
> Erick
>
> [...]
Re: Exception when optimizing index
On Thu, Jun 7, 2012 at 5:50 AM, Rok Rejc wrote:
> - java.runtime.name: OpenJDK Runtime Environment
> - java.runtime.version: 1.6.0_22-b22
...
>
> As far as I see from the JIRA issue, I have the patch attached (as mentioned,
> I have a trunk version from May 12). Any ideas?

it's not guaranteed that the patch will work around all hotspot bugs related to http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5091921

Since you can reproduce, is it possible for you to re-test the scenario with a newer JVM (e.g. 1.7.0_04), just to rule that out?

--
lucidimagination.com
Re: Solr PHP highload search
Consider just looking at it with jconsole (it ships with your Java release) to get a sense of the memory usage/collection. How much physical memory do you have overall? Because this is not what I'd expect. Your CPU load is actually reasonably high, so it doesn't look like you're swapping.

By and large, trying to use RAMDirectories isn't a good solution; between the OS and Solr, the necessary parts of your index are read into memory and used from there.

Best
Erick

On Wed, Jun 13, 2012 at 7:13 AM, Alexandr Bocharov wrote:
> Thank you for the help :)
>
> I'm giving the JVM 2048M for each node.
> CPU load jumps between 70% and 90%.
> Memory usage increases to the max during testing (probably the cache is filling).
> I/O I didn't monitor.
>
> I'd still like to see answers to my other questions.
>
> [...]
Re: Sharding in SolrCloud
Mark Miller schrieb am 12.06.2012 19:19:01:
> On Jun 12, 2012, at 3:39 AM, lenz...@gfi.ihk.de wrote:
> > Hello,
> >
> > we tested SolrCloud in a setup with one collection, two shards and one
> > replica per shard, and it works quite fine with some example data.
> > Now we plan to set up our own collection and determine how many shards
> > we should divide it into.
> > We can estimate quite exactly the size of the collection, but we don't
> > know what the best approach for sharding is,
> > even if we know the size and the amount of queries and updates.
> > Is there any documentation or a kind of design guideline for sharding a
> > collection in SolrCloud?
> >
> > Thanks & regards,
> > Norman Lenzner
>
> It's hard to tell - I think you want to start with an idea of how
> many docs you can fit on a single node. This can vary wildly
> depending on many factors. Generally you have to do some testing
> with your particular config and data. You can search the mailing
> lists and perhaps dig up a little info, but there is really no
> replacement for running some tests with real data.
>
> Then you have to plan in your growth rate - resharding is naturally
> a relatively expensive operation. Once you have an idea of how many
> docs per machine you think seems comfortable, figure out how many
> machines you need given your estimated doc growth rate and perhaps
> some padding. You might not get it right, but if you expect the
> possibility of a lot of growth, erring on the more-shards side is
> obviously better.
>
> - Mark Miller
> lucidimagination.com

Hello and thanks for your reply,

We will run some tests to determine the size of our collection, but I think there
won't be the need for a second shard at all. The problem is not the size or the
growth of the docs, but there will be a quite high update frequency. So, if we
have many bulk updates, is it reasonable to distribute the update load over
multiple shards?

Thanks & regards,
Norman Lenzner
Re: Different sort for each facet
Hmm, it seems that if I leave off the initial "facet.sort=index" then it will sort each by index by default, and I can use the "f.people.facet.sort=count" as expected. I thought I tried that yesterday, but I suppose it slipped my mind in my sleep-deprived state. Thanks Jack! -- Chris On Tue, Jun 12, 2012 at 10:58 PM, Jack Krupansky wrote: > f.people.facet.sort=count should work. > > Make sure you don't have a conflicting setting for that same field and > attribute. > > Does the "people" facet sort by count correctly with f.sort=index? > > What are the attributes and field type for the "people" field? > > -- Jack Krupansky > > -Original Message- From: Christopher Gross > Sent: Tuesday, June 12, 2012 11:05 AM > To: solr-user > Subject: Different sort for each facet > > > In Solr 3.4, is there a way I can sort two facets differently in the same > query? > > If I have: > > http://mysolrsrvr/solr/select?q=*:*&facet=true&facet.field=people&facet.field=category > > is there a way that I can sort people by the count and category by the > name all in one query? Or do I need to do that in separate queries? > I tried using "f.people.facet.sort=count" while also having > "facet.sort=index" but both came back in alphabetical order. > > Doing more queries is OK, I'm just trying to avoid having to do too many. > > -- Chris
LockObtainFailedException after trying to create cores on second SolrCloud instance
Hi,

I am struggling with creating multiple collections on a 4-instance SolrCloud setup:

I have 4 virtual OpenVZ instances, where I have installed SolrCloud on each, and on one a standalone Zookeeper is also running.

Loading the Solr configuration into ZK works fine.

Then I start up the 4 instances and everything is also running smoothly.

After that I am adding one core with the name e.g. '123'.

This core is correctly visible on the instance I have used for creating it.

It maps like:

'123' > shard1 -> virtual-instance-1

After that I am creating a core with the same name '123' on the second instance, and it creates it, but an exception is thrown after a while and the cluster state of the newly created core goes to 'recovering':

"123":{"shard1":{
    "virtual-instance-1:8983_solr_123":{
      "shard":"shard1",
      "roles":null,
      "leader":"true",
      "state":"active",
      "core":"123",
      "collection":"123",
      "node_name":"virtual-instance-1:8983_solr",
      "base_url":"http://virtual-instance-1:8983/solr"},
    "virtual-instance-2:8983_solr_123":{
      "shard":"shard1",
      "roles":null,
      "state":"recovering",
      "core":"123",
      "collection":"123",
      "node_name":"virtual-instance-2:8983_solr",
      "base_url":"http://virtual-instance-2:8983/solr"}}},

The exception thrown is on the first virtual instance:

Jun 13, 2012 2:18:40 PM org.apache.solr.common.SolrException log
SEVERE: null:org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock@/home/myuser/data/index/write.lock
    at org.apache.lucene.store.Lock.obtain(Lock.java:84)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:607)
    at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:58)
    at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:112)
    at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:52)
    at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:364)
    at org.apache.solr.update.processor.RunUpdateProcessor.processCommit(RunUpdateProcessorFactory.java:82)
    at org.apache.solr.update.processor.UpdateRequestProcessor.processCommit(UpdateRequestProcessor.java:64)
    at org.apache.solr.update.processor.DistributedUpdateProcessor.processCommit(DistributedUpdateProcessor.java:919)
    at org.apache.solr.update.processor.LogUpdateProcessor.processCommit(LogUpdateProcessorFactory.java:154)
    at org.apache.solr.handler.RequestHandlerUtils.handleCommit(RequestHandlerUtils.java:69)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1566)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:442)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:263)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1337)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:484)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:524)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:233)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1065)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:413)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:192)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:999)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:111)
    at org.eclipse.jetty.server.Server.handle(Server.java:351)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:454)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:47)
    at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:900)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:954)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:857)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:66)
    at org.eclipse.jetty.server.bio.SocketC
Re: LockObtainFailedException after trying to create cores on second SolrCloud instance
BTW: I am running the Solr instances using -Xms512M -Xmx1024M,

so not so little memory.

Daniel

On Wed, Jun 13, 2012 at 4:28 PM, Daniel Brügge <daniel.brue...@googlemail.com> wrote:
> Hi,
>
> I am struggling with creating multiple collections on a 4-instance
> SolrCloud setup:
>
> I have 4 virtual OpenVZ instances, where I have installed SolrCloud on
> each, and on one a standalone Zookeeper is also running.
>
> [...]
Re: Different sort for each facet
I'm glad that you have something working, but you shouldn't have to remove that facet.sort=index. I tried the following and it works with the Solr 3.6 example after I indexed with exampledocs/books.json: http://localhost:8983/solr/select/?q=*:*&facet=true&facet.field=name&facet.field=genre_s&facet.sort=index&f.name.facet.sort=count I see the name field sorted by count and the genre_s field sorted by lexical order (note: "IT" comes before "fantasy" because upper case comes before lower case - it would be nice to have a case-neutral sort.) Could you try it, just to see if maybe we are not communicating about what exactly is not working for you? What release of Solr are you using? I am not aware of any fixes/changes that would make this behave differently as of 3.6. BTW, the default sort is "index" IFF facet.limit <= 0. The default for facet.limit is 100, so sort should default to "count". I presume you have facet.limit set to -1 or 0. You might also check to see what facet parameters might be set in your request handler as opposed to on the actual query request. -- Jack Krupansky -Original Message- From: Christopher Gross Sent: Wednesday, June 13, 2012 9:19 AM To: solr-user@lucene.apache.org Subject: Re: Different sort for each facet Hmm, it seems that if I leave off the initial "facet.sort=index" then it will sort each by index by default, and I can use the "f.people.facet.sort=count" as expected. I thought I tried that yesterday, but I suppose it slipped my mind in my sleep-deprived state. Thanks Jack! -- Chris On Tue, Jun 12, 2012 at 10:58 PM, Jack Krupansky wrote: f.people.facet.sort=count should work. Make sure you don't have a conflicting setting for that same field and attribute. Does the "people" facet sort by count correctly with f.sort=index? What are the attributes and field type for the "people" field? -- Jack Krupansky -Original Message- From: Christopher Gross Sent: Tuesday, June 12, 2012 11:05 AM To: solr-user Subject: Different sort for each facet In Solr 3.4, is there a way I can sort two facets differently in the same query? If I have: http://mysolrsrvr/solr/select?q=*:*&facet=true&facet.field=people&facet.field=category is there a way that I can sort people by the count and category by the name all in one query? Or do I need to do that in separate queries? I tried using "f.people.facet.sort=count" while also having "facet.sort=index" but both came back in alphabetical order. Doing more queries is OK, I'm just trying to avoid having to do too many. -- Chris
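The same per-field override in SolrJ, for completeness (a sketch against the 3.x SolrJ API; host and field names follow the thread):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class FacetSortExample {
        public static void main(String[] args) throws Exception {
            SolrQuery query = new SolrQuery("*:*");
            query.setFacet(true);
            query.addFacetField("people", "category"); // facet on both fields
            query.set("facet.sort", "index");          // global default: lexical order
            query.set("f.people.facet.sort", "count"); // override for "people" only
            QueryResponse rsp =
                new CommonsHttpSolrServer("http://mysolrsrvr/solr").query(query);
            System.out.println(rsp.getFacetFields());
        }
    }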
Re: [DIH] Multiple repeat XPath stmts
TNX. A lifesaver...
Re: Getting maximum / minimum field value - slow query
What is more, I tried to get the maximum value using a stats query. This time the response time was about 30 seconds and the server ate 1.5 GB of memory when calculating the response. But there were no statistics in the response:

  status = 0, QTime = 27578
  params: q = *.*, stats = true, stats.field = Id
  numFound = 0

What's wrong here?
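For comparison, the usual shape of a stats request over this field (host assumed; note that the match-all query is *:* with a colon - the q echoed above is *.*, which is not the match-all query and may explain both the zero hits and the empty stats section):

    http://localhost:8983/solr/select?q=*:*&rows=0&stats=true&stats.field=Id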
Re: LockObtainFailedException after trying to create cores on second SolrCloud instance
That's an interesting data dir location: NativeFSLock@/home/myuser/data/index/write.lock

Where are the other data dirs located? Are you sharing one drive or something? It looks like something already has a writer lock - are you sure another Solr instance is not running somehow?

On Wed, Jun 13, 2012 at 11:11 AM, Daniel Brügge <daniel.brue...@googlemail.com> wrote:
> BTW: I am running the Solr instances using -Xms512M -Xmx1024M,
>
> so not so little memory.
>
> Daniel
>
> [...]
Re: LockObtainFailedException after trying to create cores on second SolrCloud instance
What command are you using to create the cores? I had this sort of problem, and it was because I'd accidentally created two cores with the same instanceDir within the same Solr process. Make sure you don't have that kind of collision. The easiest way is to specify an explicit instanceDir and dataDir.

Best,
Casey Callendrello

On 6/13/12 7:28 AM, Daniel Brügge wrote:
> Hi,
>
> I am struggling with creating multiple collections on a 4-instance
> SolrCloud setup:
>
> [...]
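A core-creation call with explicit per-core directories, along the lines Casey suggests, might look like this (a sketch - the host, paths, and core name are illustrative):

    http://virtual-instance-2:8983/solr/admin/cores?action=CREATE&name=123&collection=123&instanceDir=/var/solr/cores/123&dataDir=/var/solr/cores/123/data

With a distinct dataDir per core, two cores can never contend for the same write.lock.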
Re: Getting maximum / minimum field value - slow query
Try the query without the sort to get the number of rows, then do a second query using a "start" equal to the number of rows. That should get you the last row/document.

-- Jack Krupansky

-Original Message- From: rafal.gwizd...@gmail.com
Sent: Wednesday, June 13, 2012 3:07 PM
To: solr-user@lucene.apache.org
Subject: Getting maximum / minimum field value - slow query

Hi, I have an index with about 9 million documents. Every document has an
integer 'Id' field (it's not the SOLR document identifier), and I want to get
the maximum value of that field.
Therefore I'm doing a search with the following parameters:
query=*.*, sort=Id desc, rows=1

  status = 0, QTime = 2672
  params: q = *:*, rows = 1, sort = Id desc
  returned doc: CRQIncident#45165891

The problem is that it takes quite a long time to get the response (2-10
seconds). Why is it so slow - isn't it a simple index lookup?

Best regards
RG
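Spelled out, the two-step approach is (a sketch; host and counts are illustrative, and since start is zero-based the last document sits at numFound - 1; note that without a sort, "last" means last in index order, which only equals the highest Id if documents were indexed in Id order):

    1) http://localhost:8983/solr/select?q=*:*&rows=0
       -> read numFound from the response, e.g. 9000000
    2) http://localhost:8983/solr/select?q=*:*&start=8999999&rows=1&fl=Id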
Solr1.4 and threads ....
We've got a tokenizer which is quite explicitly coded on the assumption that it will only be called from one thread at a time. After all, what would it mean for two threads to make interleaved calls to the hasNext() function?

Yet a customer of ours with a gigantic instance of Solr 1.4 reports incidents in which we throw an exception that indicates (we think) that two different threads made interleaved calls.

Does this suggest anything to anyone? Other than that we've misanalyzed the logic in the tokenizer and there's a way to make it burp on one thread?
Re: Sharding in SolrCloud
Hmmm, are you sure SolrCloud fits your needs? You say that you think everything will fit on one shard and are worried about bulk updates. In that case I should think regular Solr master/slave (rather than cloud) might be a better fit. Using Cloud and all that goes with it for a single shard is certainly possible, but I question whether it's your best option here.

Of course, if NRT is a requirement, then SolrCloud is a much better option.

With typical master/slave setups, since your bulk updates are happening on a separate machine, having multiple slaves that poll at a given interval seems like it would work, but you'd have to be able to stand, say, 5-10 minute latency...

Best
Erick

On Wed, Jun 13, 2012 at 7:47 AM, wrote:
> [...]
>
> Hello and thanks for your reply,
>
> We will run some tests to determine the size of our collection, but I
> think there won't be the need for a second shard at all. The problem is not
> the size or the growth of the docs, but there will be a quite high update
> frequency. So, if we have many bulk updates, is it reasonable to distribute
> the update load over multiple shards?
>
> Thanks & regards,
> Norman Lenzner
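For reference, the polling master/slave setup Erick describes is configured through the ReplicationHandler in solrconfig.xml; a minimal sketch (the hostname and the 5-minute pollInterval are illustrative, not taken from the thread):

    <!-- on the master -->
    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="master">
        <str name="replicateAfter">commit</str>
      </lst>
    </requestHandler>

    <!-- on each slave -->
    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="slave">
        <str name="masterUrl">http://master-host:8983/solr/replication</str>
        <str name="pollInterval">00:05:00</str>
      </lst>
    </requestHandler>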
Re: FilterCache - maximum size of document set
Hmmm, I think you may be looking at the wrong thing here. Generally, a filterCache entry will be maxDocs/8 (plus some overhead), so in your case they really shouldn't be all that large, on the order of 3M/filter. That shouldn't vary based on the number of docs that match the fq; it's just a bitset. To see if that makes any sense, take a look at the admin page and the number of evictions in your filterCache. If that is > 0, you're probably using all the memory you're going to in the filterCache during the day.

But you haven't indicated what version of Solr you're using; I'm going from a relatively recent 3.x knowledge base.

Have you put a memory analyzer against your Solr instance to see where the memory is being used?

Best
Erick

On Wed, Jun 13, 2012 at 1:05 PM, Pawel wrote:
> Hi,
> I have a Solr index with about 25M documents. I optimized the FilterCache size
> to reach the best performance (considering the traffic characteristics that my
> Solr handles). I see that the only way to limit the size of a FilterCache is to
> set the number of document sets that Solr can cache. There is no way to set a
> memory limit (e.g. 2GB, 4GB or something like that). When I process standard
> traffic (during the day) everything is fine. But when Solr handles night traffic
> (and the characteristics of requests change) some problems appear. There is a
> JVM out-of-memory error. I know what the reason is. Some filters on some
> fields are quite poor filters. They return 15M documents or even more.
> You could say 'Just put that into q'. I tried to put those filters into the
> "Query" part, but then the statistics of request processing time (during the
> day) became much worse. Reducing the FilterCache maxSize is also not a good
> solution, because during the day cached filters are very, very helpful.
> You may be interested in the type of filters that I use. These are range
> filters (I tried standard range filters and frange) - e.g. price:[* TO
> 1]. Some fq with price can return a few thousand results (e.g.
> price:[40 TO 50]), but some (e.g. price:[* TO 1]) can return millions of
> documents. I'd also like to avoid a solution which introduces strict
> ranges that the user can choose from.
> Have you any suggestions what I can do? Is there any way to limit, for
> example, the maximum size of a docSet which is cached in the FilterCache?
>
> --
> Pawel
Re: Solr1.4 and threads ....
On Wed, Jun 13, 2012 at 4:38 PM, Benson Margulies wrote:
> Does this suggest anything to anyone? Other than that we've
> misanalyzed the logic in the tokenizer and there's a way to make it
> burp on one thread?

it might suggest the different tokenstream instances refer to some shared object that is not thread safe: we had bugs like this before (e.g. sharing a JDK collator is ok, but ICU ones are not thread-safe, so you must clone them).

Because of this we beefed up our base analysis class (http://svn.apache.org/repos/asf/lucene/dev/branches/lucene_solr_3_6/lucene/test-framework/src/java/org/apache/lucene/analysis/BaseTokenStreamTestCase.java) to find thread safety bugs like this.

I recommend just grabbing the test-framework.jar (we release it as an artifact), extending that class, and writing a test like:

  public void testRandomStrings() throws Exception {
    checkRandomData(random, analyzer, 10);
  }

(or use the one in the branch, it's even been improved since 3.6)

--
lucidimagination.com
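Made self-contained, that test could look like this (a sketch; CustomAnalyzer is a hypothetical stand-in for whatever analyzer wraps the suspect tokenizer):

    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.BaseTokenStreamTestCase;

    public class CustomTokenizerThreadSafetyTest extends BaseTokenStreamTestCase {
        public void testRandomStrings() throws Exception {
            Analyzer analyzer = new CustomAnalyzer(); // hypothetical: wraps the tokenizer under test
            // feeds random text through the analyzer and fails on
            // inconsistent or thread-unsafe behavior
            checkRandomData(random, analyzer, 10000);
        }
    }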
Re: Getting maximum / minimum field value - slow query
A large start value is probably worse performing than the sort (see SOLR-1726). Once the sort field is cached, it'll be quick from then on. Put a warming query in solrconfig for newSearcher and/or firstSearcher that does this sort, and the cache will be built in advance of queries, at least.

	Erik

On Jun 13, 2012, at 16:09, Jack Krupansky wrote:
> Try the query without the sort to get the number of rows, then do a second
> query using a "start" equal to the number of rows. That should get you the
> last row/document.
>
> -- Jack Krupansky
>
> [...]
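A sketch of such a warming entry in solrconfig.xml (the field name Id follows the thread; an identical listener can be registered for the firstSearcher event):

    <listener event="newSearcher" class="solr.QuerySenderListener">
      <arr name="queries">
        <lst>
          <str name="q">*:*</str>
          <str name="sort">Id desc</str>
          <str name="rows">1</str>
        </lst>
      </arr>
    </listener>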
Re: FilterCache - maximum size of document set
Thanks for your response. Yes, maybe you are right. I thought that filters could be larger than 3M. Do all kinds of filters use a BitSet?
Moreover, maxSize of the filterCache is set to 16000 in my case. There are evictions during day traffic, but not during night traffic.
The version of Solr which I use is 3.5.
I haven't used a memory analyzer yet. Could you write more details about it?

--
Regards,
Pawel

On Wed, Jun 13, 2012 at 10:55 PM, Erick Erickson wrote:
> Hmmm, I think you may be looking at the wrong thing here. Generally, a
> filterCache entry will be maxDocs/8 (plus some overhead), so in your case
> they really shouldn't be all that large, on the order of 3M/filter. That
> shouldn't vary based on the number of docs that match the fq; it's just a
> bitset.
>
> [...]
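For a rough sense of scale, a back-of-the-envelope worst case from the numbers in this thread (assuming every cached entry is a full bitset; Solr stores small result sets more compactly, so real usage sits below this ceiling):

    25,000,000 docs / 8 bits per byte ≈ 3.1 MB per bitset entry
    16,000 entries × 3.1 MB ≈ 50 GB worst case

So a cache that fills with large-result filters - like the night-time price:[* TO ...] ranges - can exhaust the heap even though each individual entry is only a few megabytes.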
Regarding number of documents
Hi,

I have a data config file that contains the data import query. If I just run the import query against MySQL, I get a certain number of results. I assume that if I run a full-import, I should get the same number of documents added to the index, but I see that it's not the case, and the number of documents added to the index is less than what I see from the MySQL query result. Can anyone tell me if my assumption is correct and why the number of documents would be off?

Thanks,
Swetha
Re: Regarding number of documents
Note: I don't see any errors in the logs when I run the import.

On Wed, Jun 13, 2012 at 5:48 PM, Swetha Shenoy wrote:
> Hi,
>
> I have a data config file that contains the data import query. If I just
> run the import query against MySQL, I get a certain number of results. I
> assume that if I run a full-import, I should get the same number of
> documents added to the index, but I see that it's not the case, and the
> number of documents added to the index is less than what I see from the
> MySQL query result. Can anyone tell me if my assumption is correct and why
> the number of documents would be off?
>
> Thanks,
> Swetha
Re: Regarding number of documents
Could it be that you are getting records that are not unique? If so, then Solr would just overwrite the non-unique documents.

Thanks
Afroz

On Wed, Jun 13, 2012 at 4:50 PM, Swetha Shenoy wrote:
> Note: I don't see any errors in the logs when I run the import.
>
> [...]
Re: Regarding number of documents
That makes sense. But I added a new entry that showed up in the MySQL results and not in the Solr search results. The count of documents also did not increase after the addition. How can a new entry show up in MySQL results and not as a new document?

On Wed, Jun 13, 2012 at 6:26 PM, Afroz Ahmad wrote:
> Could it be that you are getting records that are not unique? If so, then
> Solr would just overwrite the non-unique documents.
>
> Thanks
> Afroz
>
> [...]
Re: Regarding number of documents
Check the ID for that latest record and try to query it in Solr.

One way you can get multiple records in an RDBMS query is via a join. In that case, each of the records could have the same value in the column(s) that you are using for your unique key field in Solr.

-- Jack Krupansky

-Original Message- From: Swetha Shenoy
Sent: Wednesday, June 13, 2012 7:21 PM
To: solr-user@lucene.apache.org
Subject: Re: Regarding number of documents

That makes sense. But I added a new entry that showed up in the MySQL results and not in the Solr search results. The count of documents also did not increase after the addition. How can a new entry show up in MySQL results and not as a new document?

[...]
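A quick check along the lines Jack suggests (a sketch - the host and the uniqueKey field name "id" are assumptions, as is the example value):

    http://localhost:8983/solr/select?q=id:12345

If the document exists but carries data from a different MySQL row, the import is collapsing several rows onto one unique key.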
Re: Regarding number of documents
On 14 June 2012 04:51, Swetha Shenoy wrote:
> That makes sense. But I added a new entry that showed up in the MySQL
> results and not in the Solr search results. The count of documents also
> did not increase after the addition. How can a new entry show up in MySQL
> results and not as a new document?

Sorry, but this is not very clear: Are you running a full-import, or a delta-import, after adding the new entry in MySQL? By any chance, does the new entry have an ID that already exists in the Solr index? What is the number of records that DIH reports after an import is completed?

Regards,
Gora
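The count Gora asks about can be read from DIH's status response (a sketch; the host and the default /dataimport handler path are assumptions):

    http://localhost:8983/solr/dataimport?command=status

Comparing "Total Rows Fetched" with "Total Documents Processed" there shows whether rows came back from MySQL but collapsed onto existing unique keys.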
Re: Unexpected DIH behavior for onError attribute
On 13 June 2012 10:45, Pranav Prakash wrote: > My DIH Config file goes as follows. We have two db hosts, one of which > contains blocks of content and the other contain transcripts of those > content blocks. The makeDynamicTranscript function is used to create row > names like transcript_en, transcript_es and so on, which are dynamic fields > in Solr with appropriate tokenizers. [...] This looks fine. Have you looked in the Solr logs for more information? Is it possible that the error is causing some connection issue? What is the error exactly, and is it happening on the SELECT in the inner entity, or on the outer one? Regards, Gora
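For reference, onError is set per entity in the DIH config; a minimal sketch of the shape described in the thread (the table and column names and the script body are placeholders, not the poster's actual config):

    <dataConfig>
      <script><![CDATA[
        // placeholder: builds dynamic keys such as transcript_en, transcript_es, ...
        function makeDynamicTranscript(row) { return row; }
      ]]></script>
      <document>
        <entity name="block" query="SELECT id, content FROM blocks">
          <entity name="transcript"
                  onError="continue"
                  transformer="script:makeDynamicTranscript"
                  query="SELECT lang, text FROM transcripts WHERE block_id = '${block.id}'"/>
        </entity>
      </document>
    </dataConfig>

With onError="continue" (the other values are "abort", the default, and "skip"), a failing row in the inner entity is logged and the import carries on.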
Re: LockObtainFailedException after trying to create cores on second SolrCloud instance
Will check later using different data dirs for the core on each instance. But because each Solr sits in its own OpenVZ instance (virtual server, respectively), they should be totally separated - at least from my understanding of virtualization. Will check and get back here... Thanks.

On Wed, Jun 13, 2012 at 8:10 PM, Mark Miller wrote:
> That's an interesting data dir location: NativeFSLock@/home/myuser/data/index/write.lock
>
> Where are the other data dirs located? Are you sharing one drive or
> something? It looks like something already has a writer lock - are you
> sure another Solr instance is not running somehow?
>
> [...]