Issues With Solr Cloud Setup
Hi, I was going through the SolrCloud documentation (http://wiki.apache.org/solr/SolrCloud) and was trying out a simple setup. I could execute examples A and B successfully. However, when I try example C, I get the following exception:

May 14, 2010 12:22:29 PM org.apache.log4j.Category warn
WARNING: Exception closing session 0x0 to sun.nio.ch.selectionkeyi...@1385660
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:933)
May 14, 2010 12:22:29 PM org.apache.log4j.Category warn
WARNING: Ignoring exception during shutdown input
java.nio.channels.ClosedChannelException
        at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
        at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:999)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:970)
May 14, 2010 12:22:29 PM org.apache.log4j.Category warn

This exception is probably because I have not set localhost in my /etc/hosts (though I still wonder how the first two examples worked).
I tried replacing localhost with my hostname:

java -Dbootstrap_confdir=./solr/conf -Dcollection.configName=myconf -DzkRun -DzkHost=raakhi:9983,raakhi:8574,raakhi:9900 -jar start.jar

and I get the following exception:

java.lang.IllegalArgumentException: solr/zoo_data/myid file is missing
        at org.apache.solr.cloud.SolrZkServerProps.parseProperties(SolrZkServer.java:453)
        at org.apache.solr.cloud.SolrZkServer.parseConfig(SolrZkServer.java:83)
        at org.apache.solr.core.CoreContainer.initZooKeeper(CoreContainer.java:109)
        at org.apache.solr.core.CoreContainer.load(CoreContainer.java:344)
        at org.apache.solr.core.CoreContainer.load(CoreContainer.java:298)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:213)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:88)
        at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
        at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
        at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218)
        at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
        at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
        at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
        at org.mortbay.jetty.Server.doStart(Server.java:210)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.mortbay.start.Main.invokeMain(Main.java:183)
        at org.mortbay.start.Main.start(Main.java:497)
        at org.mortbay.start.Main.main(Main.java:115)

Am I going wrong somewhere? Regards, Raakhi
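The "solr/zoo_data/myid file is missing" error means that when you list multiple servers in zkHost with zkRun, each embedded ZooKeeper peer needs a myid file in its data directory identifying which member of the ensemble it is. A hedged sketch of creating those files (the directory layout is illustrative; adjust the zoo_data paths to match where each of your Solr instances actually runs):

```shell
# Each ZooKeeper peer reads zoo_data/myid to learn its server number.
# Illustrative layout: one working directory per Solr/ZooKeeper instance.
for id in 1 2 3; do
  mkdir -p "example$id/solr/zoo_data"
  echo "$id" > "example$id/solr/zoo_data/myid"
done
cat example2/solr/zoo_data/myid   # the file contains just the server id
```

The number in each myid file has to match that server's position in the ensemble configuration.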
Facet Queries
Hi, when I use facet queries, what is the default size of the results returned? How do I configure it if I want all the results shown? Regards, Raakhi
Re: Facet Queries
Hey, there's plenty of documentation about that: http://wiki.apache.org/solr/SimpleFacetParameters#Field_Value_Faceting_Parameters On Fri, May 14, 2010 at 10:38 AM, Rakhi Khatwani wrote: > Hi, > when I use facet queries, what is the default size of the results > returned? How do I configure it if I want all the results shown? > > Regards > Raakhi >
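To spell out the parameters that page points to: facet.limit defaults to 100, and a negative value removes the limit entirely. A sketch of a request returning all facet values (the field name is illustrative):

```
http://localhost:8983/solr/select?q=*:*&facet=true&facet.field=category&facet.limit=-1&facet.mincount=1
```

facet.mincount=1 is optional here; it just suppresses values with zero matching documents.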
disable caches in real time
Hi, I want to know if there is any way to disable the caches in a specific core of a multicore server. My situation is the following: I have a multicore server where one core (core0) listens for queries and another core (core1) is replicated from a master server. Once the replication has been done, I swap the cores. My point is that I want to disable the caches in the core that is in charge of the replication, to save memory on the machine. Any suggestions will be appreciated. Thanks in advance, Marco Martínez Bautista http://www.paradigmatecnologico.com Avenida de Europa, 26. Ática 5. 3ª Planta 28224 Pozuelo de Alarcón Tel.: 91 352 59 42
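Cache sizes live in each core's solrconfig.xml, so one hedged approach is to point the replication-only core at its own solrconfig.xml with the caches sized down to nothing. A sketch of the relevant fragment (element names follow the standard solrconfig.xml layout; whether size="0" is acceptable for your cache classes should be verified against your Solr version):

```xml
<!-- solrconfig.xml for the replication-only core (core1): minimal caches -->
<query>
  <filterCache      class="solr.FastLRUCache" size="0" initialSize="0" autowarmCount="0"/>
  <queryResultCache class="solr.LRUCache"     size="0" initialSize="0" autowarmCount="0"/>
  <documentCache    class="solr.LRUCache"     size="0" initialSize="0" autowarmCount="0"/>
</query>
```

After the swap, the roles of the two solrconfig.xml files swap too, so keep in mind the serving core would then be the one with tiny caches unless you swap configurations as well.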
Re: Facet Queries
Hi, Thanks a lot. I had a look at that and it solved my problem. Thanks once again. Regards, Raakhi On Fri, May 14, 2010 at 2:13 PM, Leonardo Menezes < leonardo.menez...@googlemail.com> wrote: > Hey, > there's plenty of documentation about that... > > http://wiki.apache.org/solr/SimpleFacetParameters#Field_Value_Faceting_Parameters > > On Fri, May 14, 2010 at 10:38 AM, Rakhi Khatwani > wrote: > > > Hi, > > whn i use facet queries, whats the default size of the results > > returned? how do we configure if we want all the results shown? > > > > Regards > > Raakhi > > >
Re: SolrUser - ERROR:SCHEMA-INDEX-MISMATCH
Thanks for the help. The field is just for filtering my data. The fields are: client_id, instance_id. When I index my data, I store the identifier of the client (because my application is multi-client). When I search in Solr, I want to find the docs where client_id:1, for example. With the field declared as a string, this works. When I saw that I could declare the field as long, I thought that might be better practice. But my problem is that I already have many docs indexed, so since changing to long now is a bad idea, I will keep the field as a string type. (Correct me if I am wrong.) Thanks 2010/5/13 Erick Erickson > This is probably a bad idea. You're getting by on backwards > compatibility stuff, I'd really recommend that you reindex your > entire corpus, possibly getting by on what you already have > until you can successfully reindex. > > Have a look at trie fields (this is detailed in the example > schema.xml). Here's another place to look: > > http://www.lucidimagination.com/blog/2009/05/13/exploring-lucene-and-solrs-trierange-capabilities/ > > You also haven't told us what you want to do with > the field, so making recommendations is difficult. > > Best > Erick > > On Thu, May 13, 2010 at 5:19 PM, Anderson vasconcelos < > anderson.v...@gmail.com> wrote: > > > Hi Erick. > > I put in my schema.xml fields with type string. The system went to > > production, and now I see that the field must be a long field. > > > > When I change the fieldtype to long, it shows the error > > ERROR:SCHEMA-INDEX-MISMATCH when I search via the Solr admin. > > > > I put "plong", and this works. Is this the way that I should go? (Could > > this generate trouble in the future?) > > > > What are the advantages of setting the field type to long? Should I keep this > > field as a string type? > > > > Thanks > > > > 2010/5/13 Erick Erickson > > > > > Not at present, you must re-index your documents when you redefine your > > > schema > > > to change existing documents. 
> > > > > > Field updating of documents already indexed is being worked on, but > it's > > > not > > > available yet. > > > > > > Best > > > Erick > > > > > > On Thu, May 13, 2010 at 3:58 PM, Anderson vasconcelos < > > > anderson.v...@gmail.com> wrote: > > > > > > > Hi All. > > > > > > > > I have the follow fields in my schema: > > > > > > > default="NEW"/> > > > > > > stored="true" > > > > required="true"/> > > > > > stored="true" > > > > required="true"/> > > > > > > > required="true"/> > > > > > multiValued="false" > > > > indexed="true" stored="true"/> > > > > > > > required="false"/> > > > > > > > required="false"/> > > > > > > > > I need to change the index of SOLR, adding a dynamic field that will > > > > contains all values of "value" field. Its possible to get all index > > data > > > > and > > > > reindex, putting the values on my dynamic field? > > > > > > > > How the data was no stored, i don't find one way to do this > > > > > > > > Thanks > > > > > > > > > >
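For reference, the trie-based long type Erick points at looks roughly like this in the Solr 1.4 example schema.xml (the precisionStep value shown is the example default; switching client_id over to it still requires a full reindex, as discussed above):

```xml
<!-- trie-encoded long: fast range queries at the cost of extra index terms -->
<fieldType name="tlong" class="solr.TrieLongField" precisionStep="8"
           omitNorms="true" positionIncrementGap="0"/>

<!-- hypothetical redeclaration of the field from this thread -->
<field name="client_id" type="tlong" indexed="true" stored="true" required="true"/>
```

The "plong" type that appeared to work is the older sortable-long representation; the SCHEMA-INDEX-MISMATCH error appears because documents indexed as strings cannot be reinterpreted as numeric values without reindexing.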
Connection Pool
Hi, I want to know if there is any connection pool client to manage the connections to Solr. In my system we have a lot of concurrent index requests. I can't share my connection; I need to create one per transaction. But if I create one per transaction, I think performance will suffer. How do you solve this problem? Thanks
Re: Connection Pool
On Fri, May 14, 2010 at 3:35 PM, Anderson vasconcelos wrote: > Hi > I want to know if there is any connection pool client to manage the connections > to Solr. In my system we have a lot of concurrent index requests. I can't > share my connection; I need to create one per transaction. But if I create > one per transaction, I think performance will suffer. > > How do you solve this problem? The CommonsHttpSolrServer class does connection pooling, and IIRC so does the StreamingUpdateSolrServer. -- blog en: http://www.riffraff.info blog it: http://riffraff.blogsome.com
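A minimal SolrJ sketch of the shared-instance pattern being suggested (the URL and field names are illustrative; CommonsHttpSolrServer is designed to be shared across threads, with connection pooling handled by the underlying HttpClient, so there is no need to create one per transaction):

```java
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class SolrIndexer {
    // One shared, thread-safe instance for the whole application.
    private static final CommonsHttpSolrServer SERVER;
    static {
        try {
            SERVER = new CommonsHttpSolrServer("http://localhost:8983/solr");
            SERVER.setConnectionTimeout(1000); // illustrative tuning values
        } catch (Exception e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public static void index(String id, String clientId) throws Exception {
        SolrInputDocument doc = new SolrInputDocument();
        doc.addField("id", id);
        doc.addField("client_id", clientId);
        SERVER.add(doc); // safe to call concurrently from many threads
    }
}
```

For heavy indexing loads, StreamingUpdateSolrServer (which batches updates over a fixed number of threads and connections) is usually the better fit, as the reply notes.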
Recommended MySQL JDBC driver
Which driver is the "best" for use with solr? I am currently using mysql-connector-java-5.1.12-bin.jar in my production setting. However, I recently tried downgrading and did some quick indexing using mysql-connector-java-5.0.8-bin.jar, and I saw close to a 2x improvement in speed! Unfortunately I kept getting the following error using the 5.0.8 version: "Caused by: com.mysql.jdbc.CommunicationsException: The last communications with the server was 474 seconds ago, which is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem." I tried setting autoReconnect="true" in my datasource configuration but I keep getting the same error. Any ideas? -- View this message in context: http://lucene.472066.n3.nabble.com/Recommended-MySQL-JDBC-driver-tp817458p817458.html Sent from the Solr - User mailing list archive at Nabble.com.
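One thing worth checking: Connector/J connection properties generally need to go on the JDBC URL itself rather than as attributes on the DataImportHandler dataSource element, where unknown attributes may simply be ignored. A hedged data-config.xml sketch (database name and credentials are placeholders; batchSize="-1" is the commonly recommended DIH setting for MySQL, enabling streaming result sets instead of buffering the whole table in memory):

```xml
<dataSource type="JdbcDataSource"
            driver="com.mysql.jdbc.Driver"
            url="jdbc:mysql://localhost:3306/mydb?autoReconnect=true"
            user="solr" password="secret"
            batchSize="-1"/>
```

Note that autoReconnect has long been discouraged by the MySQL documentation itself; raising wait_timeout on the server is usually the more reliable fix for imports with long idle gaps between queries.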
Re: Recommended MySQL JDBC driver
I would like to know the same thing. I'm using 5.1.12 myself. A full reindex of one of my shards takes 4-6 hours for 7 million rows, depending on whether I run them one at a time or all at once. If I run the same query on the same machine with the commandline client and write the results to a file, it takes 10-15 minutes. I know that reindexing will never be as fast as a raw data dump, but it does seem like it could be faster. I had been using an older release (which I think was 5.1.7, but I can't remember) that was actually slower than this one. I haven't tried the 5.0.8 version myself. It is well over two years old, so I'm not surprised it's buggy, especially if everything else (mysql server, java, etc) is running recent versions. There is mention in the change log for 5.1.13 (not yet released) about a performance regression, but when I look at the bug and follow the references, it sounds like the regression was actually introduced by a change after 5.1.12, so it may not be any faster. Regarding the speed problem in 5.1.12, you could try a development snapshot on a system not in production, which I will also do on mine. If that doesn't improve the situation, consider filing a bug on the MySQL connector. http://downloads.mysql.com/snapshots.php As far as the error in the 5.0.8 version, does the import work, or does it fail when the exception is thrown? I have seen a lot of messages about autoreconnect not working for people, but no bug filed, and from what I can tell, autoreconnect does work, but throws the exception even when it's working. You might also try doing as it says and increasing the timeout on the server. Thanks, Shawn On 5/14/2010 8:56 AM, Blargy wrote: Which driver is the "best" for use with solr? I am currently using mysql-connector-java-5.1.12-bin.jar in my production setting. However I recently tried downgrading and did some quick indexing using mysql-connector-java-5.0.8-bin.jar and I close to a 2x improvement in speed!!! 
Unfortunately I kept getting the following error using the 5.0.8 version: "Caused by: com.mysql.jdbc.CommunicationsException: The last communications with the server was 474 seconds ago, which is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem." I tried setting the autoReconnect="true" in my datasource configuration but I keep getting the same error. Any ideas?
Re: Recommended MySQL JDBC driver
Shawn, first off thanks for the reply and links! "As far as the error in the 5.0.8 version, does the import work, or does it fail when the exception is thrown?" - The import "works" for about 5-10 minutes, then it fails and everything is rolled back once the above exception is thrown. "You might also try doing as it says and increasing the timeout on the server" - How is this accomplished? I tried the "maxWait" option on the datasource in data-config.xml but that didn't seem to work. I'm also torn on whether or not I should file a bug that may or may not exist. The whole reason I tried downgrading to 5.0.8 was that during certain (not all) delta-imports I keep getting the following error, which seems to be entirely MySQL related:

SEVERE: Delta Import Failed
java.lang.StackOverflowError
        at java.net.SocketOutputStream.socketWrite0(Native Method)
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
        at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
        at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
        at com.mysql.jdbc.MysqlIO.send(MysqlIO.java:3296)
        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1941)
        at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2114)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2690)
        at com.mysql.jdbc.StatementImpl.executeSimpleNonQuery(StatementImpl.java:1545)
        at com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:201)
        at com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:7624)
        at com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:908)
        at com.mysql.jdbc.StatementImpl.realClose(StatementImpl.java:2364)
        at com.mysql.jdbc.ConnectionImpl.closeAllOpenStatements(ConnectionImpl.java:1583)
        at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4454)
        at com.mysql.jdbc.ConnectionImpl.cleanup(ConnectionImpl.java:1359)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2723)
        at com.mysql.jdbc.StatementImpl.executeSimpleNonQuery(StatementImpl.java:1545)
        at com.mysql.jdbc.RowDataDynamic.close(RowDataDynamic.java:201)
        at com.mysql.jdbc.ResultSetImpl.realClose(ResultSetImpl.java:7624)
        at com.mysql.jdbc.ResultSetImpl.close(ResultSetImpl.java:908)
        at com.mysql.jdbc.StatementImpl.realClose(StatementImpl.java:2364)
        at com.mysql.jdbc.ConnectionImpl.closeAllOpenStatements(ConnectionImpl.java:1583)
        at com.mysql.jdbc.ConnectionImpl.realClose(ConnectionImpl.java:4454)
        at com.mysql.jdbc.ConnectionImpl.cleanup(ConnectionImpl.java:1359)
        at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2723)
        ... and it keeps going (the same frames repeat)

Once the above exception occurs I can never delta-import again against that index; I am then forced to do a full-import. Do you have any thoughts or suggestions on that? Should I file this as a MySQL bug? Thanks again for your help. I'll try playing around with the latest versions of the connector and I'll post my results. -- View this message in context: http://lucene.472066.n3.nabble.com/Recommended-MySQL-JDBC-driver-tp817458p817790.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Recommended MySQL JDBC driver
Hi, just FYI, I am using mysql-connector-java-5.1.10-bin.jar and my full import takes about 3 hours, and I am not experiencing crashes. regards, Lukas
Re: Recommended MySQL JDBC driver
Lukas, was there a reason you went with 5.1.10, or was it just the latest when you started your Solr project? Also, how many items are in your index and how big is your index size? Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/Recommended-MySQL-JDBC-driver-tp817458p817855.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Recommended MySQL JDBC driver
On 14.05.2010, at 19:16, Blargy wrote: > > Lucas.. was there a reason you went with 5.1.10 or was it just the latest > when you started your Solr project? Just what was recent when I set things up. > Also, how many items are in your index and how big is your index size? The index size is 4.6GB with about 16M entities. regards, Lukas Kahwe Smith m...@pooteeweet.org
Re: Recommended MySQL JDBC driver
>> Lucas.. was there a reason you went with 5.1.10 or was it just the latest
>> when you started your Solr project?
> just what was recent when i set things up.
>> Also, how many items are in your index and how big is your index size?
> index size is 4.6GB with about 16M entities.

I downgraded to 5.0.8 for testing. Initially I thought it was going to be faster, but it slows down as it gets further into the index. It now looks like it's probably going to take the same amount of time. On the server timeout thing - that's a setting you'd have to put in my.ini or my.cnf; there may also be a way to change it on the fly without restarting the server. I suspect that when you are running a multiple-query setup like yours, it opens multiple connections, and when one of them is busy doing some work, the others are idle. That may be related to the timeout with the older connector version. On my setup, I only have one query that retrieves records, so I'm probably not going to run into that. I could be wrong about how it works - you can confirm or refute this idea by looking at SHOW PROCESSLIST on your MySQL server while it's working. Each of my shards is reported by the replication handler at a little over 12GB. This is with 7 million entities. Shawn
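To make the server-side timeout change concrete, the usual knobs look like this (values are illustrative; SET GLOBAL avoids a restart but only affects connections opened after the change):

```sql
-- On a running MySQL server (new connections only):
SET GLOBAL wait_timeout = 28800;

-- Or persistently, in my.cnf / my.ini under the [mysqld] section:
--   wait_timeout = 28800
```

A connection can then sit idle for up to eight hours before the server drops it, which should comfortably cover gaps between DIH queries during an import.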
How to tell which field matched?
All, I've searched around for help with something we are trying to do and haven't come across much. We are running Solr 1.4. Here is a summary of the issue we are facing. A simplified example of our schema is something like this (the schema snippet was not preserved in the archive): When someone does a search, we search across the title, supplement_title, and supplement_pdf_text fields. When we get our results, we would like to be able to tell which field the search matched and, if it's a multiValued field, which of the multiple values matched. This is so that we can display results similar to:

Example Title
    Example Supplement Title
    Example Supplement Title 2 (your search matched this document)
    Example Supplement Title 3

Example Title 2
    Example Supplement Title 4
    Example Supplement Title 5
    Example Supplement Title 6 (your search matched this document)

etc.

How would you recommend doing this? Is there some way to get Solr to tell us which field matched, including multiValued fields? As a workaround we have been using highlighting to tell which field matched, but it doesn't get us what we want for multiValued fields, and there is a significant cost to enabling highlighting. Should we design our schema in some other fashion to achieve these results? Thanks. -Tim
Re: How to tell which field matched?
Does the standard debug component (?debugQuery=on) give you what you need? http://wiki.apache.org/solr/SolrRelevancyFAQ#Why_does_id:archangel_come_before_id:hawkgirl_when_querying_for_.22wings.22 - Jon On May 14, 2010, at 4:03 PM, Tim Garton wrote: > All, > I've searched around for help with something we are trying to do > and haven't come across much. We are running solr 1.4. Here is a > summary of the issue we are facing: > > A simplified example of our schema is something like this: > >/> >required="true" /> > >stored="true" multiValued="true" /> >stored="true" multiValued="true" /> >stored="true" multiValued="true" /> > > When someone does a search we search across the title, > supplement_title, and supplement_pdf_text fields. When we get our > results, we would like to be able to tell which field the search > matched and if it's a multiValued field, which of the multiple values > matched. This is so that we can display results similar to: > >Example Title >Example Supplement Title >Example Supplement Title 2 (your search matched this document) >Example Supplement Title 3 > >Example Title 2 >Example Supplement Title 4 >Example Supplement Title 5 >Example Supplement Title 6 (your search matched this document) > >etc. > > How would you recommend doing this? Is there some way to get solr to > tell us which field matched, including multiValued fields? As a > workaround we have been using highlighting to tell which field > matched, but it doesn't get us what we want for multiValued fields and > there is a significant cost to enabling the highlighting. Should we > design our schema in some other fashion to achieve these results? > Thanks. > > -Tim
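A sketch of the debug request being suggested (the query and field list are illustrative): the explain section of the response breaks down, per matching document, which fields and terms contributed score, which answers the "which field matched" part, though not which individual value of a multiValued field:

```
http://localhost:8983/solr/select?q=title:wings+OR+supplement_title:wings&fl=id,score&debugQuery=on
```

Note that parsing the explain output is meant for humans and debugging rather than for production display logic, so it may be too slow and brittle for rendering result lists like the one above.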
Autosuggest
What is the preferred way to implement this feature? Using facets or the terms component (or maybe something entirely different). Thanks in advance! -- View this message in context: http://lucene.472066.n3.nabble.com/Autosuggest-tp818430p818430.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: bi-directional replication on solr 1.4?
: It looks like SnapPuller.java doesn't allow for the possibility of the : slave having a later index version than the master. It only checks : whether the versions are equal. : : It's easy enough to add that check and prevent the index fetch when : the slave has a later version (in fact I'm running it in a sandbox I'm not 100% positive, but I believe a change like that could cause problems if the index on the master is completely rebuilt from scratch. indexVersion is guaranteed to increase as the index is modified (i.e. add or merge segments), but I think an entirely new index (i.e. delete the entire index directory, as deleteByQuery("*:*") does, and then reindex) could conceivably result in a new index with a lower indexVersion number than the index it replaces. Yonik / Miller: does the SolrCloud branch already have support for master failover in a situation like this (i.e. a two-node "cloud")? -Hoss
Re: Autosuggest
Easiest and oldest is wildcards on facets. Next easiest, and more useful, is building from spelling suggestions - but that might be slow. The Terms component? I guess. It and facets allow limiting the suggestion pool with searches; using the spelling database does not allow this. On Fri, May 14, 2010 at 2:39 PM, Blargy wrote: > > What is the preferred way to implement this feature? Using facets or the > terms component (or maybe something entirely different). Thanks in advance! > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Autosuggest-tp818430p818430.html > Sent from the Solr - User mailing list archive at Nabble.com. > -- Lance Norskog goks...@gmail.com
Re: Autosuggest
I think the TermsComponent is the recommended way. That's how we have implemented it, and from the Solr wiki page http://wiki.apache.org/solr/TermsComponent: "... each term. This can be useful for doing auto-suggest or other things that operate ..." You have to be careful how you filter the field during indexing, though, otherwise you can get weird suggestions based on Latin roots (due to stemming) among other things. We used something like: (the config snippet was not preserved in the archive) -Tim On Fri, May 14, 2010 at 2:39 PM, Blargy wrote: > > What is the preferred way to implement this feature? Using facets or the > terms component (or maybe something entirely different). Thanks in advance! > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Autosuggest-tp818430p818430.html > Sent from the Solr - User mailing list archive at Nabble.com. >
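Since the analyzer config in the message above did not survive the archive, here is a hedged reconstruction of a typical autosuggest-friendly chain in the spirit described (no stemming, so suggestions look like real words; names are illustrative):

```xml
<!-- illustrative field type: lowercase tokens, deliberately no stemming -->
<fieldType name="autosuggest" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

A matching TermsComponent request (assuming a /terms handler is configured, as in the Solr 1.4 example solrconfig.xml) might then look like /terms?terms.fl=suggest_field&terms.prefix=ip&terms.limit=10.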
Re: Autosuggest
"Easiest and oldest is wildcards on facets. " - Does this allow partial matching or is this only prefix matching? "It and facets allow limiting the database with searches. Using the spelling database does not allow this." - What do you mean? So there is no generally accepted preferred way to do auto-suggest? -- View this message in context: http://lucene.472066.n3.nabble.com/Autosuggest-tp818430p818705.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Autosuggest
Thanks for your help and especially your analyzer.. probably saved me a full-import or two :) -- View this message in context: http://lucene.472066.n3.nabble.com/Autosuggest-tp818430p818712.html Sent from the Solr - User mailing list archive at Nabble.com.