Couple of problems
Hi,

I have installed Solr under a standalone Tomcat 5.5 installation. I can see the admin screens etc. When I submit documents I get this error:

    Oct 11, 2006 10:05:44 AM org.apache.solr.core.SolrException log
    SEVERE: java.lang.NullPointerException
        at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:78)
        at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:74)
        at org.apache.solr.core.SolrCore.readDoc(SolrCore.java:917)
        at org.apache.solr.core.SolrCore.update(SolrCore.java:685)
        at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:52)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:709)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

My docs follow this schema: stored="true"/>

Also, since getting this error I can no longer see part of the solr/admin/stats.jsp screen: the boxes core, update, cache and other are now empty. I deleted and reinstalled Solr (including the unpacked webapps dir) but not Tomcat, and the problem is still there.

cheers
mark
Re: Couple of problems
Check the tomcat logs... most probably there is a conflict with the field definitions in your schema.xml
Re: Couple of problems
Hi,

there are no errors while reading the schema:

    Oct 11, 2006 9:56:43 AM org.apache.solr.schema.IndexSchema readConfig
    INFO: Reading Solr Schema
    Oct 11, 2006 9:56:43 AM org.apache.solr.schema.IndexSchema readConfig
    INFO: Schema name=archive
    Oct 11, 2006 9:56:43 AM org.apache.solr.schema.IndexSchema readConfig
    INFO: default search field is content
    Oct 11, 2006 9:56:43 AM org.apache.solr.schema.IndexSchema readConfig
    INFO: query parser default operator is OR
    Oct 11, 2006 9:56:43 AM org.apache.solr.servlet.SolrUpdateServlet init
    INFO: SolrUpdateServlet.init() done

and then the first error is the one I reported when I submit a document.

I am looking in catalina.out - are there any other logs I should look at?

cheers
mark

On 11 Oct 2006, at 10:40, Panayiotis Papadopoulos wrote:
> Check the tomcat logs... most probably there is a conflict with the field definitions in your schema.xml
Re: Couple of problems
How do you post the documents to Solr? Via PHP, JSP or something like that? If you use curl from PHP, JSP or ASP, you can see the error that Solr returns. In PHP, using curl, I found the error with this:

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_TIMEOUT, 4);
    curl_setopt($ch, CURLOPT_CUSTOMREQUEST, $header);
    $data = curl_exec($ch);

and then I printed $data. My schema was parsed successfully, but in the XML I was using variables a bit different from those in the schema, plus there were some logical errors in the schema. So try to find the Solr runtime errors using a solution like the one above.
Re: Couple of problems
That just returns the null pointer exception. I have checked my schema and doc:

Schema: stored="true"/>

Template: doc = """ %s %s %s %s %s %s """

On 11 Oct 2006, at 12:19, Panayiotis Papadopoulos wrote:
> How do you post the documents to solr ? Via php, jsp or smth like that ? [...]
Invalid XML in response
Hi,

I don't understand why Solr returns an invalid XML file as a response when we insert a document with a field that is not defined in the Solr configuration. Is there any purpose for that?

It would be nice if it returned valid XML.

regards
Przemek Brzozowski

    ERROR:unknown field 'aaa'status="1">org.xmlpull.v1.XmlPullParserException: expected START_TAG or END_TAG not END_DOCUMENT (position: END_DOCUMENT seen ...\n\n... @9:1)
        at org.xmlpull.mxp1.MXParser.nextTag(MXParser.java:1083)
        at org.apache.solr.core.SolrCore.update(SolrCore.java:681)
        at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:52)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:709)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
        at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
        at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
        at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
        at java.lang.Thread.run(Thread.java:595)
Re: Couple of problems
Are you ensuring that the %s replacements are properly encoded for XML?

Erik

On Oct 11, 2006, at 7:54 AM, mark wrote:
> that just returns the null pointer exception. I have checked my schema and doc: [...]
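A minimal sketch of what "properly encoded" could look like for a %s-style template. The template and the field names (id, title) are hypothetical placeholders, since the poster's actual schema and template did not survive in the thread; the only library call used is xml.sax.saxutils.escape from the Python standard library.

```python
from xml.sax.saxutils import escape

# Hypothetical template: the field names are placeholders, not the poster's schema.
DOC_TEMPLATE = """<add><doc>
  <field name="id">%s</field>
  <field name="title">%s</field>
</doc></add>"""


def build_doc(doc_id, title):
    """Return an update message with both substituted values XML-escaped."""
    # escape() replaces &, < and > so the values cannot break the markup.
    return DOC_TEMPLATE % (escape(str(doc_id)), escape(str(title)))


message = build_doc(42, "Tom & Jerry <uncut>")
print(message)
```

Escaping each value before substitution keeps the message well-formed even when the data contains markup characters. It would not by itself explain a NullPointerException on the server side.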
Re: Couple of problems
I believe so - an earlier attempt did fail in that department but the result was an XML parsing error (as you might expect).

On 11 Oct 2006, at 14:19, Erik Hatcher wrote:
> Are you ensuring that the %s replacements are properly encoded for XML?
Solr use case
Hi all,

Is it true that Solr is mainly used for applications that rarely change the underlying data? As I understand it, if you submit new data or modify existing data on the Solr server, you have to "refresh" the cache somehow to display the updated data. If my application frequently gets new data/updates from users, should I use Solr? I love faceted browsing and dynamic properties so much, but I need to justify the choice of Solr. Thanks.

By the way, does anyone have any performance measurements that can be shared (apart from the one on the Wiki)? By my estimate, my application will probably have half a million docs, each of which has around 15 properties. Does anyone know the type of hardware I would need for reasonable performance? Thanks.

--
Regards,
Cuong Hoang
Re: Couple of problems
Wow ... this is crazy looking ... as far as i can tell the only way to get an NPE at that line is if the DocumentBuilder is being given a null IndexSchema when it's constructed. I don't know how that would happen.

can you zip up your solr/conf (so we have the schema and the config) and post it online somewhere?

: admin/stats.jsp screen - the boxes core, update , cache and other are
: now empty. I deleted and reinstalled solr (including the unpacked
: webapps dir) but not tomcat and the problem is still there

that's really weird ... it suggests that the info registry is being emptied out ... you said this problem continued after re-installing, i assume you stopped/started the port as well?

-Hoss
Re: Couple of problems
> can you zip up your solr/conf (so we have the schema and the config) and post it online somewhere?

http://www.pagefall.com/cnf.zip

But this is a right out of the box install - I have only messed with the schema to suit me. It was a nightly build though.

> that's really weird ... it suggests that the info registry is being emptied out ... you said this problem continued after re-installing, i assume you stopped/started the port as well?

yep - really careful to check this - I made sure there was a 404 between stop and start

cheers
mark
Re: Solr use case
No, after you add new documents you simply issue a command and the new docs are searchable. On Discogs.com we have just over 1 million docs in the index and do about 20,000 updates per day. Every 15 minutes we read a queue and add new documents, then commit. And we optimize once per day. I've had no problems with that.

Kevin

On 10/11/06, climbingrose <[EMAIL PROTECTED]> wrote:
> Hi all, Is it true that Solr is mainly used for applications that rarely change the underlying data? [...]
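The "read a queue, add, then commit" cycle described above could be sketched roughly as follows. This only builds the XML update messages (an add message followed by a commit message) and leaves out the HTTP POST; the queue structure, the function names and the field names are all assumptions for illustration, not Discogs' actual code.

```python
import xml.etree.ElementTree as ET

COMMIT_MESSAGE = "<commit/>"


def build_add_message(docs):
    """Build one <add> message from a list of {field name: value} dicts."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


def drain_queue(queue):
    """Return the messages to POST, in order, for one batch run."""
    if not queue:
        return []  # nothing new, so no commit is needed
    messages = [build_add_message(queue), COMMIT_MESSAGE]
    queue.clear()
    return messages


# Hypothetical queue contents; the field names are placeholders.
pending = [{"id": 1, "title": "A & B"}, {"id": 2, "title": "C"}]
batch = drain_queue(pending)
for msg in batch:
    print(msg)
```

Batching the adds and committing once per run means new documents become searchable at the batch interval rather than per document, which matches the 15-minute cycle mentioned above.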
Re: Couple of problems
I've had a problem similar to this and it was because of the schema.xml. It was valid XML but there were some incorrect field definitions and/or the default field listed was not a defined field. I'd suggest you start with the default schema and build on it piece by piece, each time testing for the error with a "ping" operation in the admin page.

Kevin

On 10/11/06, mark <[EMAIL PROTECTED]> wrote:
> Hi, I have installed solr under a stand alone tomcat5.5 installation. I can see the admin screens etc. When I submit documents I get this error [...]
QTime field in response XML
I've searched the docs but could not find an answer. Is this field microseconds or milliseconds? thanks, Kevin
Re: QTime field in response XML
Milliseconds. I'd be fairly skeptical about anybody doing reliable millisecond timings on a jvm!

phil.

On Oct 11, 2006, at 3:05 PM, Kevin Lewandowski wrote:
> I've searched the docs but could not find an answer. Is this field microseconds or milliseconds?
> thanks, Kevin

--
Whirlycott
Philip Jacob
[EMAIL PROTECTED]
http://www.whirlycott.com/phil/
Re: QTime field in response XML
On Oct 11, 2006, at 3:10 PM, WHIRLYCOTT wrote:
> Milliseconds. I'd be fairly skeptical about anybody doing reliable millisecond timings on a jvm!

Sorry, correcting myself. In "reliable millisecond timings", that should have been 'micro'. Timings are in milliseconds.

phil.

--
Whirlycott
Philip Jacob
[EMAIL PROTECTED]
http://www.whirlycott.com/phil/
Re: Solr use case
On Oct 11, 2006, at 10:24 AM, climbingrose wrote:
> Is it true that Solr is mainly used for applications that rarely change the underlying data?

No, not at all. Solr is very dynamic, and in fact shines even more than plain Lucene when the data changes frequently.

> As I understand, if you submit new data or modify existing data on Solr server, you would have to "refresh" the cache somehow to display the updated data.

Solr manages this refresh automatically, and depending on how you have the caches configured the switchover to see new documents can be almost instantaneous.

> If my application frequently gets new data/updates from users, should I use Solr?

Well, that is a difficult question to answer without knowing more about your architecture, but Solr certainly would not be a hindrance and in fact may just be what makes your search system shine!

> I love faceted browsing and dynamic properties so much but I need to justify the choice of Solr. Thanks. By the way, does anyone have any performance measure that can be shared (apart from the one on the Wiki)? As I estimated, my application probably has half a million docs, each of which has around 15 properties, does anyone know the type of hardware I would need for reasonable performance.

I've gotten quite good response with a dataset of 500k documents on a MacBook Pro with 1GB RAM. I've not done any measuring, other than to experience that the front-end (RoR) was more than responsive enough.

Erik
Sorting
I need to sort a query two ways. Should I do the search one way:

    s.getDocListAndSet(query, restrictions, sort, req.getStart(), req.getLimit(), flags);

then do the same search again with a different sort value, or is there a method available to just sort the DocSet (like sortDocSet, but it's protected)?

OR maybe it doesn't matter because caching will handle it anyway?

Thanks
Re: Invalid XML in response
: I don't understand why SOLR returns and invalid XML file as a response
: in case when we insert a document with a field that is not defined
: in the Solr configuration. Is there any purpose for that?
:
: It would be nice if it returns a valid xml

i think if you were adding only one doc, and it had a field problem, then the response would be valid XML ... but it looks like you are adding multiple docs, in which case even a success isn't valid XML at the moment...

http://issues.apache.org/jira/browse/SOLR-2

...can you verify that this is really the same bug (two separate blocks because of adding two separate docs in a single request) ... or is there something i'm missing in your example error besides that?

: ERROR:unknown field 'aaa'org.xmlpull.v1.XmlPullParserException: expected START_TAG or
: END_TAG not END_DOCUMENT (position: END_DOCUMENT seen
: ...\n\n... @9:1)
: at org.xmlpull.mxp1.MXParser.nextTag(MXParser.java:1083)
: at org.apache.solr.core.SolrCore.update(SolrCore.java:681)
: [...]

-Hoss
Re: Couple of problems
: But this is a right out of the box install - I have only messed with
: the schema to suit me.

when i use your schema with the current trunk using Jetty, right at startup my logs contain a "SolrException: Schema Parsing Failed" which is wrapping...

    Caused by: java.lang.RuntimeException: 'id' is not an indexed field:id{type=string,properties=stored}
        at org.apache.solr.schema.IndexSchema.getIndexedField(IndexSchema.java:192)
        at org.apache.solr.schema.IndexSchema.readConfig(IndexSchema.java:387)
        ... 21 more

...which is because if you want to use a uniqueKey field it must be indexed so deletes can be done.

This didn't show up at all in your Tomcat logs on startup? or the first time you tried to do a search or an update? (it's in the SolrServlet.init method)

-Hoss
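A rough pre-flight check for the condition described above, written as a sketch rather than as what Solr itself does. It assumes a schema.xml with field elements and a top-level uniqueKey element, and it only flags a key field that is explicitly marked indexed="false". It does not resolve defaults inherited from the field type, so it can miss cases. The sample schema is hypothetical, not the poster's file.

```python
import xml.etree.ElementTree as ET


def unique_key_problem(schema_xml):
    """Return a description of a uniqueKey problem, or None if none is found."""
    root = ET.fromstring(schema_xml)
    key = root.findtext("uniqueKey")
    if key is None:
        return None  # no uniqueKey declared, nothing to check
    key = key.strip()
    for field in root.iter("field"):
        if field.get("name") == key:
            # Only an explicit indexed="false" is flagged; defaults that come
            # from the field type are not resolved in this sketch.
            if field.get("indexed") == "false":
                return "'%s' is not an indexed field" % key
            return None
    return "uniqueKey '%s' is not a defined field" % key


# Hypothetical schema for illustration only.
SAMPLE = """<schema name="archive">
  <fields>
    <field name="id" type="string" indexed="false" stored="true"/>
    <field name="content" type="text" indexed="true" stored="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
</schema>"""

print(unique_key_problem(SAMPLE))
```

Running a check like this before deploying a modified schema would surface the problem even when the servlet container's logs do not show the startup exception.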
Re: Sorting
: I need to sort a query two ways. Should I do the search one way:
: s.getDocListAndSet(query, restrictions, sort, req.getStart(),
: req.getLimit(), flags);
: then do the same search again with a different sort value or is there a
: method available to just sort the DocSet (like sortDocSet but it's
: protected)
:
: OR maybe it doesn't matter because caching will handle it anyway?

check this out from the example solrconfig.xml...

    true

...in those conditions, you should be able to just call getDocList (or getDocListAndSet) with your various Sort options and the cache will take care of everything.

if you *do* want scores to be included in one of the Sorts, then i would try doing that search first using getDocListAndSet -- you can ignore the DocSet, but the next call to getDocList should leverage the filterCache, and the initial getDocListAndSet call should be faster than two separate getDocList calls with different sorts...

...i think.

-Hoss
Re: Sorting
Let me back up for a second. I want to create price ranges. I was thinking that I would do a search sorted on price and create ranges by reading the document price at every (docCount / #ofpricerangesIwant) position. Basically create: < 10, 10 - 60, 60 - 100, etc.

If the initial search wasn't sorted by price then I would have to do the second search just to figure out the price ranges. This was the only way I could think to do it. Maybe I'm going at this the wrong way?

Thanks

On 10/11/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:
> check this out from the example solrconfig.xml... true ...in those conditions, you should be able to just call getDocList (or getDocListAndSet) with your various Sort options and the cache will take care of everything. [...]
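The sampling idea above (take the price at every docCount / N position of a price-sorted result) can be sketched outside Solr like this. The function names and the sample prices are made up for illustration, and the sketch assumes the prices have already been fetched in ascending order.

```python
def price_range_bounds(sorted_prices, num_ranges):
    """Pick range boundaries by sampling every (count / num_ranges)-th price."""
    count = len(sorted_prices)
    if count == 0 or num_ranges < 2:
        return []
    step = count / num_ranges
    bounds = []
    for i in range(1, num_ranges):
        price = sorted_prices[int(i * step)]
        if not bounds or price > bounds[-1]:  # skip duplicate boundaries
            bounds.append(price)
    return bounds


def range_labels(bounds):
    """Turn boundaries into labels such as '< 12', '12 - 60', '>= 100'."""
    if not bounds:
        return []
    labels = ["< %g" % bounds[0]]
    labels += ["%g - %g" % (lo, hi) for lo, hi in zip(bounds, bounds[1:])]
    labels.append(">= %g" % bounds[-1])
    return labels


# Made-up prices, already sorted ascending.
prices = [5, 8, 9, 12, 30, 45, 60, 75, 90, 100, 150, 400]
bounds = price_range_bounds(prices, 4)
print(range_labels(bounds))
```

Sampling at equal positions gives ranges holding roughly equal numbers of documents rather than equal price widths, which appears to be the intent of the approach described.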
Re: Couple of problems
> This didn't show up at all in your Tomcat logs on startup? or the first time you tried to do a search or an update? (it's in the SolrServlet.init method)

Nope - not at all. Hmm - thanks for finding the problem though - will try it in a bit

> -Hoss