Couple of problems

2006-10-11 Thread mark

Hi,

I have installed Solr under a standalone Tomcat 5.5 installation. I
can see the admin screens etc.


When I submit documents I get this error

Oct 11, 2006 10:05:44 AM org.apache.solr.core.SolrException log
SEVERE: java.lang.NullPointerException
    at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:78)
    at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:74)
    at org.apache.solr.core.SolrCore.readDoc(SolrCore.java:917)
    at org.apache.solr.core.SolrCore.update(SolrCore.java:685)
    at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:52)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:709)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
.


My docs follow this schema:

[field definitions lost in the archive; only the fragment stored="true"/> survives]


Also, since getting this error I can no longer see part of the
solr/admin/stats.jsp screen: the boxes core, update, cache and other are
now empty. I deleted and reinstalled Solr (including the unpacked
webapps dir) but not Tomcat, and the problem is still there.


cheers

mark


Re: Couple of problems

2006-10-11 Thread Panayiotis Papadopoulos
Check the tomcat logs... most probably there is a conflict with the 
field definitions in your schema.xml


Re: Couple of problems

2006-10-11 Thread mark

Hi,

there are no errors while reading the schema:

Oct 11, 2006 9:56:43 AM org.apache.solr.schema.IndexSchema readConfig
INFO: Reading Solr Schema
Oct 11, 2006 9:56:43 AM org.apache.solr.schema.IndexSchema readConfig
INFO: Schema name=archive
Oct 11, 2006 9:56:43 AM org.apache.solr.schema.IndexSchema readConfig
INFO: default search field is content
Oct 11, 2006 9:56:43 AM org.apache.solr.schema.IndexSchema readConfig
INFO: query parser default operator is OR
Oct 11, 2006 9:56:43 AM org.apache.solr.servlet.SolrUpdateServlet init
INFO: SolrUpdateServlet.init() done

and then the first error is the one I reported when I submit a document

I am looking in Catalina.out - are there any other logs I should look  
at?


cheers

mark






Re: Couple of problems

2006-10-11 Thread Panayiotis Papadopoulos
How do you post the documents to Solr? Via PHP, JSP or something like that?
If you use curl from PHP, JSP or ASP, you can see the error that
Solr returns.

In PHP, using curl, I found the error with this:

$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 4);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, $header);   
$data = curl_exec($ch);
  

and then I printed $data. My schema was parsed successfully, but the XML
I was posting used field names slightly different from those in the schema,
and there were also some logical errors in the schema.

So try to find the Solr runtime errors using an approach like the one above.
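
For anyone posting from Python rather than PHP, a rough equivalent might look like the sketch below. It is only an illustration: the update URL is an assumption (adjust host, port and path to your install), and `build_update_request` and `post_update` are hypothetical helper names. The point is the same as in the PHP snippet: read the response body even when the server answers with an error status, since that is where the error text ends up.

```python
import urllib.error
import urllib.request

# Assumed location of the update servlet; adjust to your own install.
SOLR_UPDATE_URL = "http://localhost:8080/solr/update"


def build_update_request(xml_text, url=SOLR_UPDATE_URL):
    """Build a POST request carrying an XML update message."""
    return urllib.request.Request(
        url,
        data=xml_text.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


def post_update(xml_text, url=SOLR_UPDATE_URL, timeout=4):
    """Send the message and return (HTTP status, response body).

    An HTTP error status is not raised to the caller; the body is
    returned instead so the server-side error text can be inspected.
    """
    request = build_update_request(xml_text, url)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode("utf-8", "replace")
```

Calling `status, body = post_update(doc_xml)` and printing both values should show whatever the server reported for that document.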


Re: Couple of problems

2006-10-11 Thread mark

that just returns the null pointer exception.

I have checked my schema and doc:

Schema:

[field definitions lost in the archive; only the fragment stored="true"/> survives]

Template:

doc = """

%s
%s
%s
%s
%s
%s

"""






Invalid XML in response

2006-10-11 Thread Przemysław Brzozowski

Hi
I don't understand why Solr returns an invalid XML file as a response
when we insert a document with a field that is not defined
in the Solr configuration. Is there any purpose for that?

It would be nice if it returned valid XML.

regards
Przemek Brzozowski




ERROR:unknown field 'aaa'status="1">org.xmlpull.v1.XmlPullParserException: expected START_TAG or 
END_TAG not END_DOCUMENT (position: END_DOCUMENT seen 
...\n\n... @9:1)

    at org.xmlpull.mxp1.MXParser.nextTag(MXParser.java:1083)
    at org.apache.solr.core.SolrCore.update(SolrCore.java:681)
    at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:52)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:709)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:802)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:252)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:173)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:178)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:664)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
    at java.lang.Thread.run(Thread.java:595)






Re: Couple of problems

2006-10-11 Thread Erik Hatcher

Are you ensuring that the %s replacements are properly encoded for XML?
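
As an illustration of what "properly encoded" means for a %s template like the one quoted in this thread, here is a minimal Python sketch. The field names are placeholders (the real ones were not preserved in the archive), and `render_doc` is a hypothetical helper; the only point is that every value passes through an XML escape before it is substituted.

```python
from xml.sax.saxutils import escape

# Hypothetical template: the field names are placeholders, not the ones
# from the schema discussed in this thread.
DOC_TEMPLATE = """<add>
<doc>
<field name="id">%s</field>
<field name="title">%s</field>
</doc>
</add>"""


def render_doc(doc_id, title):
    """Fill the template, escaping &, < and > in every value first."""
    return DOC_TEMPLATE % (escape(str(doc_id)), escape(str(title)))
```

Without the escape step, a value such as `Tom & Jerry` would make the posted message ill-formed XML.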

Erik






Re: Couple of problems

2006-10-11 Thread mark
I believe so - an earlier attempt did fail in that department but the  
result was an XML parsing error (as you might expect).



On 11 Oct 2006, at 14:19, Erik Hatcher wrote:

Are you ensuring that the %s replacements are properly encoded for  
XML?


Erik






Solr use case

2006-10-11 Thread climbingrose

Hi all,

Is it true that Solr is mainly used for applications that rarely change the
underlying data? As I understand it, if you submit new data or modify existing
data on the Solr server, you have to "refresh" the cache somehow to
display the updated data. If my application frequently gets new data/updates
from users, should I use Solr? I love faceted browsing and dynamic
properties so much, but I need to justify the choice of Solr. By the
way, does anyone have any performance measurements that can be shared (apart
from the one on the Wiki)? By my estimate, my application will have about half a
million docs, each with around 15 properties. Does anyone know what
type of hardware I would need for reasonable performance?

Thanks.

--
Regards,

Cuong Hoang


Re: Couple of problems

2006-10-11 Thread Chris Hostetter

Wow ... this is crazy looking ... as far as I can tell the only way to get
an NPE at that line is if the DocumentBuilder is being given a null
IndexSchema when it's constructed.  I don't know how that would happen.

Can you zip up your solr/conf (so we have the schema and the config) and
post it online somewhere?

: admin/stats.jsp screen - the boxes core, update , cache and other are
: now empty. I deleted and reinstalled solr  (including the unpacked
: webapps dir) but not tomcat and the problem is still there

That's really weird ... it suggests that the info registry is being
emptied out ... you said this problem continued after re-installing; I
assume you stopped/started the port as well?



-Hoss



Re: Couple of problems

2006-10-11 Thread mark


can you zip up your solr/conf (so we have the schema and the config) and
post it online somewhere?


http://www.pagefall.com/cnf.zip

But this is a right out of the box install - I have only messed with  
the schema to suit me.


It was a nightly build though





that's really weird ... it suggests that the info registry is being
emptied out ... you said this problem continued after re-installing; I
assume you stopped/started the port as well?


yep - really careful to check this - I made sure there was a 404  
between stop and start


cheers

mark




Re: Solr use case

2006-10-11 Thread Kevin Lewandowski

No, after you add new documents you simply issue a <commit/> command
and the new docs are searchable.

On Discogs.com we have just over 1 million docs in the index and do
about 20,000 updates per day. Every 15 minutes we read a queue and add
new documents, then commit. And we optimize once per day. I've had no
problems with that.
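
A minimal Python sketch of that kind of queue-then-commit cycle is shown below, purely as an illustration and not the code used on Discogs. The update URL is an assumption, and `send` and `flush_queue` are hypothetical helper names; `<commit/>` and `<optimize/>` are the XML update commands posted to the update URL.

```python
import urllib.request

# Assumed location of the update servlet; adjust to your own install.
SOLR_UPDATE_URL = "http://localhost:8080/solr/update"

COMMIT = "<commit/>"      # makes newly added docs searchable
OPTIMIZE = "<optimize/>"  # e.g. posted once per day


def send(message, url=SOLR_UPDATE_URL):
    """POST one XML update message and return the raw response body."""
    request = urllib.request.Request(
        url,
        data=message.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8", "replace")


def flush_queue(queue, sender=send):
    """Send every queued <add> message, then issue a single commit.

    Returns the number of messages sent; nothing is committed if the
    queue was empty.
    """
    sent = 0
    while queue:
        sender(queue.pop(0))
        sent += 1
    if sent:
        sender(COMMIT)
    return sent
```

A scheduler (cron, for example) could call `flush_queue` every 15 minutes and `send(OPTIMIZE)` once a day; committing once per batch rather than once per document keeps the number of searcher refreshes low.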

Kevin





Re: Couple of problems

2006-10-11 Thread Kevin Lewandowski

I've had a problem similar to this and it was because of the
schema.xml. It was valid XML but there were some incorrect field
definitions and/or the default field listed was not a defined field.

I'd suggest you start with the default schema and build on it piece by
piece, each time testing for the error with a "ping" operation in the
admin page.

Kevin




QTime field in response XML

2006-10-11 Thread Kevin Lewandowski

I've searched the docs but could not find an answer. Is this field
microseconds or milliseconds?

thanks,
Kevin


Re: QTime field in response XML

2006-10-11 Thread WHIRLYCOTT
Milliseconds.  I'd be fairly skeptical about anybody doing reliable  
millisecond timings on a jvm!


phil.

On Oct 11, 2006, at 3:05 PM, Kevin Lewandowski wrote:


I've searched the docs but could not find an answer. Is this field
microseconds or milliseconds?

thanks,
Kevin



--
   Whirlycott
   Philip Jacob
   [EMAIL PROTECTED]
   http://www.whirlycott.com/phil/





Re: QTime field in response XML

2006-10-11 Thread WHIRLYCOTT

On Oct 11, 2006, at 3:10 PM, WHIRLYCOTT wrote:
Milliseconds.  I'd be fairly skeptical about anybody doing reliable  
millisecond timings on a jvm!
  ^ 


Sorry, correcting myself.  That should have been 'micro'.

Timings are in milliseconds.

phil.


--
   Whirlycott
   Philip Jacob
   [EMAIL PROTECTED]
   http://www.whirlycott.com/phil/




Re: Solr use case

2006-10-11 Thread Erik Hatcher


On Oct 11, 2006, at 10:24 AM, climbingrose wrote:
Is it true that Solr is mainly used for applications that rarely change the
underlying data?


No, not at all.   Solr is very dynamic, and in fact shines even more  
than plain Lucene when the data changes frequently.



As I understand, if you submit new data or modify existing
data on Solr server, you would have to "refresh" the cache somehow to
display the updated data.


Solr manages this refresh automatically, and depending on how you  
have the caches configured the switchover to see new documents can be  
almost instantaneous.



If my application frequently gets new data/updates
from users, should I use Solr?


Well, that is a difficult question to answer without knowing more  
about your architecture, but Solr certainly would not be a hindrance  
and in fact may just be what makes your search system shine!



I love faceted browsing and dynamic
properties so much but I need to justify the choice of Solr. Thanks. By the
way, does anyone have any performance measure that can be shared (apart from
the one on the Wiki)? As I estimated, my application probably has half a
million docs, each of which has around 15 properties, does anyone know the
type of hardware I would need for reasonable performance.


I've gotten quite good response with a dataset of 500k documents on a  
MacBook Pro, with 1GB RAM.  I've not done any measuring, other than  
to experience that the front-end (RoR) was more than responsive enough.


Erik



Sorting

2006-10-11 Thread Gmail Account
I need to sort a query two ways. Should I do the search one way: 
s.getDocListAndSet(query, restrictions, sort, req.getStart(), 
req.getLimit(), flags);
then do the same search again with a different sort value or is there a 
method available to just sort the DocSet (like sortDocSet but it's 
protected)


OR maybe it doesn't  matter because caching will handle it anyway?

Thanks 



Re: Invalid XML in response

2006-10-11 Thread Chris Hostetter

: I don't understand why Solr returns an invalid XML file as a response
: when we insert a document with a field that is not defined
: in the Solr configuration. Is there any purpose for that?
:
: It would be nice if it returned valid XML.

i think if you were adding only one doc, and it had a field problem, then
the response would be valid XML ... but it looks like you are adding
multiple docs, in which case even a success isn't valid XML at the
moment...

http://issues.apache.org/jira/browse/SOLR-2

...can you verify that this is really the same bug (two separate <result>
blocks because of adding two separate docs in a single request) ... or is
there something I'm missing in your example error besides that?




-Hoss



Re: Couple of problems

2006-10-11 Thread Chris Hostetter

: But this is a right out of the box install - I have only messed with
: the schema to suit me.

when i use your schema with the current trunk using Jetty, right at
startup my logs contain a "SolrException: Schema Parsing Failed" which is
wrapping...

Caused by: java.lang.RuntimeException: 'id' is not an indexed field:id{type=string,properties=stored}
    at org.apache.solr.schema.IndexSchema.getIndexedField(IndexSchema.java:192)
    at org.apache.solr.schema.IndexSchema.readConfig(IndexSchema.java:387)
    ... 21 more

...which is because if you want to use a uniqueKey field it must be
indexed so deletes can be done.
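
Since the startup log may not surface this, one way to catch it before deploying is to check schema.xml directly. The Python sketch below is an illustration only: `unique_key_problem` is a hypothetical helper, it assumes the usual layout with `<field>` elements and a `<uniqueKey>` element under `<schema>`, and it only detects an explicit indexed="false" on the field itself (a setting inherited from the field type is not checked).

```python
import xml.etree.ElementTree as ET


def unique_key_problem(schema_xml):
    """Return a message if the uniqueKey field looks unusable, else None."""
    root = ET.fromstring(schema_xml)
    key = root.findtext("uniqueKey")
    if key is None:
        return None  # no uniqueKey declared, nothing to check
    key = key.strip()
    for field in root.iter("field"):
        if field.get("name") == key:
            # Only an explicit indexed="false" is flagged here.
            if field.get("indexed", "").lower() == "false":
                return "uniqueKey field %r has indexed=\"false\"" % key
            return None
    return "uniqueKey field %r is not defined" % key
```

Running it over the text of schema.xml before restarting the servlet container gives a quick yes/no on this particular mistake.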

This didn't show up at all in your Tomcat logs on startup?  or the first
time you tried to do a search or an update?  (it's in the SolrServlet.init
method)




-Hoss



Re: Sorting

2006-10-11 Thread Chris Hostetter

: I need to sort a query two ways. Should I do the search one way:
: s.getDocListAndSet(query, restrictions, sort, req.getStart(),
: req.getLimit(), flags);
: then do the same search again with a different sort value or is there a
: method available to just sort the DocSet (like sortDocSet but it's
: protected)
:
: OR maybe it doesn't  matter because caching will handle it anyway?

check this out from the example solrconfig.xml...

   
   <useFilterForSortedQuery>true</useFilterForSortedQuery>

...in those conditions, you should be able to just call getDocList (or
getDocListAndSet) with your various Sort options and the cache will take
care of everything.

If you *do* want scores to be included in one of the Sorts, then I would
try doing that search first using getDocListAndSet -- you can ignore the
DocSet, but the next call to getDocList should leverage the filterCache,
and the initial getDocListAndSet call should be faster than two separate
getDocList calls with different sorts...

...i think.




-Hoss



Re: Sorting

2006-10-11 Thread Mike Austin

Let me back up for a second. I want to create price ranges. I was thinking
that I would do a search sorted by price and create ranges by reading
the document price at every (docCount / number of price ranges I want)
position. Basically create: < 10, 10 - 60, 60 - 100, etc. If the initial
search wasn't sorted by price, then I would have to do a second search just
to figure out the price ranges.

This was the only way I could think to do it. Maybe I'm going at this the
wrong way?
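
To make the idea concrete, here is a minimal Python sketch of the bucketing step alone, outside of Solr. It assumes the prices of the matching documents are already available in ascending order; `price_ranges` is a hypothetical helper name.

```python
def price_ranges(sorted_prices, bucket_count):
    """Split ascending prices into ranges holding roughly equal doc counts.

    The boundary between ranges is the price found at every
    (docCount / bucket_count)-th position. Returns (low, high) pairs;
    empty ranges caused by repeated prices are dropped.
    """
    if bucket_count < 1:
        raise ValueError("bucket_count must be at least 1")
    if not sorted_prices:
        return []
    count = len(sorted_prices)
    # Integer arithmetic keeps every index strictly below count.
    cuts = [sorted_prices[i * count // bucket_count]
            for i in range(1, bucket_count)]
    bounds = [sorted_prices[0]] + cuts + [sorted_prices[-1]]
    return [(low, high) for low, high in zip(bounds, bounds[1:]) if high > low]
```

For nine prices and three buckets, the boundaries are the prices at positions 3 and 6 of the sorted list, so each range covers about a third of the documents.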

Thanks





Re: Couple of problems

2006-10-11 Thread mark


This didn't show up at all in your Tomcat logs on startup?  or the first
time you tried to do a search or an update?  (it's in the SolrServlet.init
method)




Nope - not at all. Hmm - thanks for finding problem though - will try  
it in a bit






-Hoss