- Is the solr php in the wiki working out of the box for anyone?
Could you post your php.ini? Have you tuned your PHP settings for
performance?




2006/9/10, Brian Lucas <[EMAIL PROTECTED]>:

Hi Michael,

I apologize for the lack of testing on the SolPHP.  I had to "strip" it
down significantly to turn it into a general class that would be usable,
and the version up there has not been extensively tested yet (I'm almost
ready to get back to that and "revise" it); plus, much of my coding is
done in Rails at the moment.  However...

If you have a new version, could you send it over my way or just upload
it to the wiki?  I'd like to take a look at the changes and either throw
your revised version up there or integrate both versions into a cleaner
revision of the version already there.

With respect to batch queries, it's already designed to do that (that's
why you see "array($array)" in the example, because it accepts an array
of updates), but I'd definitely like to see how you revised it.

Thanks,
Brian


-----Original Message-----
From: Michael Imbeault [mailto:[EMAIL PROTECTED]
Sent: Saturday, September 09, 2006 12:30 PM
To: solr-user@lucene.apache.org
Subject: Got it working! And some questions

First of all, in reference to
http://www.mail-archive.com/solr-user@lucene.apache.org/msg00808.html ,
I got it working! The problems were coming from solPHP; the
implementation in the wiki isn't really working, to be honest, at least
for me. I had to modify it significantly in multiple places to get it
working. My setup is Tomcat 5.5, WAMP and Windows XP.

The main problem was that addIndex was sending 1 doc at a time to Solr;
this caused a problem after a few thousand docs because I was running
out of resources. I modified solr_update.php to handle batch queries,
and I'm now sending batches of 1000 docs at a time. Great indexing speed.
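
A minimal sketch of the batching idea described above, written in Python
rather than PHP and not taken from solr_update.php. The helper names
(chunked, build_add_message) and the field names are hypothetical; the
only thing assumed about Solr is that one <add> message may contain many
<doc> elements.

```python
from xml.sax.saxutils import escape, quoteattr


def chunked(docs, size=1000):
    """Yield successive lists of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def build_add_message(docs):
    """Build a single <add> message containing every document in `docs`.

    Each document is a dict mapping a field name to its value.
    Names and values are XML-escaped so the message stays well-formed.
    """
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for name, value in doc.items():
            parts.append(
                "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
            )
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)


# Hypothetical usage: one message (and so one HTTP request) per 1000 docs.
# for batch in chunked(all_docs, 1000):
#     message = build_add_message(batch)
```

Sending one message per batch instead of one per document means far
fewer HTTP round trips, which fits the resource problem described above.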

I had a slight problem with the curl function of solr_update.php: the
custom HTTP header wasn't recognized. I now use curl_setopt($ch,
CURLOPT_POST, 1); curl_setopt($ch, CURLOPT_POSTFIELDS, $post_string); -
much simpler, and now everything works!
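
For comparison, a rough Python sketch of the same idea: POST the XML
message as the request body. The URL below is an assumption (a local
Tomcat on port 8080), and setting a text/xml content type is a choice
made for this sketch, not something the PHP change above does. The
request is only built here, not sent.

```python
import urllib.request

# Assumed location of the update handler; adjust to your own install.
SOLR_UPDATE_URL = "http://localhost:8080/solr/update"


def build_update_request(xml_message, url=SOLR_UPDATE_URL):
    """Return a POST request whose body is the given XML update message."""
    return urllib.request.Request(
        url,
        data=xml_message.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


request = build_update_request("<commit/>")
# To actually send it (requires a running Solr instance):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```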

So far I have indexed 15,000,000 documents (my whole collection,
basically) and the performance I'm getting is INCREDIBLE (sub-100 ms
query time without warmup and with no optimization at all on a 7 GB
index - and with the cache, it gets stupid fast)! Seriously, Solr amazes
me every time I use it. I increased the HashDocSet maxSize to 75000 and
will continue to tune this value - it helped a great deal. I will try
disMaxHandler soon too; right now the standard one is great. And I will
index with a better stopword file; the default one could really use
improvements.

Some questions (couldn't find the answer in the docs):

- Is the solr php in the wiki working out of the box for anyone? Else we
could modify the wiki...

- What is the loadFactor variable of HashDocSet? Should I optimize it too?

- What are the units of the size value of the caches? Megs, number of
queries, kilobytes? It's not described anywhere.

- Is there any way to programmatically change the OR/AND preference of
the query parser? I set it to AND by default for user queries, but I'd
like to set it to OR for some server-side queries I must do (find
related articles, order by score).

- What's the difference between the two commit types, blocking and
non-blocking? I tried both and didn't see any difference at all.

- Every time I do an <optimize> command, I get the following in my
catalina logs - should I do anything about it?

9-Sep-2006 2:24:40 PM org.apache.solr.core.SolrException log
SEVERE: Exception during commit/optimize:java.io.EOFException: no more
data available - expected end tag </optimize> to close start tag
<optimize> from line 1, parser stopped on START_TAG seen <optimize>...
@1:10

- Are there any benefits to setting the allowed memory for Tomcat
higher? Right now I'm allocating 384 megs.
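
On the <optimize> exception quoted above: the log text says the parser
expected an end tag </optimize>, which suggests the body sent was the
unclosed string "<optimize>". A small Python check of that reading,
using only the standard library (the helper name is hypothetical):

```python
import xml.etree.ElementTree as ET


def is_well_formed(message):
    """Return True if `message` parses as a complete XML document."""
    try:
        ET.fromstring(message)
    except ET.ParseError:
        return False
    return True


# The unclosed tag is rejected; either closed form parses.
print(is_well_formed("<optimize>"))
print(is_well_formed("<optimize/>"))
print(is_well_formed("<optimize></optimize>"))
```

If that reading is right, sending <optimize/> instead should avoid the
EOFException.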

Can't wait to try the new Faceted Queries... seriously, Solr has been
really, really awesome so far. Thanks for all your work, and sorry for
all the questions!

--
Michael Imbeault
CHUL Research Center (CHUQ)
2705 boul. Laurier
Ste-Foy, QC, Canada, G1V 4G2
Tel: (418) 654-2705, Fax: (418) 654-2212

