Re: Embarrassing compilation errors with solr-nightly/example

2006-06-29 Thread Yonik Seeley

On 6/28/06, Chris Hostetter <[EMAIL PROTECTED]> wrote:

Interesting ... it never occurred to me that the demo would require a
JDK instead of a JRE so that Jetty can compile the JSPs ... but it makes
sense.


It didn't require a JDK previously.
The downgrade in Jetty caused a JDK to be required because that
version doesn't use the JDT for JSP/Java compilation.
Neither Tomcat 5.5 nor Jetty6 requires a JDK.

This is an issue even if you have a JDK installed on your system
because often "java" from the JRE will be in your path first.

-Yonik


Re: When to commit?

2006-06-29 Thread Bill Au

Also keep in mind that changes and additions to the index are not visible to
clients until a commit has occurred.

Bill

On 6/28/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:



On Jun 28, 2006, at 3:43 PM, UpAndGone wrote:
> Another question: Since a commit is expensive, it shouldn't be sent
> after
> each update, should it? Would it make sense to commit in a certain
> time
> frame, let's say every 5 minutes? For a high traffic site, would five
> minutes be okay or is it still too expensive?

I currently do batch, pretty much one-shot, indexing (loading about 40k
documents) and I had to tune it to send a commit frequently, otherwise
Solr would run out of memory.  I set my indexer to send a commit every
100 documents.  The auto-warming is slightly expensive, but not
prohibitively so in my scenario, and it is rock solid.
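
A rough sketch of that "commit every N documents" pattern, in Python.
It only builds the XML update messages (an <add> per document, then a
<commit/> after each batch); the function name and the commit_every
parameter are made up for illustration, and this is not the indexer
described above.

```python
from xml.sax.saxutils import escape, quoteattr


def update_messages(docs, commit_every=100):
    """Yield Solr XML update messages for an iterable of dicts.

    A <commit/> is emitted after every `commit_every` documents, and
    once more at the end if any documents are still uncommitted.
    (Hypothetical helper, for illustration only.)
    """
    pending = 0
    for doc in docs:
        fields = "".join(
            "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
            for name, value in doc.items()
        )
        yield "<add><doc>%s</doc></add>" % fields
        pending += 1
        if pending == commit_every:
            yield "<commit/>"
            pending = 0
    if pending:
        # Final partial batch still needs a commit to become visible.
        yield "<commit/>"
```

Each yielded string would then be POSTed to the Solr update handler;
the URL for that depends on your setup and is left out here.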

But it's not really expensive from the outside client's perspective
because Solr just keeps serving up things (very rapidly!) while
auto-warming and then flips over to the updated index when warmed up.  So
if you're doing intermittent incremental indexing you probably are
fine just doing a commit for every change.  If you're doing batch
indexing, spreading commits thinner is more appropriate.

Just some off the cuff experiences, that's all.

Erik




Re: Faceted Browsing questions

2006-06-29 Thread Erik Hatcher


On Jun 29, 2006, at 12:30 AM, Vish D. wrote:
Any update on your progress? Eager to get my hands on your
latest code...

:=)


It's all in our Subversion repository:



Sorry I didn't announce it, but we do have a patacriticism-development
e-mail list that you can subscribe to for commit messages.


I've got a dual object-type facet cache going on now, where  
TermQuery objects are cached for most facets making them quite lightweight  
and currently fitting all of our faceted fields nicely in RAM.   
However, I'm layering one level of "relationship" between different  
document _types_ in Solr that is a cross-reference of tags->objects  
and usernames->objects, where the objects are the basic document type  
(type:A - for "archive") in Solr, and type:C documents are the  
folksonomical glue between a user and an object, supporting tagging  
and annotations currently.  These relationship "facets" are currently  
DocSet caches, but they fit into the same cache so the front-end can  
constrain the search space by agent (aka author/artist/etc), genre,  
archive, year, users and tags as if they were all the same sort of  
thing.
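
To illustrate the general idea of treating every facet as "the same
sort of thing", here is a toy sketch in Python.  It is not the Collex
code: plain Python sets stand in for Solr DocSets, and the cache
layout, field names and function name are all assumptions.  Each
cached entry maps a (field, value) pair to the set of matching
document ids, and a facet count is just the size of the intersection
with the current result set.

```python
def facet_counts(base_docs, facet_cache):
    """Count, per facet field and value, how many docs in `base_docs` match.

    `facet_cache` maps (field, value) -> set of document ids.  Python sets
    are used here as a stand-in for cached doc sets (hypothetical layout).
    """
    counts = {}
    for (field, value), docset in facet_cache.items():
        n = len(base_docs & docset)  # intersection with current results
        if n:
            counts.setdefault(field, {})[value] = n
    return counts


# Made-up data: genres, tags and usernames share one cache.
cache = {
    ("genre", "poetry"): {1, 2, 3},
    ("genre", "fiction"): {4, 5},
    ("tag", "romantic"): {2, 4},
    ("username", "erik"): {3, 5},
}
result = facet_counts({2, 3, 4}, cache)
```

Because tags and usernames sit in the same cache as genre or year, the
front-end can constrain by any of them with the same intersection step.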


We're currently having some sysadmin folks get us set up with a  
production environment to run this thing.  All the pieces are there  
in our repository to bootstrap the Collex system by launching two  
command-lines, one for RoR and one for Solr via Jetty.  There is also an Ant  
build file with an "index" target to index a directory full of (our custom  
flavor of) RDF into Solr.  No instructions are available yet on  
bootstrapping it all.  But feel free to tinker if you like.  :)


Erik