Re: Rsync

2006-03-29 Thread Bill Au
The segments file will change when new segments are created.  But what I
really meant before was that the deletable file also changes when
documents are deleted from the index.

Bill

On 3/28/06, Bill Au <[EMAIL PROTECTED]> wrote:
>
> I think the segments file will also change if documents are deleted from
> the index.
>
> Other ways to distribute the index will work as long as:
>
> 1) it makes a copy of the index that is in a consistent state
>
> 2) it keeps track of files that have changed (normally only a small
> number) and transfers them to the slave
>
> Lucene can certainly record a list of all new segment files added.  I
> think the tricky part is to ensure that a consistent copy of the index
> is being distributed.
>
> Bill
>
>
> On 3/27/06, jason rutherglen <[EMAIL PROTECTED]> wrote:
> >
> > I was thinking, would it not be possible to avoid using rsync and record
> > a list of all new segment files added (from within Lucene), and simply use
> > HTTP to sync down the newest ones?  Perhaps only using rsync after an
> > optimize?  Seems like if I understand Lucene correctly only new files are
> > created?
> >
> >
> >
>
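
A minimal sketch of the bookkeeping discussed in this message, written in Python for illustration only (nothing here is Lucene or Solr API). It assumes, as the thread suggests, that segment files are write-once while the files named segments and deletable are rewritten in place, so those two are always re-sent. The function names and the example file names are hypothetical, and the file lists must come from a consistent snapshot of the index, which this sketch does not address.

```python
# Hypothetical helpers: decide which index files a slave still needs.
# Assumption: segment files never change once written, while "segments"
# and "deletable" are rewritten in place (as described in this thread).

MUTABLE_FILES = ("segments", "deletable")


def files_to_transfer(master_files, slave_files, mutable=MUTABLE_FILES):
    """Return the sorted file names the slave should fetch from the master."""
    master = set(master_files)
    slave = set(slave_files)
    new_files = master - slave                # segment files added since last sync
    rewritten = master.intersection(mutable)  # may differ even if already present
    return sorted(new_files | rewritten)


def files_to_delete(master_files, slave_files):
    """Return the sorted file names the slave holds but the master dropped."""
    return sorted(set(slave_files) - set(master_files))


# Made-up file listings for illustration.
master = ["segments", "deletable", "_1.cfs", "_2.cfs", "_3.cfs"]
slave = ["segments", "deletable", "_1.cfs", "_2.cfs"]

print(files_to_transfer(master, slave))  # ['_3.cfs', 'deletable', 'segments']
print(files_to_delete(master, slave))    # []
```

The resulting list could be fetched over HTTP or any other transport; the hard part remains taking both listings from an index that is not in the middle of a commit.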


faceted browsing

2006-03-29 Thread Erik Hatcher

I saw Yonik mentioned faceted browsing as something coming in the
future of Solr, but I had thought it was one of the initial features
from seeing this announcement ages ago:

	


If facets are part of the current Solr codebase, how are they
configured and returned in the response?

If they aren't currently possible with Solr, what would it take to
implement it?

I'm still, obviously, just scratching the surface of Solr as I
evaluate it for replacing my custom XML-RPC based search server which
does rudimentary facets using Filters and BitSet operations.

By faceted browsing, a Query is used to search, Hits are returned,  
but also based on a subset of the fields (indexed, untokenized  
fields) the number of documents in each of these "facet" fields is  
returned as well to show counts by each facet.


Thanks,
Erik
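
The counting described above can be sketched with bitset intersections. This is an illustration in Python, not Solr or Lucene code: a Python int stands in for a BitSet, and the names to_bitset and facet_counts, as well as the sample data, are made up.

```python
def to_bitset(doc_ids):
    """Pack an iterable of document ids into an int used as a bitset."""
    bits = 0
    for doc_id in doc_ids:
        bits |= 1 << doc_id
    return bits


def facet_counts(query_bits, facet_filters):
    """Count, per facet value, the documents that also match the query."""
    return {
        value: bin(query_bits & filter_bits).count("1")  # popcount of the AND
        for value, filter_bits in facet_filters.items()
    }


# Made-up data: documents 0..5 and a "category" facet with three values.
category_filters = {
    "books": to_bitset([0, 1, 4]),
    "music": to_bitset([2, 3]),
    "video": to_bitset([5]),
}
hits = to_bitset([1, 2, 3, 4])  # documents matching the user's query

print(facet_counts(hits, category_filters))
# {'books': 2, 'music': 2, 'video': 0}
```

Each facet value needs one precomputed filter, so the cost per request is one intersection and one bit count per value.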


Re: Rsync

2006-03-29 Thread jason rutherglen
Perhaps a future project could speed up the syncing to sub-minute
times.  It sounds like two files will change, in addition to segment files
being added.  Is this correct?  Or maybe other pieces, such as cache
reloading, would make this more difficult.

----- Original Message -----
From: Bill Au <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org; jason rutherglen <[EMAIL PROTECTED]>
Sent: Wednesday, March 29, 2006 6:07:35 AM
Subject: Re: Rsync

The segments file will change when new segments are created.  But what I
really meant before was that the deletable file also changes when
documents are deleted from the index.

Bill

On 3/28/06, Bill Au <[EMAIL PROTECTED]> wrote:
>
> I think the segments file will also change if documents are deleted from
> the index.
>
> Other ways to distribute the index will work as long as:
>
> 1) it makes a copy of the index that is in a consistent state
>
> 2) it keeps track of files that have changed (normally only a small
> number) and transfers them to the slave
>
> Lucene can certainly record a list of all new segment files added.  I
> think the tricky part is to ensure that a consistent copy of the index
> is being distributed.
>
> Bill
>
>
> On 3/27/06, jason rutherglen <[EMAIL PROTECTED]> wrote:
> >
> > I was thinking, would it not be possible to avoid using rsync and record
> > a list of all new segment files added (from within Lucene), and simply use
> > HTTP to sync down the newest ones?  Perhaps only using rsync after an
> > optimize?  Seems like if I understand Lucene correctly only new files are
> > created?
> >
> >
> >
>





Re: faceted browsing

2006-03-29 Thread Yonik Seeley
Solr has a lot of support to do faceted browsing, but one must
currently write a custom query handler to implement the faceting
logic.

The support includes:
  - custom query handlers:
  - the ability to return more data than just a list of documents
  - a filter cache with autowarming, for fast access to the filter for
each facet
  - more memory efficient and faster intersecting filter representations

The part I want in the future is simple faceted browsing without
having to write any plugins or Java code, so we need to come up with
a syntax to represent the desired faceting operations, and then
implement that syntax in the standard request handler.

To implement a custom query handler, you need to implement SolrRequestHandler
(http://incubator.apache.org/solr/docs/api/org/apache/solr/request/SolrRequestHandler.html)
and register it in solrconfig.xml.

-Yonik

On 3/29/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> I saw Yonik mentioned faceted browsing as something coming in the
> future of Solr, but I had thought it was one of the initial features
> from seeing this announcement ages ago:
>
>  Product-Category-Listings-t266441.html#a748420>
>
> If facets are part of the current Solr codebase, how are they
> configured and returned in the response?
>
> If they aren't currently possible with Solr, what would it take to
> implement it?
>
> I'm still, obviously, just scratching the surface of Solr as I
> evaluate it for replacing my custom XML-RPC based search server which
> does rudimentary facets using Filters and BitSet operations.
>
> By faceted browsing, a Query is used to search, Hits are returned,
> but also based on a subset of the fields (indexed, untokenized
> fields) the number of documents in each of these "facet" fields is
> returned as well to show counts by each facet.
>
> Thanks,
> Erik
>


Re: Rsync

2006-03-29 Thread Yonik Seeley
On 3/29/06, jason rutherglen <[EMAIL PROTECTED]> wrote:
> Perhaps a future project to increase the speed of the syncing to sub-minute 
> times.  Sounds like two files will change, in addition to segment files being 
> added.  Is this correct?  Or maybe other pieces such as cache reloading would 
> make this more difficult.

rsync will only copy over the changed index files, not the whole index
each time.

-Yonik
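
As a rough illustration of why only changed files cross the wire: rsync's default quick check skips a file when its size and modification time already match on the receiver. The Python sketch below mimics only that check for a flat directory. The helper sync_changed is hypothetical, not part of rsync or Solr, and it ignores rsync's delta transfer and its handling of deleted files.

```python
import os
import shutil


def sync_changed(src_dir, dst_dir):
    """Copy files from src_dir to dst_dir only when size or mtime differ."""
    os.makedirs(dst_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if not os.path.isfile(src):
            continue  # assumption: the index directory is flat
        dst = os.path.join(dst_dir, name)
        s = os.stat(src)
        if os.path.exists(dst):
            d = os.stat(dst)
            if d.st_size == s.st_size and int(d.st_mtime) == int(s.st_mtime):
                continue  # looks unchanged, so skip the copy
        shutil.copy2(src, dst)  # copy2 keeps the mtime so the next run can skip
        copied.append(name)
    return copied
```

Calling sync_changed twice in a row on an unchanged directory copies nothing the second time, which is the behaviour described above.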


Re: faceted browsing

2006-03-29 Thread Clay Webster
How could faceted browsing be accomplished without [Chris's] metadata
documents?

--cw

On 3/29/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>
> Solr has a lot of support to do faceted browsing, but one must
> currently write a custom query handler to implement the faceting
> logic.
>
> The support includes:
>   - custom query handlers:
>   - the ability to return more data than just a list of documents
>   - a filter cache with autowarming, for fast access to the filter for
> each facet
>   - more memory efficient and faster intersecting filter representations
>
> The part I want in the future is simple faceted browsing without
> having to write any plugins or Java code, so we need to come up with
> a syntax to represent the desired faceting operations, and then
> implement that syntax in the standard request handler.
>
> To implement a custom query handler, you need to implement
> SolrRequestHandler
>
> http://incubator.apache.org/solr/docs/api/org/apache/solr/request/SolrRequestHandler.html
> and register it in solrconfig.xml
>
> -Yonik
>
> On 3/29/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:
> > I saw Yonik mentioned faceted browsing as something coming in the
> > future of Solr, but I had thought it was one of the initial features
> > from seeing this announcement ages ago:
> >
> >  > Product-Category-Listings-t266441.html#a748420>
> >
> > If facets are part of the current Solr codebase, how are they
> > configured and returned in the response?
> >
> > If they aren't currently possible with Solr, what would it take to
> > implement it?
> >
> > I'm still, obviously, just scratching the surface of Solr as I
> > evaluate it for replacing my custom XML-RPC based search server which
> > does rudimentary facets using Filters and BitSet operations.
> >
> > By faceted browsing, a Query is used to search, Hits are returned,
> > but also based on a subset of the fields (indexed, untokenized
> > fields) the number of documents in each of these "facet" fields is
> > returned as well to show counts by each facet.
> >
> > Thanks,
> > Erik
> >
>


Re: faceted browsing

2006-03-29 Thread Yonik Seeley
On 3/29/06, Clay Webster <[EMAIL PROTECTED]> wrote:
> How could faceted browsing be accomplished without [Chris's] metadata
> documents?

The most basic form:

Consider a field called "category" that exists on each document.
You could then ask for the counts of the top 10 values in the category
field for all of the documents matching a query.

Possible syntax:   my user query; groupByField(category,10)

Another form would require the user to enumerate the facets... this
would work well for things like price ranges:

Possible syntax:   my user query; groupByQueries(price:[0 TO 10},
price:[10 TO 100}, price:[100 TO 1000})

And of course, one would want to be able to specify them all in a single query:

my user query; groupByField(category,10), groupByField(author,20),
groupByQueries(price:[0 TO 10}, price:[10 TO 100}, price:[100 TO
1000})


The thing that Chris' metadata documents also did was tell you *what*
facets to do, but that logic could also be kept in the client. 
Standardizing that is probably currently beyond the scope of what we
could put in the standard request handler.

-Yonik
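
To make the proposed semantics concrete, here is a small Python sketch of what groupByField and groupByQueries might compute over the documents matching a query. It is an illustration under stated assumptions, not the standard request handler: the function names are hypothetical, documents are plain dicts, and the mixed brackets in price:[0 TO 10} are read as a half-open range (lower bound included, upper bound excluded).

```python
from collections import Counter


def group_by_field(docs, field, limit):
    """Top `limit` (value, count) pairs for `field` among the matching docs."""
    counts = Counter(doc[field] for doc in docs if field in doc)
    return counts.most_common(limit)


def group_by_ranges(docs, field, ranges):
    """Count docs whose `field` falls in each half-open range [low, high)."""
    return {
        (low, high): sum(
            1 for doc in docs if field in doc and low <= doc[field] < high
        )
        for low, high in ranges
    }


# Made-up documents standing in for the results of "my user query".
matching = [
    {"category": "books", "price": 5},
    {"category": "books", "price": 10},
    {"category": "music", "price": 250},
]

print(group_by_field(matching, "category", 10))
# [('books', 2), ('music', 1)]
print(group_by_ranges(matching, "price", [(0, 10), (10, 100), (100, 1000)]))
# {(0, 10): 1, (10, 100): 1, (100, 1000): 1}
```

With half-open ranges a price of exactly 10 lands in only one bucket, so the counts across adjacent ranges never double-count a document.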


Solr admin page error on Solaris

2006-03-29 Thread Michael Levy

Hi,

I'm just trying to get Solr running.  I can run it OK on my Windows machine
as per the tutorial.  However, on my Solaris machine I'm having
trouble.  My newly installed Tomcat runs the various examples and
seems to be working OK.

Running:
Tomcat version 5.5.16
Java version 1.5.0_06-b05

Running Solaris 5.9 on sparc

I put last night's solr.war file in webapps and it shows up OK in Tomcat 
manager and admin.  I then pointed my browser to 
http://mydomain:8080/solr/admin/ and the resulting page looks like 
below.  Any help appreciated, thanks!


org.apache.jasper.JasperException: Exception in JSP: /admin/_info.jsp:7
4: 
5: 
6: <%@ page import="java.util.Date"%>
7: 
8: 

9: 
10: <%@include file="header.jsp" %>

Stacktrace:

org.apache.jasper.servlet.JspServletWrapper.handleJspException(JspServletWrapper.java:504)

org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:375)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

*root cause*

javax.servlet.ServletException

org.apache.jasper.runtime.PageContextImpl.doHandlePageException(PageContextImpl.java:858)

org.apache.jasper.runtime.PageContextImpl.handlePageException(PageContextImpl.java:791)
org.apache.jsp.admin.index_jsp._jspService(index_jsp.java:262)
org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:332)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

*root cause*

java.lang.NoClassDefFoundError
org.apache.jsp.admin.index_jsp._jspService(index_jsp.java:67)
org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)

org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:332)
org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:314)
org.apache.jasper.servlet.JspServlet.service(JspServlet.java:264)
javax.servlet.http.HttpServlet.service(HttpServlet.java:802)




Large test lucene DB to use with a SolR benchmark

2006-03-29 Thread Ian Holsman

Hi.

I'd like to do a benchmark of how Solr performs on the new Sun T2000
hardware compared to an x86-64 machine.  (Why? I'm trying to get the
HW for free; check out http://sunfirefan.com for more details.)  But
in order to do this I need a Lucene db to test against.


Does anyone know of a publicly available, production-size database
that I can use?  Otherwise I will generate something off dmoz or
randomly.



regards
Ian.
--
Ian Holsman
Zilbo.com / (425) 296-6771 USA/ ++61 (03) 9877-0909 Australia

A good hockey player plays where the puck is. A great hockey player  
plays where the puck is going to be.

 -- Wayne Gretzky