On Feb 5, 2007, at 11:15 AM, Yonik Seeley wrote:
> On 2/5/07, Erik Hatcher <[EMAIL PROTECTED]> wrote:
>> This week I'm going to be incrementally loading up to 3.7M records
>> into Solr, in 50k chunks.
>> I'd like to capture some performance numbers after each chunk to see
>> how she holds up.
>> What numbers are folks capturing? What techniques are you using to
>> capture numbers? I'm not looking for anything elaborate, as the goal
>> is really to see how faceting fares as more data is loaded. We've
>> got some ugly data in our initial experiment, so the faceting
>> concerns me.
>
> Gulp... me too. That sounds like a lot of data, and the faceting code
> is still young (it will get better with age :-)
> The big performance factor for faceting will relate to the number of
> unique values in a field.
> So what are you trying to facet on?
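For capturing per-chunk numbers, a small harness along these lines might do: it times a facet request per field after each load and shows how latency grows with index size. This is only a sketch under assumptions from the thread -- the actual query function is passed in as a callable (e.g. an HTTP GET against a local Solr's /select handler with facet=true), and the field names and chunk count are illustrative, not anything the posters specified.

```python
import time

def benchmark_facets(run_facet_query, facet_fields, repeats=3):
    """Time one facet query per field; return (field, best_seconds) pairs.

    run_facet_query is any callable that executes a facet request for a
    given field name -- for example, an HTTP GET against Solr's /select
    handler with facet=true&facet.field=<field>. Best-of-N is reported
    to smooth out cache-warmup noise.
    """
    results = []
    for field in facet_fields:
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            run_facet_query(field)
            timings.append(time.perf_counter() - start)
        results.append((field, min(timings)))
    return results

# Usage sketch: after each 50k chunk is committed, record the numbers.
# A stub stands in for the real Solr call here.
def fake_query(field):
    time.sleep(0.001)  # pretend to hit Solr

for chunk in range(1, 4):  # would be ~74 chunks for 3.7M docs at 50k each
    for field, secs in benchmark_facets(fake_query, ["genre", "subject", "format"]):
        print(f"chunk {chunk:3d}  {field:10s}  {secs * 1000:.1f} ms")
```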
The facets are bibliographic metadata about library holdings, such as
genre, subject, format, published date (year), and others. Basically
an open source thing like this:
<http://www2.lib.ncsu.edu/catalog/?N=201015&Ns=Call+Number+sort%7c0&sort=5>
(if that link didn't work, hit the main page at
<http://www.lib.ncsu.edu/catalog/browse.html> and drill in a little)
The data is really ugly, and there are typically several values per
field, so all facets are currently set as multiValued.
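A minimal sketch of how such fields might be declared in Solr's schema.xml, assuming the field names mentioned above (the names and types here are illustrative, not the actual schema): multiValued="true" lets a record carry several genres or subjects, and an untokenized string type keeps each facet value intact as a single term.

```xml
<!-- hypothetical facet fields; names assumed from the thread -->
<field name="genre"    type="string" indexed="true" stored="true" multiValued="true"/>
<field name="subject"  type="string" indexed="true" stored="true" multiValued="true"/>
<field name="format"   type="string" indexed="true" stored="true" multiValued="true"/>
<field name="pub_year" type="string" indexed="true" stored="true" multiValued="true"/>
```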
We shall see!
Erik