> By "global" do you mean Solr as the search solution for all those
> collections, or do you mean having all those different types of
> documents (jobs, autos, classifieds) in a single Solr index?
Yes, I meant the latter: a single index. I envisioned separating the
documents with a custom field named "vertical" and, within each
vertical, a second field named "category".
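
A minimal sketch of how such a single-index search might be restricted to one vertical and category. It assumes "vertical" and "category" are declared as indexed fields in schema.xml, and that Solr answers at the tutorial's default address; the helper name build_select_url and the sample values are hypothetical, for illustration only.

```python
from urllib.parse import urlencode


def build_select_url(base_url, text, vertical=None, category=None):
    """Build a Solr /select URL whose query requires the given field values.

    Assumes "vertical" and "category" are indexed fields in schema.xml.
    Values are wrapped in double quotes but not otherwise escaped, so this
    sketch is only suitable for simple values without embedded quotes.
    """
    # "+" marks a clause as required in Lucene query syntax.
    clauses = [f"+({text})"]
    if vertical is not None:
        clauses.append(f'+vertical:"{vertical}"')
    if category is not None:
        clauses.append(f'+category:"{category}"')
    # urlencode percent-escapes the "+", ":" and quote characters.
    return f"{base_url}/select?" + urlencode({"q": " ".join(clauses)})


if __name__ == "__main__":
    # Host and port are assumptions (the tutorial's default Jetty setup).
    print(build_select_url("http://localhost:8983/solr", "nurse",
                           vertical="jobs", category="healthcare"))
```

Leaving out both field arguments gives the "global" search across every vertical, which is the main attraction of keeping everything in one index.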

> Unless there is a good reason to put multiple document types in the
> same index, you will get better performance by putting them in their
> own index.
So my educated guess is that I would create additional "schema" XML
elements in my schema.xml, one each for jobs, homes, cars, news, obits,
etc. (in the tutorial, I note the schema name "example"), and my search
query strings would have to specify which schema to use. But I don't
see a variable for "schema".

NumDocs: It looks like I am going to have an index of about 300,000
documents initially, and it should grow by about 150 per day.


On 6/2/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:

On 6/2/06, Tim Archambault <[EMAIL PROTECTED]> wrote:
> That'll be fine. As you can probably tell, I'm not a programmer. I am
> just a dangerous end-user with expertise in marketing & online
> operations trying to save a buck. I am going to try to learn XSL or if
> that doesn't work, I'll bastardize the results into a coldfusion
> recordset.
>
> I know I shouldn't ask you questions directly, but I have to ask you.
>
> How many queries per minute can Solr handle in a high use situation?

It depends on how many documents are in the collection, the nature of
the documents (unique terms, size of fields, etc), and heavily depends
on the nature of the queries, and the CPU and memory of your hardware.

I've seen up to 1000 queries/sec for very simple queries on a 1M doc
index.

> Our
> website gets about 4 million page views a month and about 40,000 daily
> visitors,

That shouldn't be a problem unless the collection is just too big.
It's pretty easy to scale Solr to higher query traffic by putting more
query servers behind a load balancer, *provided* that the latency of a
single query is acceptable.  If the collection is too big (too many
documents, or documents that are too large), then you need to split up
the collection and use federated search (Solr doesn't have it yet, but
it will in the future).
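
A rough back-of-the-envelope check of the figures in this thread, as a sketch rather than a benchmark. It assumes a 30-day month, at most one search per page view, and a peak load ten times the average; all three are illustrative assumptions, not measurements. The 1M-document figure is only the reference point quoted above, not a Solr limit.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate_per_second(page_views_per_month, days_per_month=30):
    """Average page views per second, assuming a 30-day month."""
    return page_views_per_month / (days_per_month * SECONDS_PER_DAY)


def index_size_after(days, initial_docs=300_000, docs_per_day=150):
    """Projected document count after the given number of days."""
    return initial_docs + docs_per_day * days


if __name__ == "__main__":
    avg = average_rate_per_second(4_000_000)
    peak = avg * 10  # assumed peak-to-average ratio
    print(f"average: {avg:.2f} views/sec, assumed peak: {peak:.1f} views/sec")
    print(f"documents after one year: {index_size_after(365):,}")
```

Even if every page view triggered a search, the average works out to under 2 queries per second, far below the 1000 queries/sec figure mentioned for simple queries, and the index would take years to reach 1M documents at 150 new documents a day.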

> I am envisioning Solr being the search engine for our jobs, autos,
> classifieds, and as a "global" search experience that includes them
> all. I really want to greatly limit the use of database connections on
> our site. Do you think Solr can be a "global" solution for search on
> our site?

By "global" do you mean Solr as the search solution for all those
collections, or do you mean having all those different types of
documents (jobs, autos, classifieds) in a single Solr index?

Unless there is a good reason to put multiple document types in the
same index, you will get better performance by putting them in their
own index.

> Which java-based web server component do you recommend for a windows
> platform? Tomcat? Another? I know nothing about these tools. I am using
> Jetty for testing.

Tomcat is the most widely used, I think, so it is easier to find docs
and help/support for it.  I started a little Tomcat installation guide
on the Wiki last night.

-Yonik
