> By "global" do you mean Solr as the search solution for all those collections, or do you mean having all those different types of documents (jobs, autos, classifieds) in a single Solr index?

Yes, I meant the latter: all of them in a single index. I envisioned separating the document types with a custom field named "vertical" and, within each vertical, a "category" field.
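To illustrate the single-index idea, here is a minimal Python sketch of a query restricted to one vertical and category using Lucene field:value syntax. The host, port, and /select path are assumptions based on a default local test setup, the field names are assumed to be indexed exactly as "vertical" and "category", and build_query_url is a hypothetical helper, not part of Solr.

```python
from urllib.parse import urlencode

# Assumed base URL of a local Solr test instance; adjust for a real deployment.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_query_url(keywords, vertical, category=None):
    """Build a select URL that limits a keyword search to one vertical.

    Hypothetical helper for illustration. Field values are assumed to be
    single tokens (no spaces), so they are not quoted here.
    """
    clauses = [f"vertical:{vertical}"]
    if category is not None:
        clauses.append(f"category:{category}")
    # Parenthesize the user's keywords so the AND applies to the whole group.
    clauses.append(f"({keywords})")
    query = " AND ".join(clauses)
    # urlencode percent-escapes the colons, spaces, and parentheses.
    return f"{SOLR_SELECT_URL}?{urlencode({'q': query})}"


if __name__ == "__main__":
    print(build_query_url("registered nurse", "jobs", "healthcare"))
    print(build_query_url("honda civic", "autos"))
```

Leaving out the vertical clause entirely would give the "global" search across all document types from the same index.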
> Unless there is a good reason to put multiple document types in the same index, you will get better performance by putting them in their own index.

So my educated guess is that I would create additional "schema" XML elements in my schema.xml, one each for jobs, homes, cars, news, obits, etc. (in the tutorial, I note the schema name "example"), and my search query strings would have to specify which schema to use. But I don't see a query variable for "schema".

NumDocs: It looks like I am going to have an index of about 300,000 documents initially, and it should grow by about 150 per day.

On 6/2/06, Yonik Seeley <[EMAIL PROTECTED]> wrote:
On 6/2/06, Tim Archambault <[EMAIL PROTECTED]> wrote:
> That'll be fine. As you can probably tell, I'm not a programmer. I am just a
> dangerous end-user with expertise in marketing & online operations trying to
> save a buck. I am going to try to learn XSL or if that doesn't work, I'll
> bastardize the results into a ColdFusion recordset.
>
> I know I shouldn't ask you questions directly, but I have to ask you.
>
> How many queries per minute can Solr handle in a high use situation?

It depends on how many documents are in the collection, the nature of the documents (unique terms, size of fields, etc.), the CPU and memory of your hardware, and, most heavily, the nature of the queries. I've seen up to 1000 queries/sec for very simple queries on a 1M doc index.

> Our
> website gets about 4 million page views a month and about 40,000 daily
> visitors,

That shouldn't be a problem unless the collection is just too big. It's pretty easy to scale Solr to higher query traffic by putting more query servers behind a load balancer, *provided* that the latency of a single query is acceptable. If the collection is too big (too many documents, or documents that are too large), then you need to split up the collection and use federated search (Solr doesn't have it yet, but it will in the future).

> I am envisioning Solr
> being the search engine for our jobs, autos, classifieds, and as a "global"
> search experience that includes them all. I really want to greatly limit the
> use of database connections on our site. Do you think Solr can be a "global"
> solution for search on our site?

By "global" do you mean Solr as the search solution for all those collections, or do you mean having all those different types of documents (jobs, autos, classifieds) in a single Solr index?

Unless there is a good reason to put multiple document types in the same index, you will get better performance by putting them in their own index.
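For a sense of scale, the traffic and index-size figures quoted in this thread can be turned into rough rates. This is only back-of-the-envelope arithmetic in Python: the 30-day month, the worst case of one search query per page view, and the 10x peak factor are all assumptions, not numbers from the thread.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_queries_per_second(page_views_per_month, days_per_month=30):
    """Average query rate, assuming every page view issues one query."""
    return page_views_per_month / (days_per_month * SECONDS_PER_DAY)


def projected_docs(initial_docs, docs_per_day, days):
    """Index size after a number of days of steady growth."""
    return initial_docs + docs_per_day * days


if __name__ == "__main__":
    avg = average_queries_per_second(4_000_000)
    peak = avg * 10  # assumed peak-to-average ratio, purely illustrative
    print(f"average: {avg:.2f} queries/sec, assumed peak: {peak:.1f} queries/sec")
    print(f"documents after one year: {projected_docs(300_000, 150, 365)}")
```

Under these assumptions the average load is under 2 queries/sec and the index stays well below 1M documents after a year, both far from the 1000 queries/sec on a 1M doc index mentioned above for very simple queries. Real queries may be much more expensive than that best case.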
> Which Java-based web server component do you recommend for a Windows
> platform? Tomcat? Another? I know nothing about these tools. I am using
> Jetty for testing.

Tomcat is the most widely used, I think, so it is easier to find docs and help/support for it. I started a little Tomcat installation guide on the Wiki last night.

-Yonik