How does highlighting work?

2008-03-03 Thread Geir Gullestad Pettersen
I've added &hl=on&hl.fl=* to enable highlighting on all fields. However, the result does not contain any snippets of the matching terms, the only extra thing I get is the following: (..) "file:/H:/Documents/QUT/Report2205.doc" is the unique key for the document returned by the

multi core vs multi app

2008-03-03 Thread David Pratt
I am trying to decide whether it is best to work with multiple apps in a Tomcat instance or run a single app with multiple cores. How do these options compare in terms of impact on RAM requirements? Is anyone using the multicore in production to suggest whether it is stable enough to use with

Re: Federated Search

2008-03-03 Thread Grégoire Neuville
Ok, thanks a lot for your answer. I'm going to investigate that way (Distributed Search), though after reading this (http://www.nabble.com/Multiple-Search-in-Solr-td15268564.html), I'll keep in mind the possibility of 'tweaking' Solr with the LuceneWebService as an inspiration tool; I've also been

Solr in a highly memory constrained environment (ie. VPS) - stupid idea?

2008-03-03 Thread Micah Wedemeyer
Hi, I've used Solr a little at work where we have our own hardware with all the memory we want. However, I would also like to use Solr on a small-ish website that I run off of a VPS with 512MB of RAM. I tried this (untuned) for a while, and Tomcat/Solr would just grab up all my memory until

Re: Solr in a highly memory constrained environment (ie. VPS) - stupid idea?

2008-03-03 Thread Yonik Seeley
Yup, should definitely be doable with the small number of docs and traffic.
- comment out the query cache in solrconfig.xml
- reduce the size of the document cache to maybe 50 entries or so
- reduce maxBufferedDocs to 10 to limit indexing memory (a better option will be available soon in Solr 1.3

RE: Proposition of a new feature: Dynamic Field Types

2008-03-03 Thread nicolas . dessaigne
> How many languages are you dealing with?
The number of languages depends greatly on the project. We are usually dealing with 2 or 3 languages and I've yet to see a project with more than 5.
> How are you generating your queries?
With a specific handler (based on the DisMax) that has an extra

RE: Proposition of a new feature: Dynamic Field Types

2008-03-03 Thread nicolas . dessaigne
You're right Yonik, that's what I meant. As for an example, let us place ourselves in two situations:
- Situation A: one field ("text") where the texts of all documents are indexed;
- Situation B: one field per language ("text_en", "text_fr", etc.) where only the texts of documents in that languag

Re: multi core vs multi app

2008-03-03 Thread Otis Gospodnetic
Quick answers. 2 webapps one core/index each vs. 1 webapp with 2 cores (but there is also 1 webapp with 2 virtual webapps, one core/index each). If RAM is an issue, I'd think 1 webapp would be slightly gentler on your RAM. Don't think you can search against multiple cores "automatically" - i.e

out of memory every time

2008-03-03 Thread Justin
I'm indexing a large number of documents. As a server I'm using the /solr/example/start.jar. No matter how much memory I allocate, it fails around 7200 documents. I am committing every 100 docs, and optimizing every 300. All of my XML files contain one doc, and can range in size from 2k to 700k. When

Re: Strategy for handling large (and growing) index: horizontal partitioning?

2008-03-03 Thread Kevin Lewandowski
How many documents are in the index? If you haven't already done this I'd take a really close look at your schema and make sure you're only storing the things that should really be stored, same with the indexed fields. I drastically reduced my index size just by changing some indexed/stored option

Re: How does highlighting work?

2008-03-03 Thread Mike Klaas
On 3-Mar-08, at 1:26 AM, Geir Gullestad Pettersen wrote:
> I've added &hl=on&hl.fl=* to enable highlighting on all fields. However, the result does not contain any snippets of the matching terms, the only extra thing I get is the following:
hl.fl=* is not a valid setting. You need to speci
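A minimal sketch of the point in this reply: name the highlight fields explicitly instead of passing hl.fl=*. Only the hl and hl.fl parameters come from the thread; the host, port, handler path, query, and the field names title and content are hypothetical placeholders.

```python
from urllib.parse import urlencode


def build_highlight_url(base_url, query, fields):
    """Build a select URL that asks for highlighting on explicitly named fields."""
    params = {
        "q": query,
        "hl": "on",
        # hl.fl takes a delimited list of field names rather than "*"
        "hl.fl": ",".join(fields),
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and field names, for illustration only.
url = build_highlight_url(
    "http://localhost:8983/solr/select", "report", ["title", "content"]
)
print(url)
```

Whether snippets come back also depends on how the named fields are defined in the schema, which the truncated message does not show.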

Re: out of memory every time

2008-03-03 Thread Thorsten Scherler
On Mon, 2008-03-03 at 21:43 +0200, Justin wrote:
> I'm indexing a large number of documents.
>
> As a server I'm using the /solr/example/start.jar
>
> No matter how much memory I allocate it fails around 7200 documents.
How do you allocate the memory? Something like: java -Xms512M -Xmx1500M -ja

Re: out of memory every time

2008-03-03 Thread Reece
Just guessing, but I'd say it has something to do with the dynamic fields... I ran a similar operation (docs ranged from 1K to 2MB). For the initial indexing, I wrote a job to submit about 100,000 documents to solr, committing after every 10 docs. I never sent any optimize commands. I also used

RE: out of memory every time

2008-03-03 Thread Jae Joo
While the job is running, you can monitor the memory usage with jstat (you can find it in the java/bin directory): jstat -gc <pid> 5s --> every 5 seconds. Jae -----Original Message----- From: Reece [mailto:[EMAIL PROTECTED] Sent: Mon 3/3/2008 8:20 PM To: solr-user@lucene.apache.

Re: multi core vs multi app

2008-03-03 Thread Ryan McKinley
Otis Gospodnetic wrote:
> Quick answers. 2 webapps one core/index each vs. 1 webapp with 2 cores (but there is also 1 webapp with 2 virtual webapps, one core/index each). If RAM is an issue, I'd think 1 webapp would be slightly gentler on your RAM.
I think we should emphasize the *slightly*

Re: out of memory every time

2008-03-03 Thread Otis Gospodnetic
Hi, You are saying a doc can be up to 700KB and your maxBufferedDocs is set to 900. Multiply these two numbers and I think you'll see that this number is greater than your JVM's default heap. Also, save the optimize call for the end and your overall indexing time will be shorter. Otis -- S
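The multiplication suggested in this reply can be worked through as a rough sketch. The 700 KB document size and the maxBufferedDocs value of 900 come from the thread; the 64 MB default heap is an assumption (the real default depends on the JVM version and mode), and treating every buffered document as maximum-sized is a worst-case simplification that counts raw document size only.

```python
MAX_DOC_KB = 700               # largest document size mentioned in the thread
MAX_BUFFERED_DOCS = 900        # maxBufferedDocs value mentioned in the thread
ASSUMED_DEFAULT_HEAP_MB = 64   # assumption: varies by JVM version and mode


def worst_case_buffer_mb(max_doc_kb, max_buffered_docs):
    """Upper bound on buffered raw document data, in MB, if every doc is maximum-sized."""
    return max_doc_kb * max_buffered_docs / 1024


buffer_mb = worst_case_buffer_mb(MAX_DOC_KB, MAX_BUFFERED_DOCS)
exceeds_heap = buffer_mb > ASSUMED_DEFAULT_HEAP_MB

print(f"worst-case buffered data: {buffer_mb:.0f} MB")
print(f"exceeds assumed default heap: {exceeds_heap}")
```

Under these assumptions the buffer could reach roughly 615 MB, well beyond a small default heap, which is consistent with lowering maxBufferedDocs or raising -Xmx.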

Re: multi core vs multi app

2008-03-03 Thread David Pratt
Hi there. Many thanks for your replies. They have helped me determine a direction. It is a great thing to have both options available (and to better understand the pros and cons of each). Regards David
Ryan McKinley wrote:
> Otis Gospodnetic wrote:
>> Quick answers. 2 webapps one core/index each

Re: Strategy for handling large (and growing) index: horizontal partitioning?

2008-03-03 Thread James Brady
Hi Kevin, Thanks for your suggestions - I've got about 6 million, and am being quite stingy with my schema at the moment I'm afraid. If anything, the size of each document is going to go up, not down, but I might be able to prune some older, unused data. James On 3 Mar 2008, at 14:33, Kev