Re: Indexing large documents

2014-03-19 Thread Tom Burton-West
Hi Stephen, We regularly index documents in the range of 500KB-8GB on machines that have about 10GB devoted to Solr. To avoid OOMs on Solr versions prior to Solr 4.0, we use indexing machine(s) separate from the search server machine(s) and also set the termIndexInterval to 8 times th

Re: Indexing large documents

2014-03-19 Thread Alexei Martchenko
Even the most unstructured data has to have some breakpoint. I've seen projects running Solr that indexed whole books as one document per chapter, plus a boosted synopsis doc. The question here is how you need to search and match those docs. alexei martchenko
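
A minimal sketch of the one-document-per-chapter layout described above. The field names (`id`, `book_id`, `type`, `text`, `boost`) and the boost value are assumptions made for illustration, not an actual Solr schema; how the boost hint is applied at index or query time is left open.

```python
def book_to_docs(book_id, chapters, synopsis, synopsis_boost=2.0):
    """Return one dict per chapter plus one synopsis dict carrying a boost hint.

    All field names are hypothetical; adapt them to your own schema.
    """
    docs = []
    for number, text in enumerate(chapters, start=1):
        docs.append({
            "id": f"{book_id}-ch{number}",
            "book_id": book_id,   # lets results be grouped back into the book
            "type": "chapter",
            "text": text,
        })
    # The synopsis is a separate, smaller document with a higher weight hint.
    docs.append({
        "id": f"{book_id}-synopsis",
        "book_id": book_id,
        "type": "synopsis",
        "text": synopsis,
        "boost": synopsis_boost,
    })
    return docs
```

Keeping `book_id` on every document is a design choice that lets chapter-level hits be collapsed back to the parent book when results are displayed.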

Re: Indexing large documents

2014-03-18 Thread Otis Gospodnetic
Hi, I think you probably want to split giant documents because you / your users probably want to be able to find smaller sections of those big docs that are the best matches to their queries. Imagine querying War and Peace. Almost any regular word you query for will produce a match. Yes, you may w
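
One way to split a giant document into smaller sections before indexing, as a rough sketch. Splitting on whitespace, the 500-word section size, and the 50-word overlap are all assumptions chosen for illustration; real documents usually have better natural breakpoints (chapters, headings, pages).

```python
def split_into_sections(text, max_words=500, overlap=50):
    """Split text into word-based sections that overlap slightly.

    The overlap keeps phrases that straddle a boundary searchable
    in at least one section.
    """
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    sections = []
    for start in range(0, len(words), step):
        sections.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last section already reaches the end of the text
    return sections
```

Each returned section can then be indexed as its own document, so a query matches the most relevant part of the book rather than the book as a whole.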

Re: Indexing large documents

2007-08-20 Thread Fouad Mardini
Thanks, I reindexed the documents and now it works; it seems there was an issue with text extraction. I also changed the maxFieldLength, which must have helped. Thanks On 8/20/07, Pieter Berkel <[EMAIL PROTECTED]> wrote: > > You will probably need to increase the value of maxFieldLength in your >

Re: Indexing large documents

2007-08-20 Thread Pieter Berkel
You will probably need to increase the value of maxFieldLength in your solrconfig.xml. The default value is 10000, which might explain why your documents are not being completely indexed. Piete On 20/08/07, Peter Manis <[EMAIL PROTECTED]> wrote: > > The that should show some errors if something
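
To see why a field-length cap leaves documents "not completely indexed", here is a small simulation. It is not Lucene or Solr code: whitespace tokenization stands in for the real analyzer, and the limit is a plain parameter whose value below is only illustrative.

```python
def indexed_tokens(text, max_field_length):
    """Return the tokens that would survive a cap of max_field_length tokens."""
    return text.split()[:max_field_length]


def is_truncated(text, max_field_length):
    """True if the text has more whitespace tokens than the cap allows."""
    return len(text.split()) > max_field_length


# Terms past the cap are silently dropped, so they can never match a query.
sample = "alpha beta gamma delta"
kept = indexed_tokens(sample, 2)
print(kept, "delta" in kept)
```

Running a check like `is_truncated` over extracted text before posting it gives an early warning that the tail of a large document may not be searchable.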

Re: Indexing large documents

2007-08-20 Thread Peter Manis
The that should show some errors if something goes wrong; if not, the console usually will. The errors will look like Java stack trace output. Did increasing the heap do anything for you? Changing mine to 256MB max worked fine for all of our files. On 8/20/07, Fouad Mardini <[EMAIL PROTECTED]>

Re: Indexing large documents

2007-08-20 Thread Fouad Mardini
Well, I am using the Java textmining library to extract text from documents, then I do a POST to Solr. I do not have an error log; I only have *.request.log files in the logs directory. Thanks On 8/20/07, Peter Manis <[EMAIL PROTECTED]> wrote: > > Fouad, > > I would check the error log or console f

Re: Indexing large documents

2007-08-20 Thread Peter Manis
Fouad, I would check the error log or console for any possible errors first. They may not show up; it really depends on how you are processing the Word document (custom Solr, feeding the text to it, etc). We are using a custom version of Solr with PDF, DOC, XLS, etc text extraction and I have suc

RE: Indexing large documents

2007-08-20 Thread praveen jain
Hi, I want to know how to update my .xml file, which has fields other than the default ones: which file do I have to modify, and how? pRAVEEN jAIN +919890599250 -Original Message- From: Fouad Mardini [mailto:[EMAIL PROTECTED] Sent: Monday, August 20, 2007 4:00 PM To: solr-user@lucene.ap