Re: fastest way to index/reindex

2009-02-25 Thread Josiane Gamgo
Thanks for your answer. This is what I am trying to do: I would like to find out how to customize the Lucene indexing process to obtain a faster search, either with Luke or with some other tool. On Mon, Feb 23, 2009 at 6:53 PM, Erick Erickson wrote: > please don't hijack topic threads, start a

Re: fastest way to index/reindex

2009-02-23 Thread Erick Erickson
Please don't hijack topic threads; start a new one: http://en.wikipedia.org/wiki/Thread_hijacking Best, Erick. MergeFactor isn't very related to searching, Luke isn't used in the indexing process, and why do you care how fast Luke is? When you start a new post on this topic, please give an idea of

Re: fastest way to index/reindex

2009-02-23 Thread Josiane Gamgo
How fast is the search if the MergeFactor of the Lucene index is set to 20 or more? Has anybody used Luke to optimize the indexing process? I would like to know how fast Luke is. Thanks On Tue, Jan 27, 2009 at 3:52 PM, Ian Connor wrote: > When you query by *:*, what order does it use. Is there a ch

Re: fastest way to index/reindex

2009-01-27 Thread Erik Hatcher
*:* will default to sorting by document insertion order (Lucene's document id, _not_ your Solr uniqueKey). And no, you won't miss any by paging - order will be maintained. Erik On Jan 27, 2009, at 9:52 AM, Ian Connor wrote: When you query by *:*, what order does it use. Is there a
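
A minimal sketch of the paging Erik describes, assuming a Solr select handler at a hypothetical base URL. `q`, `start`, and `rows` are the standard Solr query parameters; the helper names and the URL are made up for illustration. It only builds the request URLs and sends nothing.

```python
from urllib.parse import urlencode


def page_params(total_docs, rows=1000, query="*:*"):
    """Yield one dict of Solr select parameters per page of results."""
    for start in range(0, total_docs, rows):
        yield {"q": query, "start": start, "rows": rows}


def page_urls(base_url, total_docs, rows=1000):
    """Build the full request URL for every page (base_url is hypothetical)."""
    return [base_url + "?" + urlencode(p) for p in page_params(total_docs, rows)]


if __name__ == "__main__":
    # Hypothetical host and a small document count, just to show the pattern.
    for url in page_urls("http://localhost:8983/solr/select", 2500, rows=1000):
        print(url)
```

Each page starts where the previous one ended, so with a stable result order no document is skipped or repeated.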

Re: fastest way to index/reindex

2009-01-27 Thread Ian Connor
When you query by *:*, what order does it use? Is there a chance they will come in a different order as you page through the results (and miss/duplicate some)? Is it best to put the order explicitly by 'id' or is that implied already? On Mon, Jan 26, 2009 at 12:00 PM, Ian Connor wrote: > *:* took

Re: fastest way to index/reindex

2009-01-26 Thread Ian Connor
*:* took it up to 45/sec from 28/sec so a nice 60% bump in performance - thanks! On Sun, Jan 25, 2009 at 5:46 PM, Ryan McKinley wrote: > I don't know of any standard export/import tool -- I think Luke has > something, but it will be faster if you write your own. > > Rather than id:[* TO *], just

Re: fastest way to index/reindex

2009-01-26 Thread Ian Connor
I have about 2.5 million per shard and seem to be getting through 28/sec using 1000 at a time. It ran all yesterday and part of the night. It is over the 1.6 million mark now, so I hope it can keep up a similar rate as it gets deeper into the index. I need to reindex it all because I changed how so
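
A rough back-of-the-envelope check on these figures, assuming the rate stays constant (which the message itself is unsure about). The function name is made up for illustration; the inputs are the approximate numbers quoted above.

```python
def hours_to_process(docs, docs_per_sec):
    """Hours needed to process `docs` documents at a constant rate."""
    return docs / docs_per_sec / 3600


if __name__ == "__main__":
    total_docs = 2_500_000   # "about 2.5 million per shard"
    done_docs = 1_600_000    # "over the 1.6 million mark" (approximate)
    rate = 28                # documents per second

    print(f"full shard: {hours_to_process(total_docs, rate):.1f} h")
    print(f"remaining:  {hours_to_process(total_docs - done_docs, rate):.1f} h")
```

At 28/sec a full shard works out to roughly 24.8 hours, with about 8.9 hours left from the 1.6 million mark.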

Re: fastest way to index/reindex

2009-01-26 Thread Julian Davchev
I kinda don't get why you would reindex all data at once. Each document has a unique id, so you can reindex only what's needed. Also, if there is too much, I'd suggest using some batch processor that will add N tasks with range queries (1:10, 10:20, etc.) and a cron job executing those. Thousands seems OK but w
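
One way the batch idea above could be sketched, assuming numeric ids stored in a field whose range queries compare numerically (a plain string field would compare lexicographically instead). The function name and the field name `id` are assumptions for illustration; `field:[low TO high]` is the standard inclusive range syntax in Lucene/Solr queries.

```python
def id_range_queries(min_id, max_id, batch_size, field="id"):
    """Split [min_id, max_id] into non-overlapping inclusive range queries."""
    queries = []
    for low in range(min_id, max_id + 1, batch_size):
        # Clamp the last batch so it never runs past max_id.
        high = min(low + batch_size - 1, max_id)
        queries.append(f"{field}:[{low} TO {high}]")
    return queries


if __name__ == "__main__":
    # Each query could become one task for a scheduled job to pick up.
    for q in id_range_queries(1, 25, 10):
        print(q)
```

The ranges here do not share endpoints (1 to 10, then 11 to 20), unlike the 1:10, 10:20 shorthand in the message, so that no document is processed twice with inclusive brackets.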

Re: fastest way to index/reindex

2009-01-25 Thread Ryan McKinley
I don't know of any standard export/import tool -- I think Luke has something, but it will be faster if you write your own. Rather than id:[* TO *], just try *:* -- this should match all documents without using a range query. On Jan 25, 2009, at 3:16 PM, Ian Connor wrote: Hi, Given the