Hi,

I think that, at least before Lucene 4.0, only one process/thread can be allowed to write to a given Lucene index folder. Based on this, my initial plan is:
1) There is one set of Lucene index folders.
2) The Solr server only performs queries against those folders.
3) A separate multi-threaded process indexes those Lucene folders (each folder belongs to a separate app). Only one thread will index any given Lucene folder.

Thanks very much for your help,
Lisheng

-----Original Message-----
From: Mikhail Khludnev [mailto:mkhlud...@griddynamics.com]
Sent: Thursday, July 26, 2012 10:15 AM
To: solr-user@lucene.apache.org
Subject: Re: Bulk indexing data into solr

Coming back to your original question: I'm a little puzzled. It's not clear where you want to call the Lucene API directly from. If you mean that you have a standalone indexer that writes the index files and then stops, after which those files become available to the Solr process, it will work. Sharing an index between processes, or using EmbeddedServer, is asking for trouble (although Lucene has a locking mechanism, which I'm not completely familiar with).

I can conclude that your data for indexing is collocated with the Solr server. In this case consider http://wiki.apache.org/solr/ContentStream#RemoteStreaming

Please give more details about your design.

On Thu, Jul 26, 2012 at 1:22 PM, Zhang, Lisheng <lisheng.zh...@broadvision.com> wrote:

> Hi,
>
> I am starting to use Solr. I need to index a rather large amount of
> data, and it seems that passing data to Solr over HTTP is rather
> inefficient. I am thinking of still calling the Lucene API directly for
> bulk indexing but using Solr for search. Is this design OK?
>
> Thanks very much for your help, Lisheng

--
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics

<http://www.griddynamics.com>
<mkhlud...@griddynamics.com>
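
P.S. To make point 3 of the plan at the top of this message more concrete, here is a minimal sketch of the "one writer thread per index folder" routing. It is written in Python only to keep it short and does not touch Lucene at all: the class PerFolderIndexer, the function record_write and the folder names app_a / app_b are all hypothetical stand-ins, and the real write step would be whatever code opens the Lucene index for that folder.

```python
from concurrent.futures import ThreadPoolExecutor
import threading


class PerFolderIndexer:
    """Routes every write for a folder to that folder's single worker thread."""

    def __init__(self, folders, write_batch):
        # write_batch(folder, docs) is a placeholder for the real index write.
        self._write_batch = write_batch
        # max_workers=1 means each folder never has more than one writer thread.
        self._pools = {f: ThreadPoolExecutor(max_workers=1) for f in folders}

    def submit(self, folder, docs):
        # Batches for the same folder queue up behind each other;
        # batches for different folders run in parallel.
        return self._pools[folder].submit(self._write_batch, folder, docs)

    def close(self):
        # Wait for all queued batches to finish before returning.
        for pool in self._pools.values():
            pool.shutdown(wait=True)


# Hypothetical stand-in for the real write: it only records which thread
# handled each folder, so the one-writer-per-folder property can be checked.
writers_seen = {}


def record_write(folder, docs):
    writers_seen.setdefault(folder, set()).add(threading.get_ident())
    return len(docs)


indexer = PerFolderIndexer(["app_a", "app_b"], record_write)
futures = [
    indexer.submit(folder, [f"doc{i}"])
    for i in range(5)
    for folder in ("app_a", "app_b")
]
total = sum(fut.result() for fut in futures)
indexer.close()
```

With this routing, different folders are indexed in parallel while every folder only ever sees one writer thread. It says nothing about whether Solr can safely read those folders while they are being written, which is the open question in this thread.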