RE: Bulk indexing data into solr

Zhang, Lisheng Thu, 26 Jul 2012 11:48:23 -0700

Hi,

I think that, at least before Lucene 4.0, only one process/thread can write to
a Lucene folder at a time. Based on this, my initial plan is:

1) There is one set of Lucene index folders.
2) The Solr server only performs queries against those folders.
3) A separate (multi-threaded) process indexes those Lucene folders (each
   folder belongs to a separate app). Only one thread will index any given
   Lucene folder.
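
To make point 3 concrete, here is a minimal sketch of the scheduling idea only, written in Python for brevity. It assumes documents arrive as (folder, document) pairs. The names `index_folder` and `bulk_index` are hypothetical, and `index_folder` is a placeholder for the real Lucene writing code, which would be Java.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def index_folder(folder, docs):
    """Hypothetical placeholder for the real per-folder writer.

    In the real indexer, this is where a single writer for `folder`
    would be opened, fed with `docs`, and closed again.
    """
    return folder, len(docs)


def bulk_index(docs_with_folder, max_workers=4):
    """Group documents by target folder, then index each folder in one task."""
    by_folder = defaultdict(list)
    for folder, doc in docs_with_folder:
        by_folder[folder].append(doc)

    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Exactly one task is submitted per folder, so no two threads
        # ever write to the same folder at the same time.
        futures = [pool.submit(index_folder, folder, docs)
                   for folder, docs in by_folder.items()]
        for future in futures:
            folder, count = future.result()
            results[folder] = count
    return results
```

Different folders are indexed in parallel, but each folder is handled by a single task, so the one-writer-per-folder rule holds without any extra locking inside this process.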

Thanks very much for your help, Lisheng

-----Original Message-----
From: Mikhail Khludnev [mailto:mkhlud...@griddynamics.com]
Sent: Thursday, July 26, 2012 10:15 AM
To: solr-user@lucene.apache.org
Subject: Re: Bulk indexing data into solr

Coming back to your original question: I'm a little puzzled.
It's not clear where you want to call the Lucene API directly from.
If you mean that you have a standalone indexer that writes the index files and
then stops, after which the files become available to the Solr process, it will
work. Sharing an index between processes, or using EmbeddedServer, is asking for
trouble (even though Lucene has a locking mechanism, which I'm not completely
familiar with).
I gather that your data for indexing is collocated with the Solr
server. In that case, consider
http://wiki.apache.org/solr/ContentStream#RemoteStreaming
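
As a rough sketch of what such a request could look like: with remote streaming, Solr reads a file from its own local disk instead of receiving the data in the HTTP body. The example below only builds the request URL. The host, port, file path, and the choice of the CSV update handler are assumptions for illustration, and remote streaming has to be enabled in solrconfig.xml (enableRemoteStreaming="true" on the requestParsers element) before Solr will accept the request.

```python
from urllib.parse import urlencode


def remote_stream_url(solr_base, local_path, handler="/update/csv", commit=True):
    """Build an update URL that asks Solr to read `local_path` itself.

    `local_path` must be readable by the Solr server process; it is a
    path on the Solr machine, not on the client.
    """
    params = {
        "stream.file": local_path,
        "stream.contentType": "text/csv;charset=utf-8",
    }
    if commit:
        params["commit"] = "true"
    return solr_base.rstrip("/") + handler + "?" + urlencode(params)


# Assumed host, port, and file path, for illustration only.
url = remote_stream_url("http://localhost:8983/solr", "/data/docs.csv")
print(url)
```

Requesting that URL (for example with curl or urllib.request.urlopen) would trigger the indexing, and only the short request travels over HTTP, not the data itself.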

Please give more details about your design.

On Thu, Jul 26, 2012 at 1:22 PM, Zhang, Lisheng <
lisheng.zh...@broadvision.com> wrote:

>
> Hi,
>
> I am starting to use Solr, and I now need to index a rather large amount of
> data. Passing the data to Solr through HTTP seems rather inefficient, so I am
> thinking of still calling the Lucene API directly for bulk indexing while
> using Solr for search. Is this design OK?
>
> Thanks very much for your help, Lisheng
>
>

-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics

<http://www.griddynamics.com>
 <mkhlud...@griddynamics.com>
