Re: Indexing strategies?

Erick Erickson Wed, 12 Feb 2014 07:54:00 -0800

I'd seriously consider a SolrJ program that pulled the necessary data from
two of your systems, held it in cache and then pulled the data from your
main system and enriched it with the cached data.
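
A rough sketch of that cache-then-enrich step, in plain Java rather than
SolrJ itself. The field names (id, listing_id) and the helper names are
made up for illustration; lists of maps stand in for whatever the three
systems actually return. In a real program each enriched map would be
copied into a SolrInputDocument and sent to Solr in batches.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustration only: in-memory rows stand in for the remote systems.
class EnrichSketch {

    // Small helper to build a row from alternating key/value arguments.
    static Map<String, Object> row(Object... kv) {
        Map<String, Object> m = new LinkedHashMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) {
            m.put(String.valueOf(kv[i]), kv[i + 1]);
        }
        return m;
    }

    // Pull one remote system's rows once and key them by listing id
    // ("listing_id" is an assumed column name).
    static Map<String, Map<String, Object>> buildCache(List<Map<String, Object>> remoteRows) {
        Map<String, Map<String, Object>> cache = new HashMap<>();
        for (Map<String, Object> r : remoteRows) {
            cache.put(String.valueOf(r.get("listing_id")), r);
        }
        return cache;
    }

    // Walk the main system's rows and merge in cached fields,
    // so no remote call is made per row.
    @SafeVarargs
    static List<Map<String, Object>> enrich(List<Map<String, Object>> listings,
                                            Map<String, Map<String, Object>>... caches) {
        List<Map<String, Object>> docs = new ArrayList<>();
        for (Map<String, Object> listing : listings) {
            Map<String, Object> doc = new LinkedHashMap<>(listing);
            String id = String.valueOf(listing.get("id"));
            for (Map<String, Map<String, Object>> cache : caches) {
                Map<String, Object> extra = cache.get(id);
                if (extra == null) {
                    continue; // this system has nothing for the listing
                }
                for (Map.Entry<String, Object> e : extra.entrySet()) {
                    if (!e.getKey().equals("listing_id")) {
                        doc.putIfAbsent(e.getKey(), e.getValue());
                    }
                }
            }
            docs.add(doc);
        }
        return docs;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> systemA = Arrays.asList(row("listing_id", 1, "price", 100));
        List<Map<String, Object>> systemB = Arrays.asList(row("listing_id", 1, "agent", "x"));
        List<Map<String, Object>> listings =
                Arrays.asList(row("id", 1, "title", "t1"), row("id", 2, "title", "t2"));

        for (Map<String, Object> doc : enrich(listings, buildCache(systemA), buildCache(systemB))) {
            System.out.println(doc); // here you would build and add a Solr document instead
        }
    }
}
```

The point is only that the two remote systems are read once up front, so
the per-listing work becomes a map lookup instead of a network round trip.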


Or export your information from your remote systems and import them into
a single system where you could do joins.

I believe DIH has some caching ability too that you might consider.

Your basic problem is an inefficient data model where you have to query
these different systems on a row-by-row basis; that's where I'd concentrate
my energies.

Best,
Erick


On Wed, Feb 12, 2014 at 2:09 AM, manju16832003 <manju16832...@gmail.com> wrote:

> Hi,
> I'm facing a dilemma of choosing the indexing strategies.
> My application architecture is
>  - I have a listing table in my DB
>  - For each listing, I make 3 calls to a URL data source of a different system
>
>  I have 200k records
>
>  Time taken to index 25 docs is 1 minute, so for 200k it might take more
> than 100 hrs :-(
>
>
>  I know there are a lot of factors to consider, from network to DB.
> I'm looking for different strategies we could use to perform indexing.
>
>  - Can we run multiple data import handlers? One data-config for the first
> 100k and a second one for the other 100k
>  - Would it be possible to write a Java service using SolrJ and make
> multi-threaded calls to Solr to index?
>  - The URL data sources I'm using actually reside in an MSSQL database of a
> different system. Could I speed up indexing if I used a JDBCDataSource
> that calls the DB directly instead of going through the API URL data
> source?
>
> Are there any other strategies we could use?
>
> Thank you,
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Indexing-strategies-tp4116852.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
