The point of the cached table is that we don't know where interesting
rows are. Loading from a DB is much faster when you grab the first N
rows, the next N rows, etc. So a strategy that switches back and
forth between looking up a requested ID and grabbing blocks of rows
could be very efficient.
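
To make the idea concrete, here is a rough sketch in Python (not DIH
code). Everything in it is an assumption for illustration: the class
name HybridRowCache, the use of sqlite3, and a table with a positive
integer primary key called "id". A request for an ID just ahead of what
has been loaded triggers a block read; a request far ahead falls back
to a single-row lookup.

```python
import sqlite3


class HybridRowCache:
    """Cache rows by id, mixing block reads with single-row lookups.

    Hypothetical helper, not part of DIH. Assumes the table has a
    positive integer primary key column named "id" as its first column.
    """

    def __init__(self, conn, table, block_size=1000):
        self.conn = conn
        self.table = table        # trusted table name, never user input
        self.block_size = block_size
        self.rows = {}            # id -> row tuple
        self.last_id = None       # highest id covered by block reads
        self.exhausted = False    # True once block reads reach the end
        self.queries = 0          # number of SQL statements issued

    def _load_next_block(self):
        """Grab the next block_size rows in id order."""
        block = self.conn.execute(
            f"SELECT * FROM {self.table} WHERE id > ? ORDER BY id LIMIT ?",
            (self.last_id or 0, self.block_size),
        ).fetchall()
        self.queries += 1
        for row in block:
            self.rows[row[0]] = row
        if block:
            self.last_id = block[-1][0]
        if len(block) < self.block_size:
            self.exhausted = True

    def _lookup(self, row_id):
        """Fetch one row by id and remember it if it exists."""
        row = self.conn.execute(
            f"SELECT * FROM {self.table} WHERE id = ?", (row_id,)
        ).fetchone()
        self.queries += 1
        if row is not None:
            self.rows[row_id] = row
        return row

    def get(self, row_id):
        """Return the row for row_id, or None if there is no such row."""
        if row_id in self.rows:
            return self.rows[row_id]
        covered = self.last_id or 0
        # Block reads have already seen every id up to `covered`.
        if self.exhausted or row_id <= covered:
            return None
        # Close ahead of the loaded range: one block read is enough,
        # because block_size distinct ids reach at least covered + block_size.
        if row_id - covered <= self.block_size:
            self._load_next_block()
            return self.rows.get(row_id)
        # Far ahead: a point lookup is cheaper than reading many blocks.
        return self._lookup(row_id)
```

The cutoff between the two modes (one block size here) is an arbitrary
choice; a real implementation would probably tune it against the
actual access pattern.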

On Wed, Apr 7, 2010 at 1:17 PM, bbarani <bbar...@gmail.com> wrote:
>
> Hi,
>
> I wanted to share a suggestion for overcoming OOM issues with
> CachedSqlEntityProcessor.
>
> Consider the following scenario, where we have sub-entities in DIH:
>
> <entity x query="select * from x"> ---> object
>     <entity y query="select * from y"
>             processor="cachedSqlEntityprocessor" cachekey=y.id cachevalue=x.id> --> object properties
>
> CachedSqlEntityProcessor works as follows:
>
> • First, entity x is executed and the entire table is stored in the cache.
> • Next, entity y is executed and its entire table is stored in the cache.
> • Finally, the comparison happens through a hash map.
>
> Instead, it could process the child entities in batches (say, 1000
> parent IDs per batch), so that it doesn't have to cache the entire
> child table in memory and only needs to fetch the child rows
> corresponding to each batch.
>
> Something like this...
>
> <entity x query="select * from x"> ---> object --> cache the complete data in parent
>     <entity y query="select * from y where uid in (pass 10000 id's
>                      from parent entity and fetch just those from database)"
>             processor="cachedSqlEntityprocessor" cachekey=y.id cachevalue=x.id> --> object properties
>
> As far as I understand, DIH currently processes the data row by row.
> If DIH processed the data in batches instead, it would help resolve
> the OOM issues.
>
> One drawback is that DIH will issue more SQL queries with this
> method, but it would be a hybrid approach that addresses both the
> memory and the performance issues.
>
> Please let me know your thoughts.
>
> Thanks,
> Barani
>
> --
> View this message in context: 
> http://n3.nabble.com/Suggestion-for-cachedSQLentityprocessor-tp704158p704158.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
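
For reference, the batching proposed in the quoted message could look
roughly like the Python sketch below. This is not how DIH works today,
only an illustration under assumptions: sqlite3 stands in for the real
database, and the columns x.id, x.name, y.id, y.uid and y.value are
made-up names (y.uid is taken to hold the parent ID, as in the "where
uid in (...)" query above). Only one batch of child rows is held in
memory at a time.

```python
import sqlite3
from itertools import islice


def batched(iterable, size):
    """Yield lists of up to `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def join_in_batches(conn, batch_size=500):
    """Yield (parent_row, child_rows) pairs, one parent batch at a time.

    Hypothetical schema: x(id, name) and y(id, uid, value), where y.uid
    refers to x.id. batch_size is kept below 999 because older SQLite
    builds limit the number of bind parameters per statement.
    """
    parents = conn.execute("SELECT id, name FROM x ORDER BY id")
    for batch in batched(parents, batch_size):
        ids = [parent[0] for parent in batch]
        placeholders = ",".join("?" * len(ids))
        # One extra query per batch instead of caching all of y.
        children = {}
        for child in conn.execute(
            f"SELECT id, uid, value FROM y WHERE uid IN ({placeholders}) ORDER BY id",
            ids,
        ):
            children.setdefault(child[1], []).append(child)
        for parent in batch:
            yield parent, children.get(parent[0], [])
```

The trade-off is the one described in the quoted message: the number of
child queries grows with the number of batches, while memory use is
bounded by the batch size rather than by the size of the child table.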



-- 
Lance Norskog
goks...@gmail.com
