The point of the cached table is that we don't know in advance where the interesting rows are. Loading from a DB is much faster when you grab the first N rows, then the next N rows, and so on. So a strategy that switches back and forth between looking up a requested ID and grabbing blocks of rows could be very efficient.
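
As a rough illustration of the block-at-a-time idea, here is a minimal sketch in Python (not DIH code, and not how CachedSqlEntityProcessor is implemented). It reads the parent rows in batches and, for each batch, fetches only the matching child rows with an `IN` query, so at most one batch of children is held in memory. The table layout (`x(id, name)`, `y(uid, prop)`), the helper names, and the in-memory SQLite database are all assumptions made for the example.

```python
import sqlite3
from itertools import islice


def batched(iterable, size):
    """Yield lists of up to `size` items from `iterable` (hypothetical helper)."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def join_in_batches(conn, batch_size=500):
    """Yield (parent_row, [child props]) while caching one batch of children.

    batch_size is kept below 999 because older SQLite builds limit the
    number of bound parameters in a single statement.
    """
    parents = conn.execute("SELECT id, name FROM x ORDER BY id")
    for batch in batched(parents, batch_size):
        ids = [row[0] for row in batch]
        placeholders = ",".join("?" * len(ids))
        # One query per batch of parents, instead of one per parent row
        # or one query that loads the whole child table.
        children = {}
        query = (
            f"SELECT uid, prop FROM y WHERE uid IN ({placeholders}) "
            "ORDER BY uid, prop"
        )
        for uid, prop in conn.execute(query, ids):
            children.setdefault(uid, []).append(prop)
        for pid, name in batch:
            yield (pid, name), children.get(pid, [])


def demo():
    """Build a tiny assumed schema and run the batched join over it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE x (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE y (uid INTEGER, prop TEXT);
        INSERT INTO x VALUES (1, 'a'), (2, 'b'), (3, 'c');
        INSERT INTO y VALUES (1, 'p1'), (1, 'p2'), (3, 'p3');
        """
    )
    return list(join_in_batches(conn, batch_size=2))
```

The trade-off is the one raised in the quoted message: more SQL round trips than a single full-table load, in exchange for memory use bounded by the batch size.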

On Wed, Apr 7, 2010 at 1:17 PM, bbarani <bbar...@gmail.com> wrote:
>
> Hi,
>
> I just thought of sharing a suggestion for overcoming OOM issues with
> CachedSqlEntityProcessor.
>
> Consider the scenario below, where we have sub-entities in DIH:
>
> <entity x query="select * from x"> ---> object
> <entity y query="select * from y"
> processor="cachedSqlEntityprocessor" cachekey=y.id cachevalue=x.id> -->
> object properties
>
> CachedSqlEntityProcessor works as follows:
>
> • First, entity x gets executed and the entire table gets stored in the
> cache.
> • Next, entity y gets executed and the entire table gets stored in the
> cache.
> • Finally, the comparison happens through a hash map.
>
> Instead of this, it could process the child entities in batches (say, for
> 1000 parent IDs in each batch), so that it doesn't have to cache the entire
> child table in memory and only needs to fetch the child entities
> corresponding to each batch.
>
> Something like this...
>
> <entity x query="select * from x"> ---> object --> cache the complete data
> in parent
> <entity y query="select * from y where uid in (pass 10000
> id's from parent entity and fetch just those from database)"
> processor="cachedSqlEntityprocessor" cachekey=y.id cachevalue=x.id> -->
> object properties
>
> As far as I know, DIH currently processes the data on a row-by-row basis.
> If we make DIH process the data in batches, it would help to resolve the
> OOM issues.
>
> One thing is that DIH will issue a larger number of SQL queries when we
> use this method, but it would be a kind of hybrid approach that addresses
> both the memory and the performance issues.
>
> Please let me know your thoughts.
>
> Thanks,
> Barani
>
> --
> View this message in context:
> http://n3.nabble.com/Suggestion-for-cachedSQLentityprocessor-tp704158p704158.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

--
Lance Norskog
goks...@gmail.com