I am starting to look at Solr's Data Import Handler framework and am quite
impressed with it so far. My question is about reducing the number of SQL
queries issued to the database; while looking into that I came across
CachedSqlEntityProcessor.

In the following example:
<entity name="x" query="select * from x">
    <entity name="y" query="select * from y where xid=${x.id}"
            processor="CachedSqlEntityProcessor">
    </entity>
</entity>

I like the concept of having multiple entity blocks for clarity, but why
wouldn't I (for DB efficiency) use a single entity whose SQL statement is
"select * from X,Y where x.id=y.xid" and have two fields pointing at X and Y
columns? My main question, though, is how CachedSqlEntityProcessor helps in
this case, since I want to keep the multiple entity blocks for cleanliness.
If I have 500,000 X records, how many SQL queries would get executed for the
second entity block (y)? 500,000?
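
To make the single-statement alternative mentioned above concrete, here is a
rough sketch of what I have in mind. The entity name "xy" and the
column/field names (x_col, y_col, x_field, y_field) are placeholders I made
up for illustration, not my real schema:

```xml
<!-- Sketch only: "xy", x_col, y_col, x_field and y_field are placeholder names -->
<entity name="xy" query="select * from X,Y where x.id=y.xid">
    <!-- one field mapped from a column of X -->
    <field column="x_col" name="x_field"/>
    <!-- one field mapped from a column of Y -->
    <field column="y_col" name="y_field"/>
</entity>
```

As far as I can tell this would issue a single SQL statement, at the cost of
losing the nested entity layout.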

If there is any more detailed information about the number of queries
executed in different circumstances, the memory overhead, or the way the data
is brought from the database into Java, it would be much appreciated, as this
is important for my application.

Thanks in advance!
Amit