Hi Mikhail,

Thanks for your help.  There are many improvements in Solr 5.x that we will 
take advantage of once we migrate.  For now we are on 4.6.

Thanks,

Carl  Buxbaum
Software Architect
TradeStone Software
17 Rogers St. Suite 2; Gloucester, MA 01930
P: 978-515-5128 F : 978-281-0673
www.tradestonesoftware.com



From: Mikhail Khludnev [via Lucene] 
[mailto:ml-node+s472066n421911...@n3.nabble.com]
Sent: Friday, July 24, 2015 4:00 PM
To: Carl Buxbaum <cbuxb...@tradestonesoftware.com>
Subject: Re: cache implementation?

On Fri, Jul 24, 2015 at 1:06 AM, Shawn Heisey <[hidden email]> wrote:

> On 7/23/2015 10:55 AM, cbuxbaum wrote:
> > Say we have 1000000 party records.  Then the child SQL will be run
> > 1000000 times (once for each party record).  Isn't there a way to just
> > run the child SQL on all of the party records at once with a join,
> > using a GROUP BY and ORDER BY on the PARTY_ID?  Then the results from
> > that query could easily be placed in SOLR according to the primary key
> > (party_id).  Is there some part of the Data Import Handler that
> > operates that way?
>
> Using well-crafted SQL JOIN is almost always going to be better for
> dataimport than nested entities.  The heavy lifting is done by the
> database server, using code that's extremely well-optimized for that
> kind of lifting.  Doing what you describe with a parent entity and one
> nested entity (that is not cached) will result in 1000001 total SQL
> queries.  A million SQL queries, no matter how fast each one is, will be
> slow.
>
> If you can do everything in a single SQL query with JOIN, then Solr will
> make exactly one SQL query to the server for a full-import.
>
> For my own dataimport, I use a view that was defined on the mysql server
> by the dbadmin.  The view does all the JOINs we require.
>
> Solr's dataimport handler doesn't have any intelligence to do the join
> locally.  It would be cool if it did, but somebody would have to write
> the code to teach it how.  Because the DB server itself can already do
> JOINs, and it can do them VERY well, there's really no reason to teach
> it to Solr.
>
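
To make the query counts above concrete, here is a rough Python sketch
against an in-memory SQLite database. It does not use DIH at all; the
party/address tables, their columns, and the row count of 1000 are
hypothetical stand-ins chosen only to show N+1 queries versus a single
JOIN.

```python
import sqlite3

def build_db(n_parties=1000):
    """Create a hypothetical parent/child schema in memory (assumed names)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE party (party_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE address (party_id INTEGER, city TEXT)")
    conn.executemany("INSERT INTO party VALUES (?, ?)",
                     [(i, f"party-{i}") for i in range(n_parties)])
    conn.executemany("INSERT INTO address VALUES (?, ?)",
                     [(i, f"city-{i}") for i in range(n_parties)])
    return conn

def nested_import(conn):
    """Uncached nested-entity style: one child query per parent row."""
    queries = 1  # the parent query
    docs = {}
    parents = conn.execute("SELECT party_id, name FROM party").fetchall()
    for party_id, name in parents:
        rows = conn.execute("SELECT city FROM address WHERE party_id = ?",
                            (party_id,)).fetchall()
        queries += 1  # one extra round trip for every parent
        docs[party_id] = (name, [city for (city,) in rows])
    return docs, queries

def joined_import(conn):
    """Single-query style: the database does the JOIN."""
    queries = 1
    docs = {}
    sql = ("SELECT p.party_id, p.name, a.city FROM party p "
           "LEFT JOIN address a ON a.party_id = p.party_id "
           "ORDER BY p.party_id")
    for party_id, name, city in conn.execute(sql):
        entry = docs.setdefault(party_id, (name, []))
        if city is not None:  # LEFT JOIN yields NULL for parents without children
            entry[1].append(city)
    return docs, queries
```

Both functions build the same documents; only the number of round trips
to the database differs (n + 1 versus 1).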

FWIW, DIH now has a join="zipper" attribute
<https://issues.apache.org/jira/browse/SOLR-4799> that can be specified
on a child entity; it enables the classic ETL external merge join
algorithm.
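
For readers who have not met merge join before, a minimal Python sketch
of the general idea follows. This is not DIH's code; it assumes both
inputs arrive already sorted by the join key (for example via ORDER BY),
and the function and variable names are hypothetical.

```python
def zipper_join(parents, children):
    """Merge-join two streams that are sorted ascending by key.

    parents:  iterable of (key, parent_row) with unique keys
    children: iterable of (key, child_row); duplicate keys allowed
    Yields (key, parent_row, [child_rows]) in a single pass over each input.
    """
    child_iter = iter(children)
    child = next(child_iter, None)
    for key, parent_row in parents:
        # Skip children whose key sorts before this parent (no matching parent).
        while child is not None and child[0] < key:
            child = next(child_iter, None)
        matched = []
        # Collect every child that shares the current parent's key.
        while child is not None and child[0] == key:
            matched.append(child[1])
            child = next(child_iter, None)
        yield key, parent_row, matched
```

Because each stream is read once and never rewound, this needs only two
sorted queries in total, regardless of how many parent rows there are.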


> Thanks,
> Shawn
>
>


--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>






--
View this message in context: 
http://lucene.472066.n3.nabble.com/cache-implemetation-tp4218825p4219275.html
Sent from the Solr - User mailing list archive at Nabble.com.
