All good ideas and recs, guys. Erick, I'd thought of much the same after
reading through the SolrJ post and beginning to get a bit anxious at the
idea of implementation (not a Java dev here, lol). We're already doing some
processing before the import, taking a few million records and rolling
them up...
Having read Erick's post more carefully, I see that is essentially what he
said, in a much more straightforward way.
I will also second Erick's suggestion of hammering on the SQL. We found
that fruitful many times at the same gig. I develop but am not a SQL
master. In a similar situation I'll u...
It may or may not be helpful, but there's a similar class of problem that
is frequently solved either by stored procedures or by running the query on
a time frame and storing the results... It doesn't matter whether the
endpoint for the data is Solr or somewhere else.
The problem is the long-running queries.
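A minimal sketch of that time-frame idea: run the expensive extraction one
window at a time and park the flattened rows in a staging table that DIH (or
anything else) can then read with a cheap SELECT. The JDBC URL, table, and
column names are hypothetical, and it assumes a MySQL driver on the classpath.

import java.sql.*;

// Sketch: run the heavy extraction one time window at a time and store
// the flattened rows in a staging table. Table and column names are
// hypothetical.
public class StageByWindow {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:mysql://localhost:3306/mydb", "user", "pass");
             PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO staging_docs (id, title, body) " +
                 "SELECT o.id, o.title, GROUP_CONCAT(d.text SEPARATOR ' ') " +
                 "FROM orders o JOIN order_details d ON d.order_id = o.id " +
                 "WHERE o.updated_at >= ? AND o.updated_at < ? " +
                 "GROUP BY o.id, o.title")) {
            long day   = 24L * 60 * 60 * 1000;
            long start = Timestamp.valueOf("2016-01-01 00:00:00").getTime();
            long end   = Timestamp.valueOf("2016-06-01 00:00:00").getTime();
            // One day per statement keeps each query short-lived instead of
            // one multi-hour monster that times out.
            for (long t = start; t < end; t += day) {
                ps.setTimestamp(1, new Timestamp(t));
                ps.setTimestamp(2, new Timestamp(Math.min(t + day, end)));
                ps.executeUpdate();
            }
        }
    }
}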
Forgot to add... sometimes really hammering at the SQL query in DIH
can be fruitful. Can you make a huge, monster query that's faster than
the sub-queries?
I've also seen people run processes on the DB that move all the
data into a temporary place, making use of all of the nifty stuff you
can do there...
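To make the "monster query" idea concrete: DIH's nested entities often fire
one child query per parent row (the classic N+1 pattern), while a single join
with a server-side rollup does the same work in one pass. A hedged sketch,
again with hypothetical table and column names and a MySQL source:

import java.sql.*;

// Sketch: one joined "monster" query replacing per-parent sub-queries,
// letting the database optimize the whole thing. Names are hypothetical.
public class MonsterQuery {
    public static void main(String[] args) throws Exception {
        String monster =
            "SELECT p.id, p.title, " +
            "       GROUP_CONCAT(c.text SEPARATOR ' ') AS body " +
            "FROM parent p LEFT JOIN child c ON c.parent_id = p.id " +
            "GROUP BY p.id, p.title";
        try (Connection conn = DriverManager.getConnection(
                 "jdbc:mysql://localhost:3306/mydb", "user", "pass");
             Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery(monster)) {
            while (rs.next()) {
                System.out.println(rs.getString("id") + " | "
                        + rs.getString("title"));
            }
        }
    }
}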
Oh, gotcha. Cool, I'll make sure to check it out and bounce any related
questions through here.
Thanks!
best,
--
*John Blythe*
Product Manager & Lead Developer
251.605.3071 | j...@curvolabs.com
www.curvolabs.com
58 Adams Ave
Evansville, IN 47713
On Thu, May 26, 2016 at 1:45 PM, Erick Erickson wrote:
Solr commits aren't the issue, I'd guess. All the time is
probably being spent getting the data from MySQL.
I've had some luck writing to Solr from a DB through a
SolrJ program; here's a place to get started:
searchhub.org/2012/02/14/indexing-with-solrj/
You can peel out the Tika bits pretty easily.
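For reference, a minimal sketch of the JDBC-to-SolrJ pattern that post walks
through, with the Tika bits peeled out. It assumes a 6.x-style SolrJ
HttpSolrClient; the core name, query, and field names are placeholders.

import java.sql.*;
import java.util.ArrayList;
import java.util.List;
import org.apache.solr.client.solrj.impl.HttpSolrClient;
import org.apache.solr.common.SolrInputDocument;

// Sketch: stream rows out of MySQL via JDBC and index them in batches
// with SolrJ. Core name, query, and field names are placeholders.
public class JdbcIndexer {
    public static void main(String[] args) throws Exception {
        try (HttpSolrClient solr = new HttpSolrClient.Builder(
                     "http://localhost:8983/solr/mycore").build();
             Connection conn = DriverManager.getConnection(
                     "jdbc:mysql://localhost:3306/mydb", "user", "pass");
             Statement st = conn.createStatement(
                     ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY)) {
            st.setFetchSize(Integer.MIN_VALUE); // MySQL hint: stream rows
            try (ResultSet rs = st.executeQuery(
                     "SELECT id, title, body FROM staging_docs")) {
                List<SolrInputDocument> batch = new ArrayList<>();
                while (rs.next()) {
                    SolrInputDocument doc = new SolrInputDocument();
                    doc.addField("id", rs.getString("id"));
                    doc.addField("title", rs.getString("title"));
                    doc.addField("body", rs.getString("body"));
                    batch.add(doc);
                    if (batch.size() >= 1000) { // send in chunks, not per doc
                        solr.add(batch);
                        batch.clear();
                    }
                }
                if (!batch.isEmpty()) solr.add(batch);
                solr.commit(); // one commit at the end, not per batch
            }
        }
    }
}

Batching the adds and committing once at the end matches the point above:
commits aren't the bottleneck, the DB read usually is.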