Thank you, Erick, for your answer. I read your post and found it very
interesting.
Unfortunately, it is not suitable for my use case:
* security is not an issue, since the databases will be fully replicated
in the same infrastructure.
* the data set is not huge (something like 300K HTML documents).
* if I chose a client-side approach, I'd have to write it twice (the Solr
index is a merge of 2 databases).
* I'd like to pull the data from within Solr unless it is absolutely
impossible (that was the reason I chose Solr over Lucene).
* last but not least, ATM my real issue is to find a reusable
solution for indexing hierarchical data (unless one already exists).
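
To make the transformer idea from my original message (quoted below) a bit
more concrete, here is a minimal Python sketch of the path-building step
only. It is not a DIH transformer (a real one would typically be a Java
class), and the rows, column names (id, parent_id, name) and function names
are all assumptions made up for illustration. It loads the adjacency list
into memory once and then walks from a node up to the root, returning the
two full paths (numeric ids and names).

```python
# Hypothetical rows, as if fetched by "select id, parent_id, name from categories".
# The table layout and the sample values are assumptions for illustration only.
ROWS = [
    (1, None, "Books"),
    (2, 1, "Science"),
    (3, 2, "Physics"),
    (4, None, "Music"),
]


def build_index(rows):
    """Map each id to (parent_id, name) so ancestors can be looked up in memory."""
    return {row_id: (parent_id, name) for row_id, parent_id, name in rows}


def full_paths(index, node_id):
    """Walk from node_id up to the root and return (ids, names), root first."""
    ids, names = [], []
    seen = set()
    current = node_id
    while current is not None:
        # Guard against corrupt data where a node is its own ancestor.
        if current in seen:
            raise ValueError(f"cycle detected at id {current}")
        seen.add(current)
        parent_id, name = index[current]
        ids.append(current)
        names.append(name)
        current = parent_id
    # The walk collected the path leaf first; reverse it to get root first.
    ids.reverse()
    names.reverse()
    return ids, names


if __name__ == "__main__":
    index = build_index(ROWS)  # built once, so it can act as the optional cache
    print(full_paths(index, 3))
```

Building the index once and reusing it for every document corresponds to
the "optionally caches" step; the two returned lists would map to the two
multi-valued fields.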


Twitter     :http://www.twitter.com/m_cucchiara
G+          :https://plus.google.com/107903711540963855921
Linkedin    :http://www.linkedin.com/in/mauriziocucchiara
VisualizeMe: http://vizualize.me/maurizio.cucchiara?r=maurizio.cucchiara

Maurizio Cucchiara


On 3 October 2012 14:06, Erick Erickson <erickerick...@gmail.com> wrote:
> Maurizio:
>
> DIH is great for its intended purpose, but when things get complex I generally
> prefer writing something in SolrJ; it gives much finer-grained control
> over "special circumstances". Plus, you can see everything that
> happens. Here's a blog post with a skeletal SolrJ program; you can just
> pull out all the local-Tika stuff.
>
> http://searchhub.org/dev/2012/02/14/indexing-with-solrj/
>
> The take-away IMO is that once you've spent some time working with
> DIH without getting what you need, something like using an independent
> client (SolrJ in this example) is worth considering.
>
> Best
> Erick
>
> On Tue, Oct 2, 2012 at 12:59 PM, Maurizio Cucchiara
> <mcucchi...@apache.org> wrote:
>> Hi all,
>> I'm trying to import some hierarchical data (stored in MySQL) into Solr,
>> using DataImportHandler.
>> Unfortunately, as most of you already know, MySQL has no support for
>> recursive queries, so there is no way to fetch a full hierarchy stored
>> as an adjacency list with a single query.
>> So I considered writing a custom DIH transformer which, given a
>> specified SQL query (like select * from categories) and a value (e.g.
>> category_id):
>> * fetches all the data
>> * builds a hierarchical representation of the fetched data
>> * optionally caches the hierarchical data structure
>> * then returns 2 multi-valued lists which contain the 2 full paths (as
>> String and as Number)
>>
>> Is there something out of the box?
>> Alternatively, does the above approach sound good?
>>
>> TIA
>>
>> Maurizio Cucchiara
