Re: Data Import Request Handler isolated into its own project - any suggestions?

Marek Ščevlík Fri, 25 Nov 2016 10:31:41 -0800

Hi Daniel. Thanks for the reply. I wonder whether it is still possible, with
the release of Solr 6.3, to get hold of the running instance of the Jetty
server that is part of the solution. I found some code for previous versions
where it was captured like this, and one could then obtain the cores of a
running Solr instance ...


SolrDispatchFilter solrDispatchFilter =
    (SolrDispatchFilter) jetty.getDispatchFilter().getFilter();


I was trying to implement it this way, but that is not working out very well
now. I can't seem to get the Jetty server object for the running instance. I
tried several combinations, but none seemed to work.
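
A possible fallback that avoids the Jetty object altogether would be to ask the
running instance for its cores over HTTP through the CoreAdmin STATUS action.
The following is only a minimal sketch under stated assumptions: the class name
CoreStatusClient and its two methods are hypothetical, the base URL
http://localhost:8983/solr is just the usual default and may differ, and the
response is returned as raw JSON without any parsing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of Solr): lists cores via the CoreAdmin HTTP API.
public class CoreStatusClient {

    // Builds the CoreAdmin STATUS URL for a Solr base URL,
    // tolerating a trailing slash on the base.
    public static String statusUrl(String solrBaseUrl) {
        String base = solrBaseUrl.endsWith("/")
                ? solrBaseUrl.substring(0, solrBaseUrl.length() - 1)
                : solrBaseUrl;
        return base + "/admin/cores?action=STATUS&wt=json";
    }

    // Fetches the raw JSON status document; parsing is left to the caller.
    public static String fetchStatus(String solrBaseUrl) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(statusUrl(solrBaseUrl)).openConnection();
        conn.setRequestMethod("GET");
        conn.setConnectTimeout(5000);
        conn.setReadTimeout(5000);
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            StringBuilder body = new StringBuilder();
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
            return body.toString();
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            // No base URL given: only show what would be queried (assumed default).
            System.out.println("Would query: " + statusUrl("http://localhost:8983/solr"));
            return;
        }
        System.out.println(fetchStatus(args[0]));
    }
}
```

This only reports which cores exist and their status; it does not give in-process
access to the core objects the way the SolrDispatchFilter approach did.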

Can you perhaps point me in the right direction?

Perhaps you may know more than I do at the moment.


Any help would be great.


Thanks a lot
Regards Marek Scevlik



2016-11-18 15:53 GMT+01:00 Davis, Daniel (NIH/NLM) [C] <daniel.da...@nih.gov>:

> Marek,
>
> I've wanted to do something like this in the past as well.  However, a
> rewrite that supports the same XML syntax might be better.   There are
> several problems with the design of the Data Import Handler that make it
> not quite suitable:
>
> - Not designed for Multi-threading
> - Bad implementation of XPath
>
> Another issue is that one of the big advantages of Data Import Handler
> goes away at this point, which is that it is hosted within Solr, and has a
> UI for testing within the Solr admin.
>
> A better open-source Java solution might be to connect Solr with Apache
> Camel - http://camel.apache.org/solr.html.
>
> If you are not tied absolutely to pure open-source, and freemium products
> will do, then you might look at Pentaho Spoon and Kettle.   Although Talend
> is much more established in the market, I find Pentaho's XML-based ETL a
> bit easier to integrate as a developer, and unit test and such.   Talend
> does better when you have a full infrastructure set up, but then the
> attention required to unit tests and Git integration seems over the top.
>
> Another powerful way to get things done, depending on what you are
> indexing, is to use LogStash and couple that with Document processing
> chains.   Many of our projects benefit from having a single RDBMS view,
> perhaps a materialized view, that is used for the index.   LogStash does
> just fine here, pulling from the RDBMS and posting each row to Solr.  The
> hierarchical execution of Data Import Handler is very nice, but this can
> often be handled on the RDBMS side by creating a view, maybe using
> functions to provide some rows.   Many RDBMS systems also support
> federation and the import of XML from files, so that this brings XML
> processing into the picture.
>
> Hoping this helps,
>
> Dan Davis, Systems/Applications Architect (Contractor),
> Office of Computer and Communications Systems,
> National Library of Medicine, NIH
>
>
>
>
> -----Original Message-----
> From: Marek Ščevlík [mailto:mscev...@codenameprojects.com]
> Sent: Friday, November 18, 2016 9:29 AM
> To: solr-user@lucene.apache.org
> Subject: Data Import Request Handler isolated into its own project - any
> suggestions?
>
> Hello. My name is Marek Scevlik.
>
>
>
> Currently I am working for a small company where we are interested in
> implementing your Solr 6.3 search engine.
>
>
>
> We are hoping to extract the Data Import Request Handler from the original
> source package into its own project and build a usable .jar file from it.
>
>
>
> It should then serve as a tool that would allow us to connect to a remote
> server and return data to our other application, which would use the
> returned data.
>
>
>
> What do you think? Would anything like this be possible? To isolate the
> Data Import Request Handler into its own standalone project?
>
>
>
> If we could achieve this, we would be happy to share this new feature with
> the community.
>
>
>
> I realize this is a first email and may lead to several hundred more, so for
> a start my request is very simple and not very detailed, but I am sure you
> realize it may turn out to be quite complex.
>
>
>
> So I wonder whether anyone will reply.
>
>
>
> Thanks a lot for any replies and further info or guidance.
>
>
>
>
>
> Thanks.
>
> Regards Marek Scevlik
>
