Is there also a way to put some kind of annotation on a schema field and
send the data retrieved for that field to an external application? We have
a requirement where some of the fields defined for an entity in
data-config.xml must also serve as entities for entity extraction and
autocomplete, and both of those are handled by an external application.
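
To make the question concrete, here is a rough sketch of the behaviour I have in mind. It is plain Java with no Solr dependency, and everything in it is an assumption for illustration: the attribute name external="true", the class name MarkedFieldForwarder and the sink callback are made up. As far as I understand, DIH lets a custom Transformer read the attributes of the field definitions in data-config.xml, so logic like this could probably live in transformRow, but I have not tried it.

```java
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;

// Sketch only: plain Java with no Solr dependency. In a real DIH setup this
// logic would presumably sit inside a custom Transformer.
final class MarkedFieldForwarder {

    // Hypothetical attribute name; it is not something DIH defines.
    static final String MARKER_ATTRIBUTE = "external";

    private MarkedFieldForwarder() {}

    /**
     * Passes every row value whose field definition carries external="true"
     * to the sink, and returns the row unchanged so indexing proceeds as usual.
     */
    static Map<String, Object> forward(Map<String, Object> row,
                                       List<Map<String, String>> fieldDefinitions,
                                       BiConsumer<String, Object> sink) {
        for (Map<String, String> field : fieldDefinitions) {
            if (!"true".equals(field.get(MARKER_ATTRIBUTE))) {
                continue; // field is not marked for the external application
            }
            String column = field.get("column");
            Object value = (column == null) ? null : row.get(column);
            if (value != null) {
                sink.accept(column, value);
            }
        }
        return row;
    }
}
```

The sink would be whatever client talks to the external application; it is kept as a callback here so the row handling stays independent of that client.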


Noble Paul നോബിള്‍  नोब्ळ् wrote:
> 
> Writing to a remote Solr through SolrJ is in the cards. I may even
> take it up after the 1.4 release. For now, your best bet is to subclass
> SolrWriter and override the corresponding methods for add/delete.
> 
>>> 2009/4/27 Amit Nithian <anith...@gmail.com>:
>>> > All,
>>> > I have a few questions regarding the data import handler. We have some
>>> > pretty gnarly SQL queries to load our indices and our current loader
>>> > implementation is extremely fragile. I am looking to migrate over to
>>> > the DIH; however, I am looking to use SolrJ + EmbeddedSolr + some
>>> > custom stuff to remotely load the indices so that my index loader and
>>> > main search engine are separated.
>>> >
>>> > Currently, unless I am missing something, the data gathering from the
>>> > entity and the data processing (i.e. conversion to a Solr Document) are
>>> > done sequentially, and I was looking to make this execute in parallel
>>> > so that I can have multiple threads processing different parts of the
>>> > result set and loading documents into Solr. Secondly, I need to create
>>> > temporary tables to store the results of a few queries and use them
>>> > later for inner joins, and I was wondering how best to go about this.
>>> >
>>> > I am thinking of adding support in DIH for the following:
>>> > 1) Temporary tables (maybe call them temporary entities)? Specific to
>>> > SQL only, though, unless it can be generalized to other sources.
>>> > 2) Parallel support
>>> >  - Including some mechanism to get the number of records (whether it
>>> > be a count or MAX(custom_id)-MIN(custom_id))
>>> > 3) Support in DIH or Solr to post documents to a remote index (i.e.
>>> > create a new UpdateHandler instead of DirectUpdateHandler2).
>>> >
>>> > If any of these exist or anyone else is working on this (OR you have
>>> > better suggestions), please let me know.
>>> >
>>> > Thanks!
>>> > Amit
>>> >
> 
> 
> 
> -- 
> --Noble Paul
> 
> 
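
On point 2 of the quoted proposal, the MIN(custom_id)/MAX(custom_id) idea could be sketched roughly as follows. This is only an illustration under my own assumptions: the class name IdRangePartitioner is made up, ids are assumed to be numeric and to fit in a long without overflow, and nothing here is part of DIH.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: splits an id range so that each worker thread can load its
// own slice of the result set.
final class IdRangePartitioner {

    private IdRangePartitioner() {}

    /**
     * Splits the inclusive range [minId, maxId] into at most `parts`
     * contiguous, non-overlapping inclusive ranges of near-equal size.
     * Each element of the result is {start, end}.
     */
    static List<long[]> split(long minId, long maxId, int parts) {
        if (parts < 1) {
            throw new IllegalArgumentException("parts must be >= 1");
        }
        List<long[]> ranges = new ArrayList<>();
        if (maxId < minId) {
            return ranges; // empty table: nothing to load
        }
        long total = maxId - minId + 1;
        long base = total / parts;
        long remainder = total % parts;
        long start = minId;
        for (int i = 0; i < parts; i++) {
            // The first `remainder` ranges take one extra id each.
            long size = base + (i < remainder ? 1 : 0);
            if (size == 0) {
                break; // more parts requested than ids available
            }
            long end = start + size - 1;
            ranges.add(new long[] {start, end});
            start = end + 1;
        }
        return ranges;
    }
}
```

Each {start, end} pair could then be bound into a query of the form WHERE custom_id BETWEEN ? AND ? and handed to its own thread. If the ids have large gaps the slices will hold uneven numbers of rows, which is one reason a real row count may be the better basis.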

-- 
View this message in context: 
http://old.nabble.com/DataImportHandler-Questions-Load-data-in-parallel-and-temp-tables-tp23266396p26371403.html
Sent from the Solr - User mailing list archive at Nabble.com.
