Re: DIH: Create Child Documents in ScriptTransformer

Jörn Franke Thu, 19 Sep 2019 14:26:49 -0700

Hi,

thanks for all the feedback.
The context parameter in the ScriptTransformer is new to me - thanks for
this insight. I could not find it in any docs. So, just for people who also
did not know about it:
a ScriptTransformer function can take 2 parameters, e.g.
function mytransformer(row, context) {
    ....
}


The following Javadoc gives some hints on what you can do with the context:
https://lucene.apache.org/solr/8_2_0/solr-dataimporthandler/org/apache/solr/handler/dataimport/Context.html
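To make the two-parameter form a bit more concrete, here is a minimal sketch of what reading from the context could look like. It only uses getEntityAttribute() and isRootEntity() from the Context Javadoc above; the field names "source_entity_s" and "is_root_b" are made up for illustration, and the stand-in objects at the bottom are my own assumption so the function can be tried outside Solr (drop them when using the function in DIH).

```javascript
// Sketch of a two-argument DIH ScriptTransformer function.
// Inside DIH, `row` is a java.util.Map and `context` is an
// org.apache.solr.handler.dataimport.Context, both supplied by Solr.
function mytransformer(row, context) {
  // Attribute "name" of the <entity> currently being processed.
  var entityName = context.getEntityAttribute("name");
  // True when this entity produces top-level documents.
  var isRoot = context.isRootEntity();

  // Hypothetical field names, chosen only for this illustration.
  row.put("source_entity_s", entityName);
  row.put("is_root_b", isRoot);
  return row;
}

// --- Dry run outside Solr (assumption: plain JavaScript stand-ins) ---
// Mimics the put/get part of java.util.Map.
function makeRow() {
  var data = {};
  return {
    put: function (key, value) { data[key] = value; },
    get: function (key) { return data[key]; }
  };
}

// Mimics the two Context methods used above; "book" is a made-up entity name.
var demoContext = {
  getEntityAttribute: function (name) { return name === "name" ? "book" : null; },
  isRootEntity: function () { return true; }
};

var demoResult = mytransformer(makeRow(), demoContext);
```

This only reads from the context and writes ordinary field values into the row, so it does not change the conclusion about child documents below.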

Despite all this, I came to the conclusion that adding child docs in a
ScriptTransformer in DIH is not supported.

One can, though, use a StatelessScriptUpdateProcessorFactory, see
https://lucene.apache.org/solr/8_2_0//solr-core/org/apache/solr/update/processor/StatelessScriptUpdateProcessorFactory.html

and

https://cwiki.apache.org/confluence/display/solr/ScriptUpdateProcessor#ScriptUpdateProcessor-JavaScript

Hint on how to add child documents to a SolrInputDocument:
http://lucene.apache.org/solr/8_2_0/solr-solrj/index.html?org/apache/solr/common/SolrInputDocument.html
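As a rough sketch of the update processor route, a script along the following lines might work. Several things here are assumptions on my side and not taken from the docs above: the field names "content_txt" and "chapter_txt", the splitIntoChapters() placeholder (it just splits on blank lines; the real chapter logic goes there), the child id scheme, and the use of Java.type(), which assumes the Nashorn script engine. The plain JavaScript fallback in newSolrDoc() exists only so the script can be tried outside Solr.

```javascript
// Sketch of a script for StatelessScriptUpdateProcessorFactory.
// Assumption: Nashorn provides the `Java` global inside Solr.

// Creates an empty document: a real SolrInputDocument inside Solr,
// or a small stand-in object when run outside Solr.
function newSolrDoc() {
  if (typeof Java !== "undefined") {
    var SolrInputDocument = Java.type("org.apache.solr.common.SolrInputDocument");
    return new SolrInputDocument();
  }
  var fields = {};
  var children = [];
  return {
    setField: function (name, value) { fields[name] = value; },
    getFieldValue: function (name) { return name in fields ? fields[name] : null; },
    addChildDocument: function (child) { children.push(child); },
    children: children // stand-in only, not part of SolrInputDocument
  };
}

// Placeholder for the real chapter logic: splits on blank lines.
function splitIntoChapters(text) {
  return text.split(/\n{2,}/).filter(function (part) {
    return part.length > 0;
  });
}

// Called by Solr for every document that is added.
function processAdd(cmd) {
  var doc = cmd.solrDoc; // org.apache.solr.common.SolrInputDocument
  var content = doc.getFieldValue("content_txt"); // hypothetical field
  if (content === null || content === undefined) {
    return; // nothing to split
  }
  var parentId = String(doc.getFieldValue("id"));
  var chapters = splitIntoChapters(String(content));
  for (var i = 0; i < chapters.length; i++) {
    var child = newSolrDoc();
    child.setField("id", parentId + "_chapter_" + (i + 1)); // hypothetical id scheme
    child.setField("chapter_txt", chapters[i]);             // hypothetical field
    doc.addChildDocument(child);
  }
}

// Remaining hooks of the update processor script, left empty here.
function processDelete(cmd) {}
function processMergeIndexes(cmd) {}
function processCommit(cmd) {}
function processRollback(cmd) {}
function finish() {}
```

The script would still have to be registered in an update processor chain in solrconfig.xml, and that chain selected for the DIH request; I have not shown that part here.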


Nevertheless, I agree that one should use an external tool, although
depending on the needs this can also mean some complexity (e.g. supporting
individual transformations per collection through configuration/plugins
instead of code). While this is not a problem, it might be good
to start an open source loader that goes beyond the post tool (
https://lucene.apache.org/solr/guide/8_1/post-tool.html).

best regards

On Thu, Sep 19, 2019 at 8:54 AM Mikhail Khludnev <m...@apache.org> wrote:

> Hello, Jörn.
> Have you tried to find a parent doc in the context which is passed as a
> second argument into ScriptTransformer?
>
> On Wed, Sep 18, 2019 at 9:56 PM Jörn Franke <jornfra...@gmail.com> wrote:
> >
> > Hi,
> >
> > I load a set of documents. Based on these documents some logic needs to be
> > applied to split them into chapters (this is done). One whole document is
> > loaded as a parent. Chapters of the whole document + metadata should be
> > loaded as child documents of this parent.
> > I want to now collect information on how this can be done:
> > * Use a custom loader - this is possible and works
> > * Use DIH and extract the chapters in a ScriptTransformer and add them as
> > child documents there. However, the ScriptTransformer receives as input
> > only a HashMap, and while it works to transform field values etc., it does
> > not seem possible to add child documents within the DIH ScriptTransformer. I
> > tried adding a JavaArray with SolrInputDocuments, but this does not seem to
> > work. I see in debug/verbose mode that indeed the transformer adds them to
> > the HashMap correctly, but they don't end up in the document. Maybe here it
> > could be possible somehow via nested entities?
> > * Use DIH + an UpdateProcessor (Script): there I get the SolrInputDocument
> > as a parameter and it seems feasible to extract chapters and add them as
> > child documents.
> >
> > thank you.
> >
> > best regards
>
>
>
> --
> Sincerely yours
> Mikhail Khludnev
>
