Re: (Solr-UIMA) Doubt regarding integrating UIMA in to solr - Configuration.

Koji Sekiguchi Mon, 11 Jul 2011 07:35:43 -0700

Sowmya,

The combination of fieldNameFeature and dynamicField is useful with, e.g., a
named entity extractor that tends to produce many attributes: organization,
location, country, building, spot, title, ... With such an extractor you
don't want to define each field in schema.xml; you may want to use a dynamic
field *_sm (multiValued string type) instead, and have Solr map organization
to organization_sm, location to location_sm, and so on. You can do that by
specifying fieldNameFeature and dynamicField.


Here the value of the feature named by fieldNameFeature ("name" in the
example) is used as the field name, substituted into the dynamicField pattern.
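
As a rough sketch of how this might look in configuration: the type name
next.NamedEntity is taken from the rondhuit-uima sample discussed later in
this thread, the enclosing fieldMappings list is assumed to follow the same
lst/str style as the quoted snippets, and the schema.xml line with its
attribute values is an assumption about a typical multiValued string dynamic
field, not something quoted from the project.

```xml
<!-- solrconfig.xml, inside the UIMA update processor configuration (assumed location) -->
<lst name="fieldMappings">
  <lst name="type">
    <str name="name">next.NamedEntity</str>
    <lst name="mapping">
      <!-- the value of the "entity" feature is the text that gets indexed -->
      <str name="feature">entity</str>
      <!-- the value of the "name" feature selects the target field name -->
      <str name="fieldNameFeature">name</str>
      <!-- "*" is filled with that value, e.g. organization becomes organization_sm -->
      <str name="dynamicField">*_sm</str>
    </lst>
  </lst>
</lst>

<!-- schema.xml: assumed declaration of the matching dynamic field -->
<dynamicField name="*_sm" type="string" indexed="true" stored="true" multiValued="true"/>
```

With a setup like this, an annotation whose "name" is location and whose
"entity" is a place name would be indexed into location_sm without
location_sm being declared individually.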

koji
--
http://www.rondhuit.com/en/

(11/07/11 21:54), Sowmya V.B. wrote:

Hi Koji

Thanks a lot for the examples. Now, I was able to compile a JAR snapshot,
with my own UIMA pipeline. However, despite seeing the example
solrconfig.xml, I am not able to figure out how to add mine.

In the example:

   <str name="feature">entity</str>

             <str name="fieldNameFeature">name</str>

             <str name="dynamicField">*_sm</str>

I still don't understand what "fieldNameFeature" means in the case of dynamic
fields.

For example, if the annotator takes "text" field, and gives "fieldA, fieldB,
fieldC", how should I specify that inside this?

I was looking at the Solr pages, and on the SolrUIMA page
(http://wiki.apache.org/solr/SolrUIMA#Using_other_UIMA_components)
there is this example configuration for the fieldMapping specification:

<fieldMapping>
     <!-- here goes the mapping between features of UIMA FeatureStructures
          to Solr fields -->
     <type name="org.apache.uima.something.Annotation">
       <map feature="oneFeature" field="destination_field"/>
     </type>
     ...
</fieldMapping>


This is slightly different from the example that you used in the rondhuit
code samples.
So, does it mean I can also do something like:
<fieldMapping>
     <type name="org.apache.uima.annotators.tagger">
             <map feature="text" field="text"/>
     </type>
<!-- Because the annotator "tagger" does not create any new fields in the
index. It just modifies the text field -->

     <type name="org.apache.uima.annotators.stats">
             <map feature="FieldA" field="FieldX"/>
             <map feature="FieldB" field="FieldY"/>
             <map feature="FieldC" field="FieldZ"/>
     </type>
<!-- Where, Fields X,Y,Z are declared in Schema. Fields A, B, C were
obtained inside the "stats" annotator. -->

</fieldMapping>
That is, if I add the fields from the annotator within the pipeline, using
the addFsToIndexes() method?

Sowmya.

On Sat, Jul 9, 2011 at 12:51 AM, Koji Sekiguchi<k...@r.email.ne.jp>  wrote:

Now I've pasted sample solrconfig.xml to the project top page.
Can you visit and look at it again?


koji
--
http://www.rondhuit.com/en/

(11/07/09 2:29), Sowmya V.B. wrote:

Hi Koji

Thanks. I have checked out the code and begun looking at it. The code
examples gave me an idea of what to do, though I am not fully clear, since
there are no comments there to verify my understanding. Hence, I am mailing
again for clarification.

In NamedEntity.java, do you add two fields, "name" and "entity", to the index
via the processing pipeline "next"? That is, do the methods setName() and
setEntity() add the two fields "name" and "entity" to the index?

If so, how should I specify this in the <fieldMappings> section of
solrconfig.xml?

<lst name="type">
  <str name="name">next.NamedEntity</str>
  <lst name="mapping">
    <str name="feature">name</str>
    <!-- where namefield is the field I declared in schema.xml, say -->
    <str name="field">namefield</str>
  </lst>
</lst>
<lst name="type">
  <str name="name">next.NamedEntity</str>
  <lst name="mapping">
    <str name="feature">entity</str>
    <!-- where entityfield is the field I declared in schema.xml, say -->
    <str name="field">entityfield</str>
  </lst>
</lst>

Is this the right way to go? Can I declare 2 mappings which relate to the
same class (next.NamedEntity, in this case)?
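
For comparison, here is a sketch of the same two mappings grouped under a
single type entry. It assumes the UIMA update processor accepts several
mapping lists inside one type, which this thread does not confirm; namefield
and entityfield are the hypothetical schema.xml fields from the question
above.

```xml
<lst name="type">
  <str name="name">next.NamedEntity</str>
  <!-- first mapping: "name" feature goes to the hypothetical namefield -->
  <lst name="mapping">
    <str name="feature">name</str>
    <str name="field">namefield</str>
  </lst>
  <!-- second mapping: "entity" feature goes to the hypothetical entityfield -->
  <lst name="mapping">
    <str name="feature">entity</str>
    <str name="field">entityfield</str>
  </lst>
</lst>
```

Grouping by type keeps one entry per annotation class, so it does not matter
how the processor treats two type entries that share a name.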

I am sorry for the repeated mails, but it's a bit confusing because there is
no README file.
Thank you once again!

Sowmya.

On Fri, Jul 8, 2011 at 4:07 PM, Koji Sekiguchi<k...@r.email.ne.jp> wrote:

(11/07/08 16:19), Sowmya V.B. wrote:

Hi Koji

Thanks for the mail.

Thanks for all the clarifications. I am now using version 3.3. But I have
another question about this:
How can I add an annotator that I wrote myself to Solr-UIMA?

Here is what I did before I moved to Solr:
I wrote an annotator (which worked when I used a plain vanilla Lucene-based
indexer) that enriched the document with more fields (some statistics about
the document; all the added fields were numeric). Those fields were added to
the index by extending the *JCasAnnotator_ImplBase* class.

But in Solr-UIMA, I am not exactly clear on where the above setup fits in.
I thought I would get an idea by looking at the annotators that came with the
UIMA integration of Solr, but their source was not available. So I do not
understand how to actually integrate my own annotator into UIMA.

Hi Sowmya,

Please look at the example UIMA annotators that can be deployed in a
Solr-UIMA environment:

http://code.google.com/p/rondhuit-uima/



It comes with source code.


koji
--
http://www.rondhuit.com/en/
