Re: Extending Solr Highlighter to pull information from external source

Jamie Johnson Fri, 15 Jul 2011 14:09:00 -0700

I tried the patch at SOLR-1397 but it didn't work as I'd expect.

<lst name="highlighting">
    <lst name="1">
        <arr name="subject_phonetic">
            <str><em>Test</em> subject message</str>
        </arr>
        <arr name="subject_phonetic_startPos"><int>0</int></arr>
        <arr name="subject_phonetic_endPos"><int>29</int></arr>
    </lst>
</lst>
The start position is right, but the end position seems to be the
length of the field.
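
As a sanity check on those numbers, here is a small standalone Java sketch. It is not part of the SOLR-1397 patch; the class and method names are made up for illustration, and it assumes the raw field value is just the fragment with the <em> tags removed. Under that assumption the raw text is 20 characters long, while the highlighted string including the tags is 29 characters, so the reported end position may actually be the length of the tagged fragment. The offsets I would have expected for the match itself are 0 and 4.

```java
class OffsetCheck {
    /**
     * Returns {start, end} of the first pre...post match, expressed as
     * offsets into the untagged text, or null if no complete match exists.
     * Only the first match is handled, to keep the sketch short.
     */
    static int[] matchOffsets(String highlighted, String pre, String post) {
        int start = highlighted.indexOf(pre);
        if (start < 0) {
            return null;
        }
        int close = highlighted.indexOf(post, start + pre.length());
        if (close < 0) {
            return null;
        }
        // Nothing precedes the first opening tag except raw text, so the
        // raw start equals the tag position; the raw end is the position
        // of the closing tag minus the length of the opening tag.
        return new int[] { start, close - pre.length() };
    }

    public static void main(String[] args) {
        String highlighted = "<em>Test</em> subject message";
        String raw = highlighted.replace("<em>", "").replace("</em>", "");
        int[] m = matchOffsets(highlighted, "<em>", "</em>");
        System.out.println("raw length: " + raw.length());
        System.out.println("tagged length: " + highlighted.length());
        System.out.println("match: " + m[0] + "-" + m[1]);
    }
}
```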



On Fri, Jul 15, 2011 at 4:25 PM, Jamie Johnson <jej2...@gmail.com> wrote:
> I added the highlighting code I am using to this JIRA
> (https://issues.apache.org/jira/browse/SOLR-1397).  Afterwards I
> noticed this JIRA (https://issues.apache.org/jira/browse/SOLR-1954)
> which talks about another solution.  I think David's patch would have
> worked equally well for my problem, just would require later doing the
> highlighting on the client's end.  I'll have to give this a whirl over
> the weekend.
>
> On Fri, Jul 15, 2011 at 3:55 PM, Jamie Johnson <jej2...@gmail.com> wrote:
>> Boy it's been a long time since I first wrote this, sorry for the delay....
>>
>> I think I have this working as I expect with a test implementation.  I
>> created the following interface
>>
>> public interface SolrExternalFieldProvider extends NamedListInitializedPlugin {
>>     public String[] getFieldContent(String key, SchemaField field,
>>             SolrQueryRequest request);
>> }
>>
>> I then added to DefaultSolrHighlighter the following:
>>
>> in init()
>>
>> SolrExternalFieldProvider defaultProvider = solrCore.initPlugins(
>>         info.getChildren("externalFieldProvider"),
>>         externalFieldProviders, SolrExternalFieldProvider.class, null);
>> if (defaultProvider != null) {
>>     externalFieldProviders.put("", defaultProvider);
>>     externalFieldProviders.put(null, defaultProvider);
>> }
>>
>> Then in doHighlightByHighlighter I added the following:
>>
>> if (schemaField != null && !schemaField.stored()) {
>>     SolrExternalFieldProvider externalFieldProvider =
>>             this.getExternalFieldProvider(fieldName, params);
>>     if (externalFieldProvider != null) {
>>         SchemaField keyField = schema.getUniqueKeyField();
>>         // I know this field exists and is not multivalued
>>         String key = doc.getValues(keyField.getName())[0];
>>         if (key != null && key.length() > 0) {
>>             docTexts = externalFieldProvider.getFieldContent(key,
>>                     schemaField, req);
>>         }
>>     } else {
>>         docTexts = new String[]{};
>>     }
>> } else {
>>     docTexts = doc.getValues(fieldName);
>> }
>>
>>
>> This worked for me.  I needed to include the req because there are
>> some additional things that I need from it; I figure other folks
>> will probably need it as well.  I tried to follow the pattern used
>> for the other highlighter pieces, in that you can have different
>> externalFieldProviders for each field.  I'm more than happy to share
>> the actual classes with the community or add them to one of the JIRA
>> issues mentioned below; I haven't done so yet because I don't know
>> how to build patches.
>>
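
To make the per-field lookup and the stored/external decision above easier to follow, here is a minimal standalone sketch of the same idea. It deliberately does not use the Solr classes: ExternalFieldProvider and ProviderRegistry are hypothetical stand-ins (a plain String replaces SchemaField and the request is dropped), and lookup() is only my assumption of how a getExternalFieldProvider-style method might fall back to the default registered under "" and null. Unlike the snippet above, this version returns an empty array when the key is missing rather than leaving the texts unassigned.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the provider interface; not the Solr API. */
interface ExternalFieldProvider {
    String[] getFieldContent(String key, String fieldName);
}

class ProviderRegistry {
    // HashMap permits a null key, which is what makes put(null, ...) legal.
    private final Map<String, ExternalFieldProvider> providers = new HashMap<>();

    void register(String name, ExternalFieldProvider provider) {
        providers.put(name, provider);
    }

    /** Registers the default under both "" and null, as in init() above. */
    void setDefault(ExternalFieldProvider provider) {
        providers.put("", provider);
        providers.put(null, provider);
    }

    /** Looks up the provider configured for a field, else the default. */
    ExternalFieldProvider lookup(String configuredName) {
        ExternalFieldProvider provider = providers.get(configuredName);
        return provider != null ? provider : providers.get("");
    }

    /** Chooses the text to highlight: stored values, or external content. */
    static String[] resolveTexts(boolean stored, String[] storedValues,
            ExternalFieldProvider provider, String key, String fieldName) {
        if (stored) {
            return storedValues;
        }
        if (provider == null || key == null || key.isEmpty()) {
            return new String[0];
        }
        String[] texts = provider.getFieldContent(key, fieldName);
        return texts != null ? texts : new String[0];
    }
}
```

A field with no provider of its own falls through to the default, and a stored field never touches a provider at all.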
>> On Mon, Jun 20, 2011 at 11:47 PM, Michael Sokolov <soko...@ifactory.com> wrote:
>>> I found https://issues.apache.org/jira/browse/SOLR-1397 but there is not
>>> much going on there
>>>
>>> LUCENE-1522 <https://issues.apache.org/jira/browse/LUCENE-1522> has a lot of
>>> fascinating discussion on this topic though
>>>
>>>
>>>> There are a couple of long-lived issues in JIRA for this (I'd like to
>>>> search for them, but I can't access JIRA right now).
>>>>
>>>> FVH would need to be modified at the Lucene level to use external data.
>>>>
>>>> koji
>>>
>>> Koji - is that really so?  It appears to me that one could extend
>>> BaseFragmentsBuilder and override
>>>
>>> createFragments(IndexReader reader, int docId,
>>>      String fieldName, FieldFragList fieldFragList, int maxNumFragments,
>>>      String[] preTags, String[] postTags, Encoder encoder )
>>>
>>> providing a version that retrieves text from some external source rather
>>> than from Lucene fields.
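
The idea in that paragraph, taking match offsets that are already known and applying them to text fetched from somewhere other than a stored field, can be sketched in isolation. This is not the Lucene FragmentsBuilder API: ExternalFragmentSketch, tag() and highlight() are hypothetical names, the text source is just a Function from document key to text, and the offsets are assumed to be sorted, non-overlapping [start, end) character ranges.

```java
import java.util.function.Function;

class ExternalFragmentSketch {
    /**
     * Wraps each [start, end) range of the text in pre/post tags.
     * Ranges are assumed to be sorted and non-overlapping.
     */
    static String tag(String text, int[][] ranges, String pre, String post) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        for (int[] range : ranges) {
            out.append(text, pos, range[0]);           // text before the match
            out.append(pre)
               .append(text, range[0], range[1])       // the match itself
               .append(post);
            pos = range[1];
        }
        out.append(text, pos, text.length());          // trailing text
        return out.toString();
    }

    /** Fetches the text for a document from an external source, then tags it. */
    static String highlight(String docKey, Function<String, String> textSource,
            int[][] ranges) {
        String text = textSource.apply(docKey);
        return tag(text, ranges, "<em>", "</em>");
    }
}
```

The only thing that changes relative to highlighting a stored field is where the text comes from; the offsets are applied the same way.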
>>>
>>> It sounds to me like a really useful modification in Lucene core would be to
>>> retain match points that have already been computed during scoring so the
>>> highlighter doesn't have to attempt to reinvent all that logic!  This has
>>> all been discussed at length in LUCENE-1522 already, but is there any
>>> recent activity?
>>>
>>> My hope is that since (at least in my test) search code seems to spend 80%
>>> of its time highlighting, folks will take up this banner and do the plumbing
>>> needed to improve it - should lead to huge speed-ups for searching!  I'm
>>> continuing to read, but not really capable of making a meaningful
>>> contribution at this point.
>>>
>>> -Mike
>>>
>>
>
