Another option for determining whether to go to external storage would be to examine the SchemaField, see if it is stored, and if not, try to fetch from a file or whatever. That way you won't have to configure anything.
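
A minimal sketch of that check, in plain Java so it runs without Solr on the classpath. FieldInfo, ExternalStore, field() and textFor() are hypothetical stand-ins rather than Solr classes; FieldInfo.stored() plays the role of the stored flag on SchemaField. It only illustrates the decision: use the stored values when the field is stored, otherwise fetch by unique key from wherever the text really lives.

```java
import java.util.Map;

final class StoredOrExternal {

    /** Hypothetical stand-in for the part of a schema field used here. */
    interface FieldInfo {
        String getName();
        boolean stored();
    }

    /** Hypothetical external source (a file, a database, "or whatever"). */
    interface ExternalStore {
        String[] fetch(String key, String fieldName);
    }

    /** Convenience factory so callers do not need their own FieldInfo class. */
    static FieldInfo field(final String name, final boolean stored) {
        return new FieldInfo() {
            public String getName() { return name; }
            public boolean stored() { return stored; }
        };
    }

    /**
     * Returns the text to highlight: the stored values if the field is
     * stored in the index, otherwise whatever the external store returns
     * for this document key.
     */
    static String[] textFor(FieldInfo field,
                            Map<String, String[]> storedValues,
                            String key,
                            ExternalStore external) {
        if (field.stored()) {
            return storedValues.get(field.getName());
        }
        return external.fetch(key, field.getName());
    }
}
```

With this shape there is no list of field names to maintain: marking a field as not stored in the schema is what routes it to the external lookup.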

-Mike

On 06/20/2011 09:46 AM, Jamie Johnson wrote:
In my case, chucking the external storage is simply not an option. I'll definitely share anything I find. The following is a very simple example of adding text in the default Solr highlighter. (I had to copy a large portion of the class, along with some other classes, to get this to run, since the method that actually does the highlighting is private.) If you look at the source, it should hopefully make sense.


        String[] docTexts = null;

        if (fieldName.equals("title")) {
            SchemaField keyField = schema.getUniqueKeyField();
            // I know this field exists and is not multivalued
            String key = doc.getValues(keyField.getName())[0];
            // This would be loaded from the external store, but below
            // just appends some information
            docTexts = doc.getValues(fieldName);
            if (key != null && key.length() > 0) {
                for (int x = 0; x < docTexts.length; x++) {
                    docTexts[x] = docTexts[x] + " some added text";
                }
            }
        }

I have cheated, since I know the name of the field (title) that I am doing this for, but it would probably be useful to allow this to be set on the highlighter class through configuration in solrconfig (I'm not at all familiar with doing this and have spent no time looking into it). Once configured, the if(fieldName.equals("title")) line would be replaced with something like if(externalFields.contains(fieldName)){...}.
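
A rough sketch of what that externalFields check could look like, assuming the field names arrive as one comma-separated string. The class name ExternalFieldConfig and the method isExternal are made up for illustration, and how the string would actually be read from solrconfig is left out, since I have not looked into it.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

final class ExternalFieldConfig {

    private final Set<String> externalFields;

    /**
     * Parses a comma-separated list of field names, e.g. "title, body".
     * Whitespace around names is trimmed and empty entries are ignored;
     * a null string yields an empty set.
     */
    ExternalFieldConfig(String commaSeparated) {
        Set<String> names = new LinkedHashSet<String>();
        if (commaSeparated != null) {
            for (String name : commaSeparated.split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty()) {
                    names.add(trimmed);
                }
            }
        }
        this.externalFields = Collections.unmodifiableSet(names);
    }

    /** Replaces the hard-coded fieldName.equals("title") test. */
    boolean isExternal(String fieldName) {
        return externalFields.contains(fieldName);
    }
}
```

The highlighter would build one ExternalFieldConfig at startup and call isExternal(fieldName) where the snippet above compares against "title".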

Thoughts/comments?

On Mon, Jun 20, 2011 at 9:05 AM, Mike Sokolov <soko...@ifactory.com> wrote:

    I'd be very interested in this, as well, if you do it before me
    and are willing to share...

    A related question I have tried to ask on this list, and have
    never really gotten a good answer to, is whether it makes sense to
    just chuck the external storage and treat the lucene index as the
    primary storage for documents.  I have a feeling the answer is no;
    perhaps because of increased I/O costs for lucene and solr, but I
    don't really know.  I've been considering doing some
    experimentation, but would really love an expert opinion...

    -Mike


    On 06/20/2011 08:41 AM, Jamie Johnson wrote:

        I am trying to index data where I'm concerned that storing the
        contents of a specific field will be a bit of a hog so we are
        planning to retrieve this information as needed for highlighting
        from an external source.  I am looking to extend the default solr
        highlighting capability to work with information pulled from this
        external source and it looks like this is possible by extending
        DefaultSolrHighlighter (line 418 to pull a particular field from
        external source) for standard highlighting and
        BaseFragmentsBuilder (line 99) for FastVectorHighlighter.  I could
        just hard code this to say if the field name is a specific value
        look into the external source, is this the best way to accomplish
        this?  Are there any other extension points to do what I'm
        suggesting?


