Another option for determining whether to go to external storage would
be to examine the SchemaField, see if it is stored, and if not, try to
fetch from a file or whatever. That way you won't have to configure
anything.
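A minimal sketch of that idea, not a drop-in for the real highlighter. FieldInfo is a hypothetical stand-in for Solr's SchemaField (whose stored() method, as far as I know, reports whether the field is stored), and the external lookup is an assumed function from the document key to its text:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class StoredFieldFallback {

    /** Hypothetical stand-in for the part of Solr's SchemaField used here. */
    static final class FieldInfo {
        private final String name;
        private final boolean stored;

        FieldInfo(String name, boolean stored) {
            this.name = name;
            this.stored = stored;
        }

        String getName() { return name; }

        boolean stored() { return stored; }
    }

    /**
     * Returns the texts to highlight: the document's own values when the
     * field is stored, otherwise whatever the external lookup returns for
     * the document key. Nothing has to be configured per field.
     */
    static String[] textsFor(FieldInfo field, String key,
                             Map<String, String[]> storedValues,
                             Function<String, String[]> externalLookup) {
        if (field.stored()) {
            return storedValues.get(field.getName());
        }
        // Assumption: the external store is addressed by the unique key.
        return externalLookup.apply(key);
    }

    private StoredFieldFallback() { }
}
```

The trade-off is that any unstored field silently triggers an external fetch, so this only makes sense if every unstored field you highlight really does live in the external store.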
-Mike
On 06/20/2011 09:46 AM, Jamie Johnson wrote:
In my case chucking the external storage is simply not an option.
I'll definitely share anything I find. The following is a very simple
example of adding text to the default solr highlighter (I had to copy a
large portion of the class, along with some supporting classes, to get
this to run, since the method that actually does the highlighting is
private). If you look at the source it should hopefully make sense.
    String[] docTexts = null;
    if (fieldName.equals("title")) {
        SchemaField keyField = schema.getUniqueKeyField();
        // I know this field exists and is not multivalued
        String key = doc.getValues(keyField.getName())[0];
        // this would be loaded from external store, but below just
        // appends some information
        docTexts = doc.getValues(fieldName);
        if (key != null && key.length() > 0) {
            for (int x = 0; x < docTexts.length; x++) {
                docTexts[x] = docTexts[x] + " some added text";
            }
        }
    }
I have cheated since I know the name of the field (title) that I am
doing this for, but it would probably be useful to allow this to be
set on the highlighter class through configuration in solrconfig (I'm
not familiar at all with doing this and have spent 0 time looking into
it). Once configured the if(fieldName.equals("title")) line would be
replaced with something like
if(externalFields.contains(fieldName)){...} or something like that.
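A rough sketch of what that externalFields check could look like. How the value would actually be read from solrconfig is left out, and the comma-separated format and the ExternalFields class name are assumptions made up for illustration:

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

final class ExternalFields {
    private final Set<String> names;

    /**
     * Parses a comma-separated list such as "title, body".
     * Surrounding whitespace is trimmed and blank entries are ignored;
     * a null value means no field is treated as external.
     */
    ExternalFields(String commaSeparated) {
        Set<String> parsed = new LinkedHashSet<>();
        if (commaSeparated != null) {
            for (String part : commaSeparated.split(",")) {
                String name = part.trim();
                if (!name.isEmpty()) {
                    parsed.add(name);
                }
            }
        }
        this.names = Collections.unmodifiableSet(parsed);
    }

    /** True when the named field should be loaded from the external store. */
    boolean contains(String fieldName) {
        return names.contains(fieldName);
    }
}
```

With something like this built once when the highlighter is initialised, the hard-coded title check becomes if(externalFields.contains(fieldName)){...} and the rest of the snippet stays the same.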
Thoughts/comments?
On Mon, Jun 20, 2011 at 9:05 AM, Mike Sokolov <soko...@ifactory.com> wrote:
I'd be very interested in this, as well, if you do it before me
and are willing to share...
A related question I have tried to ask on this list, and have
never really gotten a good answer to, is whether it makes sense to
just chuck the external storage and treat the lucene index as the
primary storage for documents. I have a feeling the answer is no;
perhaps because of increased I/O costs for lucene and solr, but I
don't really know. I've been considering doing some
experimentation, but would really love an expert opinion...
-Mike
On 06/20/2011 08:41 AM, Jamie Johnson wrote:
I am trying to index data where I'm concerned that storing the
contents of a specific field will be a bit of a hog so we are
planning to retrieve this information as needed for highlighting
from an external source. I am looking to extend the default solr
highlighting capability to work with information pulled from this
external source and it looks like this is possible by extending
DefaultSolrHighlighter (line 418 to pull a particular field from
external source) for standard highlighting and BaseFragmentsBuilder
(line 99) for FastVectorHighlighter. I could just hard code this to
say if the field name is a specific value look into the external
source, is this the best way to accomplish this? Are there any
other extension points to do what I'm suggesting?