We’re currently using copyField directives in our schema to copy the same text to several fields that use different analysers. For example, if the original field in the document payload sent to the update handler is called “tika_output”, it is copied to “text”, “text_case_sensitive” and “text_accent_sensitive”.
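
For illustration, the setup described in this message would look roughly like the schema fragment below. The field names and the indexed/stored flags are the ones given in this message; the field type names (text_general, text_cs, text_as) are hypothetical placeholders invented for this sketch.

```xml
<!-- Source field: stored for retrieval, not indexed -->
<field name="tika_output" type="text_general" indexed="false" stored="true"/>

<!-- Search fields: indexed with different analysers, not stored.
     The type names here are placeholders, not real type names from the schema. -->
<field name="text" type="text_general" indexed="true" stored="false"/>
<field name="text_case_sensitive" type="text_cs" indexed="true" stored="false"/>
<field name="text_accent_sensitive" type="text_as" indexed="true" stored="false"/>

<!-- Copy the same source text into each search field -->
<copyField source="tika_output" dest="text"/>
<copyField source="tika_output" dest="text_case_sensitive"/>
<copyField source="tika_output" dest="text_accent_sensitive"/>
```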
To avoid inflating the size of the index, “tika_output” has indexed=false and stored=true, while “text” and friends have indexed=true and stored=false.

We’re using the unified highlighter. I’ve read the code in UnifiedHighlighter.java, which clearly shows that the field to be highlighted must be stored. As a result, a search on text_case_sensitive returns no highlighted results, because that field isn’t stored. Storing the field value redundantly would mean tripling my storage costs.

I see that other people have brought up this issue before:

https://issues.apache.org/jira/browse/SOLR-1105
https://issues.apache.org/jira/browse/SOLR-5276

Is there anything that can be done? If it comes down to subclassing the unified highlighter, does anyone have any recommendations for doing this?