Hi Erik (& Shawn),

On Mar 31, 2014, at 1:48pm, Shawn Heisey <s...@elyograg.org> wrote:

> On 3/31/2014 2:36 PM, Erik Hatcher wrote:
>> Not currently possible.  Solr’s SchemaCodecFactory only has a hook for 
>> postings format (and doc values format).

OK, thanks for confirming.

> Would it be a reasonable thing to develop a config structure (probably in 
> schema.xml) that starts with something like <codec name="foo"> and has ways 
> to specify the class and related configuration for each of the components in 
> the codec? Then you could specify codec="foo" on an individual field 
> definition.  The codec definition could allow one of them to have 
> default="true".
> 
> I will admit that my understanding of these Lucene-level details is low, so I 
> could be thinking about this wrong.

The absolute easiest approach would be to support a new init value for 
codecFactory, which SchemaCodecFactory would use to select a different base 
codec class to use (versus always using Lucene<version>Codec). That would 
switch everything to a different codec.
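
As a rough sketch of what that could look like in solrconfig.xml (purely 
illustrative: the "baseCodec" init value is a made-up name, and 
SchemaCodecFactory does not read any such setting today):

```xml
<!-- Hypothetical: "baseCodec" is an invented init value, not something
     SchemaCodecFactory currently understands. -->
<codecFactory class="solr.SchemaCodecFactory">
  <!-- Name of the base codec to use instead of the default Lucene<version>Codec -->
  <str name="baseCodec">SimpleText</str>
</codecFactory>
```

The factory would presumably still apply its per-field postings format and 
doc values format overrides on top of whichever base codec was named.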

Or you could extend SchemaCodecFactory to support additional per-field 
settings (stored fields format, etc.) beyond what's currently available.

For my quick & dirty hack I've specified a different codecFactory in 
solrconfig.xml, and have my own factory that hard-codes the SimpleTextCodec.
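
The solrconfig.xml side of that hack is a single element. The class name 
below is only a placeholder for the custom factory (it is not a class that 
ships with Solr), and this assumes the jar containing it is on Solr's 
classpath:

```xml
<!-- com.example.SimpleTextCodecFactory is a placeholder name for a custom
     CodecFactory subclass whose getCodec() always returns a SimpleTextCodec. -->
<codecFactory class="com.example.SimpleTextCodecFactory"/>
```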

This works - all files are in the SimpleTextXXX format, other than the 
segments.gen and segments_XX files; what, those aren't pluggable?!?! :)

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr




