On Fri, Jul 10, 2009 at 5:56 PM, Michael _ <solrco...@gmail.com> wrote:

> Hello,
> I've got a stored, indexed field that contains some actual text, and some
> metainfo, like this:
>
>   one two three four [METAINFO] oneprime twoprime threeprime fourprime
>
> I have written a Tokenizer that skips past the [METAINFO] marker and uses
> the last four words as the tokens for the field, mapping to the first four
> words.  E.g. "twoprime" is the second token, with startposition=4 and
> endposition=8.
>
> When someone searches for "twoprime", therefore, they get back a highlighted
> result like
>
>   one <em>two</em> three ...
>
> This is great and serves my needs, but I hate that I'm storing all that
> METAINFO uselessly (there's actually a good deal more than in this
> simplified example).  After I've used it to make my tokens, I'd really like
> to convert the stored field to just
>
>   one two three four
>
> and store that.
>
> I thought about using an UpdateRequestProcessor to do this, but that happens
> *before* the Analyzers run, so if I strip the [METAINFO] there I can't use
> it to build my tokens.  I also thought about sending the data in as two
> fields, like
>
>   f1: one two three four
>   f1_meta: oneprime twoprime threeprime fourprime
>
> but I can't figure out a way for f1's analyzer to grab the stream from
> f1_meta.
>
> Is there some clever way that I'm missing to build my token stream outside
> of Solr, and store just the original text and index my token stream?
>
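
For reference, the offset mapping described in the quoted message can be sketched roughly as below. This is only an illustration in Python, not the actual Lucene Tokenizer: the function name meta_tokens is made up, and it assumes whitespace-separated words and end-exclusive character offsets, so the end values may differ by one from the numbers quoted above depending on the convention used.

```python
import re

MARKER = "[METAINFO]"  # marker string taken from the quoted example


def meta_tokens(value):
    """Return (term, start, end) triples for a field value.

    Terms are the words after the marker; offsets are those of the
    word at the same position before the marker (end is exclusive).
    Hypothetical helper, not part of Solr or Lucene.
    """
    text, _, meta = value.partition(MARKER)
    tokens = []
    # Pair the i-th visible word with the i-th meta word.
    for match, term in zip(re.finditer(r"\S+", text), meta.split()):
        tokens.append((term, match.start(), match.end()))
    return tokens
```

With the example value, "twoprime" is paired with the character span of "two" in the visible text, which is what lets the highlighter mark up the original word.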

Can't you have two fields like this?

f1 (indexed, not stored) -> one two three four [METAINFO] oneprime twoprime threeprime fourprime
f2 (not indexed, stored) -> one two three four
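
A minimal client-side sketch of that split, in Python, assuming the document is prepared before it is sent to Solr. The helper name split_for_indexing is hypothetical; f1 and f2 are the field names used above, and the schema is assumed to declare f1 with indexed="true" stored="false" and f2 with indexed="false" stored="true". How the resulting document is posted to Solr is left out.

```python
MARKER = "[METAINFO]"  # marker string taken from the quoted example


def split_for_indexing(raw):
    """Build the two field values from one raw input string.

    f1 keeps the full value so the custom Tokenizer can still see the
    metainfo; f2 keeps only the text before the marker for storage.
    Hypothetical helper, not part of any Solr client library.
    """
    text, _, _ = raw.partition(MARKER)
    return {"f1": raw, "f2": text.strip()}
```

The metainfo then lives only in the unstored field, so it is analyzed at index time but never written to the stored document.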

-- 
Regards,
Shalin Shekhar Mangar.
