Actually, you may be able to get by using PatternReplaceCharFilterFactory -
copy the source value to two fields, one that treats "<d2>.*?</d2>" as the
pattern to delete and the other that uses "<d1>.*?</d1>" as the pattern to
delete, so the first field has only the <d1> text and the second has only
the <d2> text. Note the reluctant ".*?": a greedy ".*" would match from the
first opening tag to the last closing tag and swallow everything in between.
You can use a second pattern char filter to remove the "</?d[12]>" markers
as well, probably changing them to a space in both cases.

See:
http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/pattern/PatternReplaceCharFilterFactory.html
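
As a rough illustration of the two-field idea, here is a minimal sketch in
plain Java. It only exercises the regular expressions with java.util.regex,
which is what PatternReplaceCharFilter is built on; it is not Solr
configuration. The class and method names are made up for the example, and
the reluctant ".*?" and the DOTALL flag are my assumptions about what the
data needs.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr or Lucene: it mimics what two
// chained pattern char filters would do to the text of each field.
public class TagFieldSketch {
    // Reluctant ".*?" stops at the first closing tag; DOTALL (an assumption)
    // lets a block span line breaks.
    private static final Pattern D1_BLOCK = Pattern.compile("<d1>.*?</d1>", Pattern.DOTALL);
    private static final Pattern D2_BLOCK = Pattern.compile("<d2>.*?</d2>", Pattern.DOTALL);
    // Matches any opening or closing d1/d2 marker.
    private static final Pattern MARKERS = Pattern.compile("</?d[12]>");

    // First field: delete every <d2> block, then replace the markers with a space.
    public static String keepD1(String source) {
        String withoutD2 = D2_BLOCK.matcher(source).replaceAll(" ");
        return MARKERS.matcher(withoutD2).replaceAll(" ");
    }

    // Second field: delete every <d1> block, then replace the markers with a space.
    public static String keepD2(String source) {
        String withoutD1 = D1_BLOCK.matcher(source).replaceAll(" ");
        return MARKERS.matcher(withoutD1).replaceAll(" ");
    }

    public static void main(String[] args) {
        String source = "<d1>alpha</d1> <d2>beta</d2> <d1>gamma</d1>";
        System.out.println("d1 field text: [" + keepD1(source) + "]");
        System.out.println("d2 field text: [" + keepD2(source) + "]");
    }
}
```

For that sample input, the first result contains only the words alpha and
gamma and the second only beta, each padded with spaces that a tokenizer
would discard. In a schema, the same patterns would go into the pattern and
replacement attributes of the char filter on each field type.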

-- Jack Krupansky

On Tue, Jan 13, 2015 at 11:40 AM, Jack Krupansky <jack.krupan...@gmail.com>
wrote:

> Would it be sufficient for your use case to simply extract all the <d1>
> into one field and all the <d2> into another field? If so, the update
> processor script would be very simple, just matching every "<d1>.*?</d1>"
> and copying the matches to a separate field value, and the same for <d2>.
>
> If you want examples of script update processors, see my Solr e-book:
>
> http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html
>
> -- Jack Krupansky
>
> On Tue, Jan 13, 2015 at 9:21 AM, tomas.kalas <kala...@email.cz> wrote:
>
>> Thanks, Jack, for your advice. Can you please explain a little more how
>> it works? From the Apache wiki it's not too clear to me. Can I write some
>> JavaScript code when I want to filter some data? In this case I have
>> <d1>bla bla bla</d1> <d2> bla bla bla </d2> <d1>bla bla bla </d1> and I
>> want to filter <d2> bla bla bla </d2>. But in another case I want to
>> filter all <d1> .... </d1>, so I suppose I would apply it to the indexed
>> data and filter from them? Thanks
>>
>>
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Tokenizer-or-Filter-tp4178346p4179173.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>
>
