Actually, you may be able to get by with PatternReplaceCharFilterFactory. Copy the source value to two fields: one that uses "<d2>.*?</d2>" as the pattern to delete, and another that uses "<d1>.*?</d1>" as the pattern to delete, so that the first field keeps only the <d1> text and the second keeps only the <d2> text. (The reluctant ".*?" matters when a value contains several blocks of the same tag: a greedy ".*" would match from the first opening tag to the last closing tag and swallow everything in between.) You can use a second pattern char filter to remove the "</?d[12]>" markers as well, probably replacing them with a space in both cases.
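To check the patterns before putting them in the schema, here is a rough sketch in plain Java. It does not use Solr or Lucene at all; it only applies the same two regex replacements with java.util.regex, which is the regex engine the pattern char filter relies on. The class and method names (TagFieldSketch, onlyD1, onlyD2, normalize) are made up for illustration, and the whitespace cleanup in normalize just stands in for what a tokenizer would do anyway.

```java
import java.util.regex.Pattern;

public class TagFieldSketch {
    // Reluctant .*? so each tag pair is matched on its own;
    // DOTALL (an assumption) lets a block span several lines.
    private static final Pattern D1_BLOCK = Pattern.compile("<d1>.*?</d1>", Pattern.DOTALL);
    private static final Pattern D2_BLOCK = Pattern.compile("<d2>.*?</d2>", Pattern.DOTALL);
    // Matches the opening and closing markers of either tag.
    private static final Pattern MARKERS = Pattern.compile("</?d[12]>");

    // First field: delete whole <d2> blocks, then blank out the leftover markers.
    public static String onlyD1(String source) {
        String withoutD2 = D2_BLOCK.matcher(source).replaceAll(" ");
        return MARKERS.matcher(withoutD2).replaceAll(" ");
    }

    // Second field: delete whole <d1> blocks, then blank out the leftover markers.
    public static String onlyD2(String source) {
        String withoutD1 = D1_BLOCK.matcher(source).replaceAll(" ");
        return MARKERS.matcher(withoutD1).replaceAll(" ");
    }

    // Collapse runs of whitespace so the result is easy to read and compare.
    public static String normalize(String text) {
        return text.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String source = "<d1>alpha</d1> <d2>beta</d2> <d1>gamma</d1>";
        System.out.println(normalize(onlyD1(source))); // alpha gamma
        System.out.println(normalize(onlyD2(source))); // beta
    }
}
```

In the schema, each of the two replacements would become its own charFilter on the corresponding field type, with the block-deleting pattern listed first.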
See:
http://lucene.apache.org/core/4_10_3/analyzers-common/org/apache/lucene/analysis/pattern/PatternReplaceCharFilterFactory.html

-- Jack Krupansky

On Tue, Jan 13, 2015 at 11:40 AM, Jack Krupansky <jack.krupan...@gmail.com> wrote:

> Would it be sufficient for your user case to simply extract all the <d1>
> into one field and all the <d2> in another field? If so, the update
> processor script would be very simple, simply matching all "<d1>.*</d1>"
> and copying them to a separate field value and same for <d2>.
>
> If you want examples of script update processors, see my Solr e-book:
>
> http://www.lulu.com/us/en/shop/jack-krupansky/solr-4x-deep-dive-early-access-release-7/ebook/product-21203548.html
>
> -- Jack Krupansky
>
> On Tue, Jan 13, 2015 at 9:21 AM, tomas.kalas <kala...@email.cz> wrote:
>
>> Thanks Jack for your advice. Can you please explain me little more, how it
>> works? From Apache Wiki it's not to clear for me. I can write some
>> javaScript code when i want filtering some data ? In this case i have
>> <d1>bla bla bla</d1> <d2> bla bla bla </d2> <d1>bla bla bla </d1> and i
>> want filtering <d2> bla bla bla </d2>, But in other case i want filtering
>> all <d1> .... </d1> then i suppose i used it at indexed data and filtering
>> from them? Thanks
>>
>> --
>> View this message in context:
>> http://lucene.472066.n3.nabble.com/Tokenizer-or-Filter-tp4178346p4179173.html
>> Sent from the Solr - User mailing list archive at Nabble.com.