I suggest you look here:
http://www.javadocexamples.com/java_source/org/apache/lucene/wikipedia/analysis/WikipediaTokenizerTest.java.html


2013/10/4 Ken Krugler <kkrugler_li...@transpac.com>

> Hi all,
>
> Where's the documentation on the WikipediaTokenizer?
>
> Specifically I'm wondering how pieces from the source XML get mapped to
> field names in the Solr schema.
>
> For example, <revision><timestamp> seems to be going into the "date" field
> for an example schema I've got.
>
> And <revision><text> goes into "body".
>
> But is there any way to get <revision><contributor><username>, for example?
>
> Thanks,
>
> -- Ken
>
> --------------------------
> Ken Krugler
> +1 530-210-6378
> http://www.scaleunlimited.com
> custom big data solutions & training
> Hadoop, Cascading, Cassandra & Solr
