Re: WikipediaTokenizer documentation

2013-10-04 Thread Jack Krupansky
to keep only some token types. Besides my book, the best reference is going to be... the source code.

-- Jack Krupansky

-Original Message-
From: Ken Krugler
Sent: Thursday, October 03, 2013 9:03 PM
To: solr-user@lucene.apache.org
Subject: WikipediaTokenizer documentation

Hi all

Re: WikipediaTokenizer documentation

2013-10-04 Thread Furkan KAMACI
I suggest you look here: http://www.javadocexamples.com/java_source/org/apache/lucene/wikipedia/analysis/WikipediaTokenizerTest.java.html

2013/10/4 Ken Krugler

> Hi all,
>
> Where's the documentation on the WikipediaTokenizer?
>
> Specifically I'm wondering how pieces from the source XML

WikipediaTokenizer documentation - never mind

2013-10-03 Thread Ken Krugler
Hi all,

Sorry for the noise - I finally realized that the script I was running was using some Java code (EnwikiContentSource, from Lucene benchmark) to explicitly set up fields and then push the results to Solr.

-- Ken

== Where's

WikipediaTokenizer documentation

2013-10-03 Thread Ken Krugler
Hi all, Where's the documentation on the WikipediaTokenizer? Specifically I'm wondering how pieces from the source XML get mapped to field names in the Solr schema. For example, seems to be going into the "date" field for an example schema I've got. And goes into "body". But is there any w
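
For anyone landing on this thread with the same question: per the "never mind" follow-up, the field mapping was done by client code before the documents reached Solr, not by the tokenizer. Below is a minimal sketch of that general pattern in Python. It is not the EnwikiContentSource code. The element names (title, revision/timestamp, revision/text) follow the standard Wikipedia dump layout, but which elements fed the "date" and "body" fields is an assumption here, since the thread does not preserve the tag names. SAMPLE_PAGE, FIELD_MAP and page_to_fields are hypothetical names made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample page; element names follow the Wikipedia dump layout.
SAMPLE_PAGE = """
<page>
  <title>Example article</title>
  <revision>
    <timestamp>2013-10-03T21:03:00Z</timestamp>
    <text>Some wiki markup.</text>
  </revision>
</page>
"""

# Assumed mapping from XML element path to schema field name.
# "date" and "body" are the field names mentioned in the thread;
# the paths on the left are an assumption.
FIELD_MAP = {
    "title": "title",
    "revision/timestamp": "date",
    "revision/text": "body",
}


def page_to_fields(page_xml, field_map=FIELD_MAP):
    """Return a dict of field name -> text for one <page> element."""
    page = ET.fromstring(page_xml.strip())
    fields = {}
    for path, field_name in field_map.items():
        value = page.findtext(path)
        # Skip elements that are missing from this page.
        if value is not None:
            fields[field_name] = value.strip()
    return fields


doc = page_to_fields(SAMPLE_PAGE)
print(doc)
```

The resulting dict is what a client would then send to Solr as one document, so changing where a piece of the XML ends up means editing the mapping in the client code rather than the analyzer configuration.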