Re: WikipediaTokenizer for Removing Unnecessary Parts

2013-07-23 Thread Furkan KAMACI
Here is my field type: My input for indexing in the Analysis section of the Solr admin page: {| style="text-align: left; width: 50%; table-layout: fixed;"

Re: WikipediaTokenizer for Removing Unnecessary Parts

2013-07-23 Thread Jack Krupansky
Are you actually seeing that output from the WikipediaTokenizerFactory? Really? Even if you use the Solr Admin UI analysis page? You should just see the text tokens plus the URLs for links. -- Jack Krupansky -Original Message- From: Furkan KAMACI Sent: Tuesday, July 23, 2013 10:53 A

Re: WikipediaTokenizer for Removing Unnecessary Parts

2013-07-23 Thread Robert Muir
If you use WikipediaTokenizer, it will tag different wiki elements with different token types (you can see them in the admin UI). So follow it up with TypeTokenFilter to keep only the types you care about, and I think it will do what you want. On Tue, Jul 23, 2013 at 7:53 AM, Furkan KAMACI wrote: > Hi
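
The suggestion above could be sketched as a schema.xml field type along the following lines. This is only an illustration, not the configuration from the original post: the field type name text_wiki and the file name wiki_types.txt are made up for the example, and the exact type strings to list in that file are best read off the Analysis page for your own input.

```xml
<!-- Hypothetical field type name; adjust to your schema.xml -->
<fieldType name="text_wiki" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Splits MediaWiki markup and assigns a type to each token -->
    <tokenizer class="solr.WikipediaTokenizerFactory"/>
    <!-- wiki_types.txt is a hypothetical file in the conf directory that lists
         one token type per line, for example <ALPHANUM>. With
         useWhitelist="true" only tokens of the listed types are kept;
         without it, the listed types are removed instead. -->
    <filter class="solr.TypeTokenFilterFactory" types="wiki_types.txt" useWhitelist="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

After reloading the core, the Analysis page should show which tokens survive the filter, which makes it straightforward to add or remove entries in the types file until only the wanted parts remain.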