Re: Indexing content, storing html

2008-02-22 Thread Paul deGrandis
Thanks, this is perfect for what I'm trying to do.

Paul

On 2/22/08, Reece <[EMAIL PROTECTED]> wrote:
> Well I don't remember the specific name of it, I just wrote that
> because it sounded close :)
>
> There is a list of them here though:
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenF

Re: Indexing content, storing html

2008-02-22 Thread Reece
Well I don't remember the specific name of it, I just wrote that because it sounded close :)

There is a list of them here though:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters

-Reece

On Fri, Feb 22, 2008 at 2:10 PM, Paul deGrandis <[EMAIL PROTECTED]> wrote:
> Thanks!
>
> Does So

Re: Indexing content, storing html

2008-02-22 Thread Paul deGrandis
Thanks!

Does Solr include an HTMLTokenFilterFactory?

Paul

On 2/22/08, Reece <[EMAIL PROTECTED]> wrote:
> I did this as well, but found problems when searching (tags in between
> words caused searching nightmares). I recommend stripping out all the
> tags using the HTMLTokenFilterFactory or yo

Re: Indexing content, storing html

2008-02-22 Thread Reece
I did this as well, but found problems when searching (tags in between words caused searching nightmares). I recommend stripping out all the tags using the HTMLTokenFilterFactory or your own regex when indexing, and storing the actual HTML in an actual database. If you really want to store the HT
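The "your own regex" route mentioned above could look roughly like the following. This is only a minimal sketch in Python, not Solr's own filter: the function name `strip_tags` is hypothetical, and it assumes the markup is stripped on the client side before the text is sent for indexing, while the original HTML is kept elsewhere. Tags are replaced with a space rather than removed outright, so that words separated only by a tag do not run together.

```python
import html
import re

# Matches anything that looks like a tag: "<", then non-">" characters, then ">".
TAG_RE = re.compile(r"<[^>]+>")
WHITESPACE_RE = re.compile(r"\s+")


def strip_tags(markup: str) -> str:
    """Return plain text suitable for indexing (hypothetical helper)."""
    # Replace each tag with a space so "one<br/>two" becomes "one two".
    text = TAG_RE.sub(" ", markup)
    # Turn entities such as &amp; back into the characters they stand for.
    text = html.unescape(text)
    # Collapse the extra whitespace left behind by the removed tags.
    return WHITESPACE_RE.sub(" ", text).strip()


if __name__ == "__main__":
    print(strip_tags("<p>Hello <b>world</b></p>"))
```

A regex like this is deliberately simple: it will split a word that has inline markup in the middle of it, and it does not handle comments or script blocks, so a real HTML parser may be a better fit for messy editor output.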

Indexing content, storing html

2008-02-22 Thread Paul deGrandis
Hi all,

I'm working on a Solr app that pulls HTML from an embedded JavaScript WYSIWYG editor, and I need to index on the content, but store and reproduce the HTML. The problem I have is that when I try to add and commit, the HTML gets interpreted as XML. Is the way to do this properly to create an HT
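One common cause of HTML being "interpreted as XML" is that the markup is pasted unescaped into the XML update message. The sketch below, in Python, assumes the document is sent as a Solr `<add><doc>` XML message and lets the XML library escape the HTML. The field names `id` and `content_html` and the function name `build_add_xml` are hypothetical and would need to match the actual schema.

```python
import xml.etree.ElementTree as ET


def build_add_xml(doc_id: str, html_body: str) -> str:
    """Build an <add> message with the HTML stored as escaped text (hypothetical helper)."""
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")

    id_field = ET.SubElement(doc, "field", name="id")
    id_field.text = doc_id

    # Assigning to .text makes ElementTree escape <, > and & on output,
    # so the HTML travels as character data instead of nested XML elements.
    html_field = ET.SubElement(doc, "field", name="content_html")
    html_field.text = html_body

    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(build_add_xml("1", "<p>Hi &amp; bye</p>"))
```

Parsing the resulting message gives back the original HTML string unchanged, which is what a stored field would then hold.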