NGram and word boundaries?

2010-09-20 Thread Harry Hochheiser
I've got a question regarding NGramFilterFactory. It seems to work very well, but I've had trouble getting it to work with other filters. Specifically, if I have an index analyzer that uses a StandardTokenizerFactory to tokenize and follows it up with an NGramFilterFactory, it does a fine job of

Re: Indexing and ExtractingRequestHandler

2010-08-11 Thread Harry Hochheiser
tect
> Cominvent AS - www.cominvent.com
> Training in Europe - www.solrtraining.com
>
> On 11. aug. 2010, at 23.33, Harry Hochheiser wrote:
>
>> I'm trying to use Solr to index the contents of an Excel file, using
>> the ExtractingRequestHandler (CSV handler won't work fo

Indexing and ExtractingRequestHandler

2010-08-11 Thread Harry Hochheiser
I'm trying to use Solr to index the contents of an Excel file, using the ExtractingRequestHandler (CSV handler won't work for me - I need to consider the whole spreadsheet as one document), and I'm running into some trouble. Is there any way to see what's going on during the indexing process? I'm

help with tokenizer/filter

2010-08-06 Thread Harry Hochheiser
Relatively new to Solr, and I'm having trouble indexing some fields coming out of the Solr Cell extraction handler. First question - what does the extraction handler do with text? For example, if I throw it an Excel file, what am I going to get back as input to Solr processing? Is anything do