Re: Improving Readability of Hit Highlighting

2009-01-12 Thread Otis Gospodnetic
> To answer your questions specifically, here is an example of the raw OCR output:
>
> "CONTRACTORINMPRIMENTAYIVE : mom Ale ACCEPT INFORMATIONON TOUR SHEET TO ea"
>
> to which I would like

Re: Improving Readability of Hit Highlighting

2009-01-12 Thread Terence Gannon
To answer your questions specifically, here is an example of the raw OCR output: "CONTRACTORINMPRIMENTAYIVE : mom Ale ACCEPT INFORMATIONON TOUR SHEET TO ea", for which I would like to see "mom ale access tour sheet to" in the hit highlight. My schema for this field is pretty much standard, as f

Re: Improving Readability of Hit Highlighting

2009-01-12 Thread Otis Gospodnetic
I'm not sure if I have a good suggestion, but I have a question. :) What is considered "junk"? Would it be possible to eliminate the junk before it even goes into the index in order to avoid GIGO (Garbage In Garbage Out)?

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
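
For illustration only, here is a minimal Python sketch of what "eliminating the junk before it goes into the index" could look like, assuming "junk" means any token that is not in a list of known words. The vocabulary `KNOWN_WORDS` and the function `strip_ocr_junk` are hypothetical names made up for this example; nothing in the thread says this is how the field is actually processed, and a real setup would more likely do this inside the analysis chain at index time.

```python
import re

# Hypothetical vocabulary, for illustration only. A real one would come
# from a dictionary file or a domain-specific term list.
KNOWN_WORDS = {"mom", "ale", "accept", "tour", "sheet", "to"}


def strip_ocr_junk(text, vocabulary=KNOWN_WORDS):
    """Lowercase the text and keep only alphabetic tokens found in the vocabulary."""
    tokens = re.findall(r"[A-Za-z]+", text.lower())
    return " ".join(token for token in tokens if token in vocabulary)


raw = "CONTRACTORINMPRIMENTAYIVE : mom Ale ACCEPT INFORMATIONON TOUR SHEET TO ea"
cleaned = strip_ocr_junk(raw)
print(cleaned)
```

Whether a plain word list is good enough depends on the answer to the question above: a strict vocabulary also throws away legitimate words it does not know, such as names or abbreviations.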