I have a good 'documents.index' now, and searching works for all
doc-base terms except the entries that caused the out-of-range errors.
Indeed, the doc-base entries that caused the indexer to choke were
those containing files that require compound filters: the file must
first be uncompressed to its original format before it can be passed
to pdftotext or pstotext, i.e., *.ps.gz and *.pdf.gz.

        It looks like what needs to be done _can_ be done, but I am
currently unsure about the exact way to add these to dhelp's
swish++.conf. Per 'man swish++.conf':

        "A file can be filtered more than once prior to indexing or
extraction, i.e., filters can be ``chained'' together.  For example,
if the uncompression and PDF examples shown above are used
together, compressed PDF files will also be indexed or extracted, i.e.,
filenames ending with one of .pdf.bz2, .pdf.gz, or .pdf.Z double
extensions."
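
        As a rough sketch of what such chained filters might look
like (the FilterFile lines and the %f, %B and @ substitutions are as I
recall them from the man page examples; the /tmp target directory and
the pstotext line are my own assumptions, so check all of this
against 'man swish++.conf' before relying on it):

```
# Sketch only; verify each line against 'man swish++.conf'.
# Stage 1: uncompress. '@' marks the file name the filter produces.
FilterFile  *.gz   gunzip -c %f > @/tmp/%B

# Stage 2: convert the uncompressed file to plain text.
FilterFile  *.pdf  pdftotext %f @/tmp/%B.txt
# Assumed by analogy with the PDF line; pstotext writes to stdout.
FilterFile  *.ps   pstotext %f > @/tmp/%B.txt
```

With both stages present, a file such as foo.pdf.gz would first match
*.gz and then, once uncompressed to foo.pdf, match *.pdf, which is the
chaining the man page describes.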

        Testing additional filter statement(s) in '/usr/share/dhelp/
 
