With Solr you can index anything Lucene can index, since Solr uses
Lucene under the covers.  Solr takes its input as XML, so you
will need to process the data you want to index (i.e. exclude certain
files and remove HTML tags) and convert it into Solr's input format.
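
As a rough illustration of that preprocessing step, here is a minimal Python
sketch using only the standard library. It strips the tags from one HTML page,
skips the page if it carries an exclusion meta tag, and wraps the result in
Solr's <add><doc> XML. The meta tag name "search-exclude" and the field names
"id", "title" and "text" are assumptions made up for this example; the fields
have to match whatever your own Solr schema defines.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape


class TextExtractor(HTMLParser):
    """Collects visible text, the <title>, and <meta> name/content pairs."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self.title_parts = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "meta" and name:
            self.meta[name.lower()] = attrs.get("content") or ""
        elif tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return  # ignore script and style bodies
        if self._in_title:
            self.title_parts.append(data)
        else:
            self.text_parts.append(data)


def html_to_solr_xml(doc_id, html, exclude_meta="search-exclude"):
    """Return a Solr <add> message for one page, or None if it is excluded.

    exclude_meta is a hypothetical meta tag name; the field names below are
    placeholders that must exist in your Solr schema.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    if exclude_meta in parser.meta:
        return None
    # Collapse whitespace runs; text fragments are joined with single spaces.
    title = " ".join("".join(parser.title_parts).split())
    text = " ".join(" ".join(parser.text_parts).split())
    fields = [("id", doc_id), ("title", title), ("text", text)]
    body = "".join(
        '<field name="%s">%s</field>' % (name, escape(value))
        for name, value in fields
    )
    return "<add><doc>%s</doc></add>" % body
```

You would run something like this over the files after the rsync step and
send each resulting message to Solr's update handler, followed by a <commit/>.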

Bill


On 4/26/06, David Trattnig <[EMAIL PROTECTED]> wrote:
>
> Hello!
>
> I'd like to setup/develop a search-server. I thought I would use Lucene,
> then I read about Solr. So I have done the Solr-Tutorial. Firstly really
> happy about the additional features to the Lucene-Functionality I now
> noticed that Solr can index only XML files. Or am I completely wrong?
>
> What should I use for the following situation:
>
> 1. Copy HTML-files to the Live-Server (via RSync)
> 2. Index them by the search engine
> 3. Exclude some "tagged" files (these files for example would have a
> specific meta-data-tag)
> 4. Exclude HTML-tags and other unworthy stuff
>
> How much work of development would that be with Lucene or Solr (If
> possible)?
>
> Any help would be appreciated!
>
> Thx in advance,
> david
>
>
