Hello!
I'd like to set up/develop a search server. I thought I would use Lucene,
then I read about Solr, so I have done the Solr tutorial. At first I was
really happy about the features it adds to the Lucene functionality, but I
have now noticed that Solr can index only XML files. Or am I completely wrong?
What
David,
Solr doesn't index XML files, but rather XML is used as the wrapper
of the text that does get indexed. The document structure is defined
in schema.xml, and the field text to be indexed is sent wrapped in an
XML request.
Regarding your scenario, you would need to write code that pa
With Solr you can index anything Lucene can index, since Solr uses
Lucene under the covers. The input to Solr is in XML format. You
will need to process the data you want to index (i.e. exclude certain
files and remove HTML tags) and put it into Solr's input format.
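
As a rough illustration of that preprocessing step, here is a minimal Python sketch that strips HTML markup and wraps the remaining text in an <add><doc><field> update message. The field names ("id", "text") are assumptions and must match whatever your schema.xml defines; the helper names strip_html and build_add_request are made up for this example and are not part of Solr.

```python
from html.parser import HTMLParser
from xml.etree import ElementTree as ET


class _TextExtractor(HTMLParser):
    """Collects text nodes and drops the markup around them."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_html(html):
    """Return the text content of an HTML fragment (hypothetical helper)."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse the runs of whitespace left behind by the removed tags.
    return " ".join("".join(parser.chunks).split())


def build_add_request(docs):
    """Build an <add> message from dicts of field name -> value.

    The field names are assumptions; they have to exist in schema.xml.
    """
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)  # ElementTree escapes &, < and > for us
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    body = strip_html("<p>Hello <b>world</b></p>")
    print(build_add_request([{"id": "1", "text": body}]))
```

The resulting string would then be POSTed to your Solr instance's update handler. Note that this simple extractor also keeps the contents of script and style elements, so real pages may need extra filtering.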
Bill
On 4/26/06, David Tratt
: will need to process that data you want to index (ie exclude certain
: files and remove HTML tags) and put them into Solr's input format.
minor clarification: Solr does ship with two Tokenizers that do a pretty
good job of throwing away HTML markup, so you don't have to parse it
yourself -- but
Hi,
Suppose I want the XML input submitted to Solr to be distributed among a
fixed set of partitions; basically, something like round-robin among
them, so that each directory has a relatively equal size in terms of # of
segments. Is there an easy way to do this? I took a quick look at th
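
For illustration only, the round-robin idea described above could be sketched on the client side, before anything is sent to Solr. This is a minimal Python sketch, not a Solr feature: the partition names and the assign_round_robin helper are hypothetical, and it balances the number of documents per partition, which only approximates balance in terms of segments.

```python
from itertools import cycle


def assign_round_robin(docs, partitions):
    """Assign each document to the next partition in turn.

    `partitions` is a fixed list of hypothetical partition names
    (e.g. one per index directory or Solr instance).
    """
    if not partitions:
        raise ValueError("at least one partition is required")
    assignment = {name: [] for name in partitions}
    # cycle() repeats the partition list endlessly; zip() stops
    # as soon as the documents run out.
    for doc, name in zip(docs, cycle(partitions)):
        assignment[name].append(doc)
    return assignment


if __name__ == "__main__":
    result = assign_round_robin(range(7), ["part0", "part1", "part2"])
    for name, assigned in result.items():
        print(name, assigned)
```

Each partition's documents would then be submitted to the index that backs that partition.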