Hi Vinci,

This may answer most of your questions: Solr can't index raw HTML. You have to
parse the HTML outside of Solr and then feed Solr a document whose fields
match the ones defined in your schema.
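
To make that concrete, here is a rough sketch of the "parse outside Solr" step
in Python, using only the standard library. The field names "id", "title" and
"text" are assumptions for illustration; they would have to match whatever
your own schema.xml declares. The function name html_to_solr_add_xml is
likewise made up for this example.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr


class TitleAndTextExtractor(HTMLParser):
    """Collects the <title> text and the remaining visible text of a page."""

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return  # ignore script and style content
        if self._in_title:
            self.title_parts.append(data)
        else:
            self.text_parts.append(data)


def html_to_solr_add_xml(doc_id, html):
    """Build a Solr <add> message from one HTML page.

    The field names below are hypothetical; use the ones from your schema.
    """
    parser = TitleAndTextExtractor()
    parser.feed(html)
    parser.close()
    fields = {
        "id": doc_id,
        # Collapse runs of whitespace so the indexed text is tidy.
        "title": " ".join("".join(parser.title_parts).split()),
        "text": " ".join(" ".join(parser.text_parts).split()),
    }
    body = "".join(
        "<field name=%s>%s</field>" % (quoteattr(name), escape(value))
        for name, value in fields.items()
    )
    return "<add><doc>%s</doc></add>" % body


if __name__ == "__main__":
    page = "<html><head><title>Hello</title></head><body><p>Some text</p></body></html>"
    print(html_to_solr_add_xml("doc1", page))
```

The resulting XML is what you would then POST to Solr's update handler,
followed by a commit. Pulling extra fields out of specific tag/attribute
combinations would be done in the same parser, before the document ever
reaches Solr.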

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Vinci <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Tuesday, March 25, 2008 4:25:10 PM
Subject: Fields, Facets and Indexing html document


Hi all,

I want Solr to index my HTML document collection. After reading a number of
tutorials and searching Google, I have some questions:
1. Can I index HTML documents directly?
2. What should I change in the default schema.xml to index HTML documents?
3. Can fields be defined by a combination of tag and attribute?
4. Is it possible to use the Highlighter to filter the search results? (e.g. if
the highlighting occurs after some marker tag, the search result gets a
lower ranking)
5. Can facets compute statistics on the search results?
6. Do facets mean the same thing as fields? If not, how do they
differ?
7. Can facets/features be defined in another document?

Thank you,
Vinci
-- 
View this message in context: 
http://www.nabble.com/Fields%2C-Facets-and-Indexing-html-document-tp16287762p16287762.html
Sent from the Solr - User mailing list archive at Nabble.com.



