Hi Vinci,

Maybe this answers most of your questions: Solr can't digest HTML. You have to do the HTML parsing outside of Solr and feed it a document with specific fields that match the schema.
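To illustrate the "parse outside Solr, then feed it fields" step, here is a minimal sketch in Python using only the standard library. It pulls the title and visible text out of an HTML page and wraps them in Solr's XML `<add><doc>` update message. The field names `id`, `title`, and `text` and the helper names `TextExtractor` and `html_to_solr_xml` are assumptions made for this example; the fields would have to match whatever your own schema.xml defines, and real-world HTML may need a more forgiving parser.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class TextExtractor(HTMLParser):
    """Collects the <title> text and the visible text of an HTML page."""

    SKIP = {"script", "style"}  # content of these tags is not visible text

    def __init__(self):
        super().__init__()
        self.title_parts = []
        self.text_parts = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._in_title:
            self.title_parts.append(data)
        elif self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def html_to_solr_xml(doc_id, html):
    """Build a Solr XML <add> message from one HTML page.

    The field names below are hypothetical; they must match schema.xml.
    """
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    fields = {
        "id": doc_id,
        "title": "".join(parser.title_parts).strip(),
        "text": " ".join(parser.text_parts),
    }
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    for name, value in fields.items():
        field = ET.SubElement(doc, "field", name=name)
        field.text = value  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


sample = "<html><head><title>Hello</title></head><body><p>Some <b>bold</b> text.</p></body></html>"
print(html_to_solr_xml("doc-1", sample))
```

The resulting XML string is what you would send to Solr's update handler, followed by a commit. The exact URL depends on your installation, so it is left out here.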
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
From: Vinci <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Tuesday, March 25, 2008 4:25:10 PM
Subject: Fields, Facets and Indexing html document

Hi all,

I want Solr to index my HTML document collection. After reading a number of tutorials and searching Google, I have some questions...

1. Can I index HTML documents directly?
2. What should I change in the default schema.xml to index HTML documents?
3. Can fields be defined by a combination of tag and attribute?
4. Is it possible to use the Highlighter to filter the search results? (e.g. if the highlighting occurs after some marker tag, then the search result gets a lower ranking)
5. Can facets compute statistics on the search results?
6. Do facets mean the same thing as fields? If not, how are they different?
7. Can facets/features be defined in another document?

Thank you,
Vinci

--
View this message in context: http://www.nabble.com/Fields%2C-Facets-and-Indexing-html-document-tp16287762p16287762.html