Re: Storing, indexing and searching XML documents in Solr

2011-05-18 Thread Erick Erickson
You're right, you can't store an XML document directly in Solr. You have to pull it apart and index it such that you can get whatever information back you need. How you flatten data depends entirely upon your needs. The high-level idea is that you want to create fields such that text searches work
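As a rough illustration of the flattening idea, here is a minimal Python sketch that turns an XML document into a flat mapping of field names to values, which could then be sent to Solr as an ordinary document. The sample XML, the `flatten` helper, and the underscore-joined naming scheme are all assumptions made for this example, not something described in the thread; the right field layout depends on what you need to search and retrieve.

```python
import xml.etree.ElementTree as ET


def flatten(xml_text):
    """Flatten an XML document into {field_name: [values]}.

    Field names are built by joining element names with "_"
    (a hypothetical convention chosen for this sketch).
    Tail text after child elements is ignored for brevity.
    """
    root = ET.fromstring(xml_text)
    fields = {}

    def walk(elem, path):
        name = f"{path}_{elem.tag}" if path else elem.tag
        # Each attribute becomes its own field.
        for attr, value in elem.attrib.items():
            fields.setdefault(f"{name}_{attr}", []).append(value)
        # Leading text of the element becomes a value of the element's field.
        text = (elem.text or "").strip()
        if text:
            fields.setdefault(name, []).append(text)
        for child in elem:
            walk(child, name)

    walk(root, "")
    return fields


# Hypothetical sample document.
SAMPLE = (
    '<book id="42">'
    "<title>Solr Basics</title>"
    "<author>A. Writer</author>"
    "<author>B. Editor</author>"
    "</book>"
)

doc = flatten(SAMPLE)
for field, values in doc.items():
    print(field, values)
```

Repeated elements such as `author` end up as multiple values under one name, which would correspond to a multiValued field on the Solr side.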

Re: Storing, indexing and searching XML documents in Solr

2011-05-18 Thread Mike Sokolov
You might want to create a field that's analyzed using HtmlStripCharFilter - this will index all the non-tag/non-attribute text in the document, and if you store the value, will store the entire XML document as well. I've done some work on an XmlStripCharFilter, which does the same thing (onl
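To make the indexed-versus-stored distinction concrete, here is a small Python sketch of the effect described: the searchable text is the document with tags and attributes removed, while the stored value is the untouched XML. This only mimics the outcome and is not how Solr's char filter is implemented; the sample document and the `indexed_text` helper are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def indexed_text(xml_text):
    """Return only the text content of the XML, with tags and attributes dropped."""
    root = ET.fromstring(xml_text)
    # itertext() yields every text node in document order; collapse whitespace.
    return " ".join(" ".join(root.itertext()).split())


# Hypothetical sample document.
SAMPLE = '<doc lang="en"><title>Hello</title><body>World of <b>Solr</b></body></doc>'

stored = SAMPLE                 # what a stored field would hand back verbatim
indexed = indexed_text(SAMPLE)  # what text searches would effectively match against

print(indexed)
print(stored)
```

Note that attribute values such as `lang="en"` do not appear in the indexed text, so they would not be searchable through such a field.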

Re: Storing, indexing and searching XML documents in Solr

2011-05-18 Thread Judioo
Great document. I can see how to import the data directly from the database. However, it seems that I need to write XPath expressions in the config to extract the fields that I wish to transform into a Solr document. So it seems that there is no way of storing the document structure in Solr as is? 2011

Re: Storing, indexing and searching XML documents in Solr

2011-05-18 Thread Judioo
The data is being imported directly from MySQL. The document is, however, indeed a good starting place. Thanks

2011/5/18 Yury Kats
> On 5/18/2011 4:19 PM, Judioo wrote:
> > Any help is greatly appreciated. Pointers to documentation that address my
> > issues is even more helpful.
>
> I think t

Re: Storing, indexing and searching XML documents in Solr

2011-05-18 Thread Yury Kats
On 5/18/2011 4:19 PM, Judioo wrote:
> Any help is greatly appreciated. Pointers to documentation that address my
> issues is even more helpful.

I think this would be a good start: http://wiki.apache.org/solr/DataImportHandler#Usage_with_XML.2BAC8-HTTP_Datasource