Hi,
I wish to index well-formed XML documents as they are. I have a database filled with MARCXML records. An example looks like this:

  <record ns0:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd"
          xmlns="http://www.loc.gov/MARC21/slim"
          xmlns:ns0="http://www.w3.org/2001/XMLSchema-instance">
    <leader>00000nam 22 a 4500</leader>
    <controlfield tag="001">000500000</controlfield>
    <controlfield tag="005">20050826220257.0</controlfield>
    <controlfield tag="008">000710s1998 xx r 000 0 dut d</controlfield>
    <datafield ind1=" " ind2=" " tag="040">
      <subfield code="a">Univ</subfield>
    </datafield>
    <datafield ind1="1" ind2=" " tag="100">
      <subfield code="a">van Wetten, J. W.</subfield>
    </datafield>
    <datafield ind1="1" ind2="3" tag="245">
      <subfield code="a">De positie van vrouwen in de asielprocedure /</subfield>
      <subfield code="c">J.W. van Wetten, N. Dijkhof, F. Heide.</subfield>
    </datafield>
  </record>

The idea is to create Lucene indexes on specific MARC fields and to store the complete MARC record in Lucene 'as is'. In the presentation layer of my application I would then have the complete MARC record at hand, and so have full flexibility over which MARC fields to display.

So I want to create the following record through XSLT and feed it to Solr:

  <doc>
    <field name="title">De positie van vrouwen in de asielprocedure</field>
    <field name="author">van Wetten, J. W.</field>
    ...
    <field name="originalRecord">
      <record ns0:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd"
              xmlns="http://www.loc.gov/MARC21/slim"
              xmlns:ns0="http://www.w3.org/2001/XMLSchema-instance">
        <leader>00000nam 22 a 4500</leader>
        <controlfield tag="001">000500000</controlfield>
        <controlfield tag="005">20050826220257.0</controlfield>
        <controlfield tag="008">000710s1998 xx r 000 0 dut d</controlfield>
        <datafield ind1=" " ind2=" " tag="040">
          <subfield code="a">UGent</subfield>
        </datafield>
        <datafield ind1="1" ind2=" " tag="100">
          <subfield code="a">van Wetten, J. W.</subfield>
        </datafield>
        <datafield ind1="1" ind2="3" tag="245">
          <subfield code="a">De positie van vrouwen in de asielprocedure /</subfield>
          <subfield code="c">J.W. van Wetten, N. Dijkhof, F. Heide.</subfield>
        </datafield>
      </record>
    </field>
  </doc>

I have the following in my schema.xml:

  <field name="author" type="text" indexed="true" stored="true" termVectors="true"/>
  <field name="title" type="text" indexed="true" stored="true" termVectors="true"/>
  <field name="originalRecord" type="text" indexed="false" stored="true"/>

Solr of course has a problem with the XML in the 'originalRecord' field, since the nested <record> element is markup inside a field that is expected to hold plain text.

Is there a solution to this? Has anyone done this before?

Thanks a lot.
Benoit.

=============================
PAUWELS Benoit
Université Libre de Bruxelles - Libraries
Head of Automation
Av. F.D. Roosevelt 50, CP 180
1050 BRUSSELS
Belgium
Tel: + 32 2 650 23 91
Fax: + 32 2 650 23 91
=============================
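To make the question concrete, here is a minimal sketch of one possible workaround: escaping the record's markup so that the 'originalRecord' field carries it as plain text rather than as nested elements. It is written in Python rather than XSLT purely for illustration. The function name build_solr_doc and the shortened record are hypothetical, and whether this suits a given Solr setup is an assumption, not something confirmed here.

```python
import xml.etree.ElementTree as ET

# Shortened, illustrative stand-in for a full MARCXML record.
MARC_RECORD = (
    '<record xmlns="http://www.loc.gov/MARC21/slim">'
    '<datafield ind1="1" ind2=" " tag="100">'
    '<subfield code="a">van Wetten, J. W.</subfield>'
    '</datafield>'
    '</record>'
)


def build_solr_doc(title, author, record_xml):
    """Return a <doc> element (as a string) whose originalRecord field
    holds the record as escaped text instead of nested XML.

    Hypothetical helper: the field names follow the schema.xml in the post.
    """
    doc = ET.Element("doc")
    fields = (
        ("title", title),
        ("author", author),
        ("originalRecord", record_xml),
    )
    for name, value in fields:
        field = ET.SubElement(doc, "field", name=name)
        # Assigning to .text makes ElementTree escape <, > and &
        # when the tree is serialized.
        field.text = value
    return ET.tostring(doc, encoding="unicode")


message = build_solr_doc(
    "De positie van vrouwen in de asielprocedure",
    "van Wetten, J. W.",
    MARC_RECORD,
)
print(message)

# Parsing the message back yields the original record text unchanged.
parsed = ET.fromstring(message)
stored = parsed.find("field[@name='originalRecord']").text
print(stored == MARC_RECORD)
```

With this approach the presentation layer would have to parse the stored string back into XML before picking out MARC fields. In XSLT the equivalent would be to serialize the record to escaped text rather than copying the nodes into the field.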