Sorry, typo in my previous mail. The attribute is spelled flatten:

<field column="textContent" xpath="/document/category/BODY" flatten="true"/>

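For reference, here is a sketch of how the entity from the original mail might look with the corrected attribute. Everything except flatten="true" is copied unchanged from the posted dataConfig. It assumes a DataImportHandler build that understands flatten; as far as I know the attribute arrived with Solr 1.4, so a stock 1.3 install may simply ignore it.

```xml
<dataConfig>
  <dataSource type="HttpDataSource" />
  <document>
    <entity name="archive" pk="id"
            url="http://localhost:9080/data/20090817070752.xml"
            processor="XPathEntityProcessor" forEach="/document/category"
            transformer="DateFormatTransformer" stream="true" dataSource="dataSource">
      <field column="id" xpath="/document/category/reference" />
      <!-- flatten="true" collects the text of every tag nested under BODY
           (the <P> elements here) into this one field -->
      <field column="textContent" xpath="/document/category/BODY" flatten="true" />
      <field column="author" xpath="/document/category/author" />
    </entity>
  </document>
</dataConfig>
```

Without flatten, the xpath only picks up text sitting directly inside BODY, which in the sample document is just whitespace, and that would explain why the field comes back empty.
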
2009/8/19 Noble Paul നോബിള് नोब्ळ् <noble.p...@corp.aol.com>:
> try this
> <field column="textContent" xpath="/document/category/BODY" faltten="true"/>
>
> this should slurp all the tags under body
>
> On Wed, Aug 19, 2009 at 1:44 PM, venn hardy<venn.ha...@hotmail.com> wrote:
>>
>> Hello,
>>
>> I have just started trying out SOLR to index some XML documents that I
>> receive. I am using the SOLR 1.3 and its HttpDataSource in conjunction
>> with the XPathEntityProcessor.
>>
>> I am finding the data import really useful so far, but I am having a few
>> problems when I try and import HTML contained within one of the XML tags
>> <BODY>. The data import just seems to ignore the textContent silently but
>> it imports everything else.
>>
>> When I do a query through the SOLR admin interface, only the id and author
>> fields are displayed.
>>
>> Any ideas what I am doing wrong?
>>
>> Thanks
>>
>> This is what my dataConfig looks like:
>> <dataConfig>
>>   <dataSource type="HttpDataSource" />
>>   <document>
>>     <entity name="archive" pk="id"
>>             url="http://localhost:9080/data/20090817070752.xml"
>>             processor="XPathEntityProcessor" forEach="/document/category"
>>             transformer="DateFormatTransformer" stream="true" dataSource="dataSource">
>>       <field column="id" xpath="/document/category/reference" />
>>       <field column="textContent" xpath="/document/category/BODY" />
>>       <field column="author" xpath="/document/category/author" />
>>     </entity>
>>   </document>
>> </dataConfig>
>>
>> This is how I have specified my schema
>> <fields>
>>   <field name="id" type="string" indexed="true" stored="true" required="true" />
>>   <field name="author" type="string" indexed="true" stored="true"/>
>>   <field name="textContent" type="text" indexed="true" stored="true" />
>> </fields>
>>
>> <uniqueKey>id</uniqueKey>
>> <defaultSearchField>id</defaultSearchField>
>>
>> And this is what my XML document looks like:
>>
>> <document>
>>   <category>
>>     <reference>123456</reference>
>>     <author>Authori name</author>
>>     <BODY>
>>       <P>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi lorem
>>       elit, lacinia ac blandit ac, tristique et ante. Phasellus varius varius
>>       felis ut vestibulum</P>
>>       <P>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi lorem
>>       elit, lacinia ac blandit ac, tristique et ante. Phasellus varius varius
>>       felis ut vestibulum</P>
>>       <P>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi lorem
>>       elit, lacinia ac blandit ac, tristique et ante. Phasellus varius varius
>>       felis ut vestibulum</P>
>>     </BODY>
>>   </category>
>> </document>
>
> --
> -----------------------------------------------------
> Noble Paul | Principal Engineer| AOL | http://aol.com