Sorry, I figured out my problem. It was a stupid mistake on my part. Once again, sorry for that.
Thanks,
Farhan

On Wed, Mar 5, 2014 at 7:14 PM, Farhan Ali <farhan....@gmail.com> wrote:

> Hi,
> I am a newbie to Solr and I am trying to index some XML documents using
> DIH and XPath, but I am unable to do it. I get a response message of
> successful indexing, but no document is added to the index. I do not know
> what I am doing wrong.
>
> This is my data config XML file:
>
> <dataConfig>
>   <dataSource type="FileDataSource"/>
>   <document>
>     <entity name="nytxmldir" rootEntity="false"
>             datasource="null"
>             processor="FileListEntityProcessor"
>             fileName=".*\.xml"
>             recursive="true"
>             baseDir="/home/farhan/Downloads/nytxml">
>
>       <entity name="nytxml"
>               pk="id"
>               datasource="nytxmldir"
>               url="${nytxmldir.fileAbsolutePath}"
>               processor="XPathEntityProcessor"
>               forEach="/ntif"
>               transformer="RegexTransformer">
>
>         <field column="id"
>                xpath="/ntif/head/docdata/doc-id/@id-string"/>
>         <field column="title"
>                xpath="/ntif/head/title"/>
>         <field column="paragraph"
>                xpath="/ntif/body/body.content/block[@class='full_text']/p"/>
>
>       </entity>
>     </entity>
>   </document>
> </dataConfig>
>
> This is my XML document:
>
> <?xml version="1.0" encoding="UTF-8"?>
> <!DOCTYPE nitf SYSTEM "http://www.nitf.org/IPTC/NITF/3.3/specification/dtd/nitf-3-3.dtd">
> <nitf change.date="June 10, 2005" change.time="19:30"
>       version="-//IPTC//DTD NITF 3.3//EN">
>   <head>
>     <title>Paid Notice: Deaths BRADLEY, CAROL L.</title>
>     <meta content="dn010107" name="slug"/>
>     <meta content="1" name="publication_day_of_month"/>
>     <meta content="1" name="publication_month"/>
>     <meta content="2007" name="publication_year"/>
>     <meta content="Monday" name="publication_day_of_week"/>
>     <meta content="Classified" name="dsk"/>
>     <meta content="7" name="print_page_number"/>
>     <meta content="B" name="print_section"/>
>     <meta content="3" name="print_column"/>
>     <meta content="Paid Death Notices" name="online_sections"/>
>     <docdata>
>       <doc-id id-string="1815719"/>
>       <doc.copyright holder="The New York Times" year="2007"/>
>       <identified-content>
>         <person class="indexing_service">BRADLEY, CAROL L.</person>
>         <classifier class="online_producer" type="types_of_material">Paid Death Notice</classifier>
>         <classifier class="online_producer" type="taxonomic_classifier">Top/Classifieds/Paid Death Notices</classifier>
>       </identified-content>
>     </docdata>
>     <pubdata date.publication="20070101T000000"
>              ex-ref="http://query.nytimes.com/gst/fullpage.html?res=9B06E1DE1E3AF932A35752C0A9619C8B63"
>              item-length="49" name="The New York Times" unit-of-measure="word"/>
>   </head>
>   <body>
>     <body.head>
>       <hedline>
>         <hl1>Paid Notice: Deaths BRADLEY, CAROL L.</hl1>
>       </hedline>
>     </body.head>
>     <body.content>
>       <block class="lead_paragraph">
>         <p>BRADLEY--Carol L., 84, of Tinton Falls, NJ died peacefully at
> Seabrook Village on December 27. Beloved wife of Floyd (Pete) Bradley, Jr.;
> loving mother of Steven, Floyd and Lynette Bradley; adored grandmother of
> Victoria Kent and Camilla, William and Melissa Bradley; caring
> stepgrandmother of Matthew and Charlton Field.</p>
>       </block>
>       <block class="full_text">
>         <p>BRADLEY--Carol L., 84, of Tinton Falls, NJ died peacefully at
> Seabrook Village on December 27. Beloved wife of Floyd (Pete) Bradley, Jr.;
> loving mother of Steven, Floyd and Lynette Bradley; adored grandmother of
> Victoria Kent and Camilla, William and Melissa Bradley; caring
> stepgrandmother of Matthew and Charlton Field.</p>
>       </block>
>     </body.content>
>   </body>
> </nitf>
>
> I am really stumped as to why it is not working. I know DIH does not
> support full XPath syntax, but according to the wiki it supports the limited
> XPath syntax that I am using. Also, I have read various internet forums, and
> people have suggested using Groovy and XSLT, which I am unfamiliar with.
> I hope someone can help me.
>
> Thanks
> Farhan
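
A note for anyone who finds this thread later: the reply does not say what the mistake was. One mismatch that is visible in the quoted config is that forEach and every field xpath start at /ntif, while the root element of the sample document is nitf, so those paths would not match anything in that file. Whether that is the mistake Farhan found is an assumption. Below is a minimal sketch of the same config with the paths aligned to the document's root element. It also spells the outer entity's attribute as dataSource (camel case, as in the Solr wiki's FileListEntityProcessor example) and drops datasource="nytxmldir" from the inner entity so that it would use the default FileDataSource; both of those changes are assumptions about what the config needs, not something confirmed in the thread.

```xml
<dataConfig>
  <dataSource type="FileDataSource"/>
  <document>
    <!-- Assumption: attribute spelled dataSource, following the wiki example. -->
    <entity name="nytxmldir" rootEntity="false"
            dataSource="null"
            processor="FileListEntityProcessor"
            fileName=".*\.xml"
            recursive="true"
            baseDir="/home/farhan/Downloads/nytxml">

      <!-- Assumption: paths changed from /ntif to /nitf to match the
           root element of the sample document. -->
      <entity name="nytxml"
              pk="id"
              url="${nytxmldir.fileAbsolutePath}"
              processor="XPathEntityProcessor"
              forEach="/nitf"
              transformer="RegexTransformer">

        <field column="id"
               xpath="/nitf/head/docdata/doc-id/@id-string"/>
        <field column="title"
               xpath="/nitf/head/title"/>
        <field column="paragraph"
               xpath="/nitf/body/body.content/block[@class='full_text']/p"/>

      </entity>
    </entity>
  </document>
</dataConfig>
```

The field names id, title, and paragraph would still need matching definitions in the schema for documents to show up in the index.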