Just extend XPathEntityProcessor and override nextRow() so that it returns null after 100 rows. Then use that class as your processor.
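A rough sketch of the idea, in case it helps. To keep it runnable without the Solr jars, it uses stand-in types: RowProcessor, ListRowProcessor, LimitedRowProcessor and LimitedRowDemo are hypothetical names, not Solr classes. In a real setup the subclass would extend XPathEntityProcessor instead, and the row limit here is only an illustrative parameter.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Stand-in for the one part of the processor contract used here:
// nextRow() returns the next row, or null when there are no more rows.
interface RowProcessor {
    Map<String, Object> nextRow();
}

// Hypothetical stand-in for XPathEntityProcessor, backed by an in-memory list.
class ListRowProcessor implements RowProcessor {
    private final Iterator<Map<String, Object>> rows;

    ListRowProcessor(List<Map<String, Object>> rows) {
        this.rows = rows.iterator();
    }

    @Override
    public Map<String, Object> nextRow() {
        return rows.hasNext() ? rows.next() : null;
    }
}

// The technique from the reply: extend the processor, count rows,
// and return null once the limit has been reached.
class LimitedRowProcessor extends ListRowProcessor {
    private final int maxRows;
    private int emitted = 0;

    LimitedRowProcessor(List<Map<String, Object>> rows, int maxRows) {
        super(rows);
        this.maxRows = maxRows;
    }

    @Override
    public Map<String, Object> nextRow() {
        if (emitted >= maxRows) {
            return null; // signal "no more rows" once the limit is hit
        }
        Map<String, Object> row = super.nextRow();
        if (row != null) {
            emitted++;
        }
        return row;
    }
}

class LimitedRowDemo {
    // Builds `total` fake rows and counts how many the limited processor emits.
    static int countRows(int total, int limit) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (int i = 0; i < total; i++) {
            Map<String, Object> row = new HashMap<>();
            row.put("pmid", String.valueOf(i));
            rows.add(row);
        }
        RowProcessor processor = new LimitedRowProcessor(rows, limit);
        int count = 0;
        while (processor.nextRow() != null) {
            count++;
        }
        return count;
    }
}
```

Calling LimitedRowDemo.countRows(250, 100) drives the processor until it returns null. Note that the counter lives in the processor instance, so a real subclass would probably need to reset it whenever the processor is re-initialised for the next file.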

On Tue, Jun 24, 2008 at 10:23 AM, mike segv <[EMAIL PROTECTED]> wrote:
>
> That fixed it.
>
> If I'm inserting millions of documents, how do I control docs/update? E.g.
> if there are 50K docs per file, I'm thinking that I should probably code up
> my own DataSource that allows me to stipulate docs/update. Like say, 100
> instead of 50K. Does this make sense?
>
> Mike
>
>
> Noble Paul നോബിള് नोब्ळ् wrote:
>>
>> hi ,
>> You have not registered any datasources . the second entity needs a
>> datasource.
>> Remove the dataSource="null" and add a name for the second entity
>> (good practice). No need for baseDir attribute for second entity .
>> See the modified xml added below
>> --Noble
>>
>> <dataConfig>
>>   <dataSource type="FileDataSource"/>
>>   <document>
>>     <entity name="f" processor="FileListEntityProcessor" fileName=".*xml"
>>             newerThan="'NOW-10DAYS'" recursive="true" rootEntity="false"
>>             dataSource="null" baseDir="/san/tomcat-services/solr-medline">
>>       <entity name="x" processor="XPathEntityProcessor"
>>               forEach="/MedlineCitation"
>>               url="${f.fileAbsolutePath}" >
>>         <field column="pmid" xpath="/MedlineCitation/PMID"/>
>>       </entity>
>>     </entity>
>>   </document>
>> </dataConfig>
>>
>> On Tue, Jun 24, 2008 at 6:39 AM, mike segv <[EMAIL PROTECTED]> wrote:
>>>
>>> I'm trying to use the fileListEntityProcessor to add some xml documents
>>> to a
>>> solr index. I'm running a nightly version of solr-1.3 with SOLR-469 and
>>> SOLR-563. I've been able to successfuly run the slashdot httpDataSource
>>> example. My data-config.xml file loads without errors. When I attempt
>>> the
>>> full-import command I get the exception below. Thanks for any help.
>>>
>>> Mike
>>>
>>> WARNING: No lockType configured for
>>> /san/tomcat-services/solr-medline/solr/data/index/ assuming 'simple'
>>> Jun 23, 2008 7:59:49 PM org.apache.solr.handler.dataimport.DataImporter
>>> doFullImport
>>> SEVERE: Full Import failed
>>> java.lang.RuntimeException: java.lang.NullPointerException
>>>     at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:97)
>>>     at org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:212)
>>>     at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:166)
>>>     at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:149)
>>>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:286)
>>>     at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:312)
>>>     at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:179)
>>>     at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:140)
>>>     at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:335)
>>>     at org.apache.solr.handler.dataimport.DataImporter.rumCmd(DataImporter.java:386)
>>>     at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>>> Caused by: java.lang.NullPointerException
>>>     at java.io.Reader.<init>(Reader.java:61)
>>>     at java.io.BufferedReader.<init>(BufferedReader.java:76)
>>>     at com.bea.xml.stream.MXParser.checkForXMLDecl(MXParser.java:775)
>>>     at com.bea.xml.stream.MXParser.setInput(MXParser.java:806)
>>>     at com.bea.xml.stream.MXParserFactory.createXMLStreamReader(MXParserFactory.java:261)
>>>     at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:93)
>>>     ... 10 more
>>>
>>> Here is my data-config:
>>>
>>> <dataConfig>
>>>   <document>
>>>     <entity name="f" processor="FileListEntityProcessor" fileName=".*xml"
>>>             newerThan="'NOW-10DAYS'" recursive="true" rootEntity="false"
>>>             dataSource="null" baseDir="/san/tomcat-services/solr-medline">
>>>       <entity processor="XPathEntityProcessor" forEach="/MedlineCitation"
>>>               url="${f.fileAbsolutePath}" dataSource="null">
>>>         <field column="pmid" xpath="/MedlineCitation/PMID"/>
>>>       </entity>
>>>     </entity>
>>>   </document>
>>> </dataConfig>
>>>
>>> And a snippet from an xml file:
>>> <MedlineCitation Owner="PIP" Status="MEDLINE">
>>> <PMID>12236137</PMID>
>>> <DateCreated>
>>> <Year>1980</Year>
>>> <Month>01</Month>
>>> <Day>03</Day>
>>> </DateCreated>
>>>
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Attempting-dataimport-using-FileListEntityProcessor-tp18081671p18081671.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>>
>>
>>
>>
>> --
>> --Noble Paul
>>
>>
>
> --
> View this message in context:
> http://www.nabble.com/Attempting-dataimport-using-FileListEntityProcessor-tp18081671p18083747.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>

--
--Noble Paul