I apologize for spamming this mailing list with my problems, but I'm at my wits' end. I'll get right to the point.

I have an XML file of ~1GB which I wish to index. If that is successful, I will move on to a larger file of closer to 20GB. However, when I run my data-config (let's call it dc.xml) over it, the import only manages to get about 27 rows out of roughly 200K. The exact same data-config (dc.xml) works perfectly on smaller data files of the same type.

This data-config is quite large, maybe 250 fields. When I run a smaller data-config (let's call it sdc.xml) over the 1GB file, sdc.xml works perfectly. The only conclusion I can draw from this is that the data-config method just doesn't scale well.

When dc.xml fails, the server logs spit out:

Sep 30, 2008 11:40:18 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=95
Sep 30, 2008 11:40:18 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
INFO: Starting Full Import
Sep 30, 2008 11:40:18 AM org.apache.solr.update.DirectUpdateHandler2 deleteAll
INFO: [] REMOVING ALL DOCUMENTS FROM INDEX
Sep 30, 2008 11:40:20 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
SEVERE: Full Import failed
java.util.ConcurrentModificationException
        at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
        at java.util.AbstractList$Itr.next(AbstractList.java:343)
        at org.apache.solr.handler.dataimport.DocBuilder.addFieldValue(DocBuilder.java:402)
        at org.apache.solr.handler.dataimport.DocBuilder.addFields(DocBuilder.java:373)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:304)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:136)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
Sep 30, 2008 11:41:18 AM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=77
Sep 30, 2008 11:41:18 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
INFO: Starting Full Import
Sep 30, 2008 11:41:18 AM org.apache.solr.update.DirectUpdateHandler2 deleteAll
INFO: [] REMOVING ALL DOCUMENTS FROM INDEX
Sep 30, 2008 11:41:19 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
SEVERE: Full Import failed
java.util.ConcurrentModificationException
        at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
        at java.util.AbstractList$Itr.next(AbstractList.java:343)
        at org.apache.solr.handler.dataimport.DocBuilder.addFieldValue(DocBuilder.java:402)
        at org.apache.solr.handler.dataimport.DocBuilder.addFields(DocBuilder.java:373)
        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:304)
        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:136)
        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)

These exceptions DO NOT occur when I perform the same full-import with sdc.xml. As far as I can tell, the only difference between the two files is the number of fields they contain.

Any guidance or information would be greatly appreciated.

Kyle

PS The schema.xml in use specifies almost all fields as multiValued, and has a copyField for almost every field. I can change this if it is causing my problem, but I would prefer not to.

--
View this message in context: http://www.nabble.com/Indexing-Large-Files-with-Large-DataImport%3A-Problems-tp19746831p19746831.html
Sent from the Solr - User mailing list archive at Nabble.com.