Yes, this is the most recent version of Solr, stream="true" is set, and the stopwords, lowercase and removeDuplicate filters are applied to all multivalued fields. Could the filters be causing this? I will remove them and see what happens.
Kyle


Shalin Shekhar Mangar wrote:
>
> Hmm, strange.
>
> This is Solr 1.3.0, right? Do you have any transformers applied to these
> multi-valued fields? Do you have stream="true" in the entity?
>
> On Tue, Sep 30, 2008 at 11:01 PM, KyleMorrison <[EMAIL PROTECTED]> wrote:
>
>>
>> I apologize for spamming this mailing list with my problems, but I'm at my
>> wits end. I'll get right to the point.
>>
>> I have an xml file which is ~1GB which I wish to index. If that is
>> successful, I will move to a larger file of closer to 20GB. However, when I
>> run my data-config(let's call it dc.xml) over it, the import only manages to
>> get about 27 rows, out of roughly 200K. The exact same data-config(dc.xml)
>> works perfectly on smaller data files of the same type.
>>
>> This data-config is quite large, maybe 250 fields. When I run a smaller
>> data-config (let's call it sdc.xml) over the 1GB file, the sdc.xml works
>> perfectly. The only conclusion I can draw from this is that the data-config
>> method just doesn't scale well.
>>
>> When the dc.xml fails, the server logs spit out:
>>
>> Sep 30, 2008 11:40:18 AM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=95
>> Sep 30, 2008 11:40:18 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
>> INFO: Starting Full Import
>> Sep 30, 2008 11:40:18 AM org.apache.solr.update.DirectUpdateHandler2 deleteAll
>> INFO: [] REMOVING ALL DOCUMENTS FROM INDEX
>> Sep 30, 2008 11:40:20 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
>> SEVERE: Full Import failed
>> java.util.ConcurrentModificationException
>>         at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
>>         at java.util.AbstractList$Itr.next(AbstractList.java:343)
>>         at org.apache.solr.handler.dataimport.DocBuilder.addFieldValue(DocBuilder.java:402)
>>         at org.apache.solr.handler.dataimport.DocBuilder.addFields(DocBuilder.java:373)
>>         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:304)
>>         at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
>>         at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:136)
>>         at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
>>         at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
>>         at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>> Sep 30, 2008 11:41:18 AM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=77
>> Sep 30, 2008 11:41:18 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
>> INFO: Starting Full Import
>> Sep 30, 2008 11:41:18 AM org.apache.solr.update.DirectUpdateHandler2 deleteAll
>> INFO: [] REMOVING ALL DOCUMENTS FROM INDEX
>> Sep 30, 2008 11:41:19 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
>> SEVERE: Full Import failed
>> java.util.ConcurrentModificationException
>>         at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
>>         at java.util.AbstractList$Itr.next(AbstractList.java:343)
>>         at org.apache.solr.handler.dataimport.DocBuilder.addFieldValue(DocBuilder.java:402)
>>         at org.apache.solr.handler.dataimport.DocBuilder.addFields(DocBuilder.java:373)
>>         at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:304)
>>         at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
>>         at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:136)
>>         at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
>>         at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
>>         at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>>
>> This mass of exceptions DOES NOT occur when I perform the same full-import
>> with sdc.xml. As far as I can tell, the only difference between the two
>> files is the amount of fields they contain.
>>
>> Any guidance or information would be greatly appreciated.
>> Kyle
>>
>>
>> PS The schema.xml in use specifies almost all fields as multivalued, and has
>> a copyfield for almost every field. I can fix this if it is causing my
>> problem, but I would prefer not to.
>> --
>> View this message in context:
>> http://www.nabble.com/Indexing-Large-Files-with-Large-DataImport%3A-Problems-tp19746831p19746831.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
>
>
> --
> Regards,
> Shalin Shekhar Mangar.
>
>

--
View this message in context: http://www.nabble.com/Indexing-Large-Files-with-Large-DataImport%3A-Problems-tp19746831p19749039.html
Sent from the Solr - User mailing list archive at Nabble.com.
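
For anyone reading the trace in this thread: the exception comes from java.util.AbstractList$Itr.next, which is the fail-fast check that fires when a list is structurally changed while something is iterating over it. The sketch below only illustrates that general Java mechanism. It is not the DataImportHandler source, and it does not confirm that the filters or the multivalued fields are what modifies the list inside DocBuilder. The class and method names (CmeSketch, modifyWhileIterating, modifyUsingSnapshot) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class; not part of Solr.
class CmeSketch {

    // Appends to the list while a for-each loop is iterating over it.
    // Returns true if the iterator noticed the change and threw.
    static boolean modifyWhileIterating(List<String> values) {
        try {
            for (String v : values) {
                values.add(v + "-copy"); // structural change mid-iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Iterates over a snapshot copy, so appending to the original is safe.
    static List<String> modifyUsingSnapshot(List<String> values) {
        for (String v : new ArrayList<>(values)) {
            values.add(v + "-copy");
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(Arrays.asList("x", "y"));
        System.out.println("threw CME: " + modifyWhileIterating(a));

        List<String> b = new ArrayList<>(Arrays.asList("x", "y"));
        System.out.println("snapshot result: " + modifyUsingSnapshot(b));
    }
}
```

The same exception can also appear when a second thread changes the list during iteration, so the trace alone does not say which of the two cases applies here.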