Yes, this is the most recent version of Solr. The entity has stream="true", and
stopwords, lowercase and removeDuplicate filters are applied to all multivalued
fields. Could the filters be causing this? I will remove them and see what
happens.

Kyle
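
For reference, the quoted trace shows the exception being raised from
AbstractList$Itr.next inside DocBuilder.addFieldValue. In general, Java throws
ConcurrentModificationException when a list is structurally modified while an
iterator over it is still in use. The sketch below is plain Java, not Solr
code, and only illustrates that general mechanism; whether a filter, a
transformer or something else modifies the list in this import is not
established in this thread. The class and method names (CmeSketch,
failsWhenModifiedDuringIteration, appendLowercased) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class; it is NOT part of Solr or the DataImportHandler.
class CmeSketch {

    // Appends a lower-cased copy of each value while iterating the same list.
    // The add() is a structural change, so the fail-fast iterator throws on
    // its next call to next().
    static boolean failsWhenModifiedDuringIteration(List<String> values) {
        try {
            for (String v : values) {
                values.add(v.toLowerCase());
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Same transformation, but iterating over a snapshot copy, so the live
    // list can be modified without invalidating the iterator.
    static List<String> appendLowercased(List<String> values) {
        for (String v : new ArrayList<>(values)) {
            values.add(v.toLowerCase());
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> a = new ArrayList<>(Arrays.asList("Alpha", "Beta"));
        System.out.println("modified during iteration, failed: "
                + failsWhenModifiedDuringIteration(a));

        List<String> b = new ArrayList<>(Arrays.asList("Alpha", "Beta"));
        System.out.println("iterating over a snapshot: " + appendLowercased(b));
    }
}
```

The exception does not require multiple threads; a single thread that adds to
a list inside its own for-each loop is enough to trigger it.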


Shalin Shekhar Mangar wrote:
> 
> Hmm, strange.
> 
> This is Solr 1.3.0, right? Do you have any transformers applied to these
> multi-valued fields? Do you have stream="true" in the entity?
> 
> On Tue, Sep 30, 2008 at 11:01 PM, KyleMorrison <[EMAIL PROTECTED]> wrote:
> 
>>
>> I apologize for spamming this mailing list with my problems, but I'm at my
>> wits' end. I'll get right to the point.
>>
>> I have an XML file of ~1GB which I wish to index. If that is successful, I
>> will move to a larger file of closer to 20GB. However, when I run my
>> data-config (let's call it dc.xml) over it, the import only manages to get
>> about 27 rows, out of roughly 200K. The exact same data-config (dc.xml)
>> works perfectly on smaller data files of the same type.
>>
>> This data-config is quite large, maybe 250 fields. When I run a smaller
>> data-config (let's call it sdc.xml) over the 1GB file, it works
>> perfectly. The only conclusion I can draw from this is that the data-config
>> method just doesn't scale well.
>>
>> When the dc.xml fails, the server logs spit out:
>>
>> Sep 30, 2008 11:40:18 AM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=95
>> Sep 30, 2008 11:40:18 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
>> INFO: Starting Full Import
>> Sep 30, 2008 11:40:18 AM org.apache.solr.update.DirectUpdateHandler2 deleteAll
>> INFO: [] REMOVING ALL DOCUMENTS FROM INDEX
>> Sep 30, 2008 11:40:20 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
>> SEVERE: Full Import failed
>> java.util.ConcurrentModificationException
>>        at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
>>        at java.util.AbstractList$Itr.next(AbstractList.java:343)
>>        at org.apache.solr.handler.dataimport.DocBuilder.addFieldValue(DocBuilder.java:402)
>>        at org.apache.solr.handler.dataimport.DocBuilder.addFields(DocBuilder.java:373)
>>        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:304)
>>        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
>>        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:136)
>>        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
>>        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
>>        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>> Sep 30, 2008 11:41:18 AM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=77
>> Sep 30, 2008 11:41:18 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
>> INFO: Starting Full Import
>> Sep 30, 2008 11:41:18 AM org.apache.solr.update.DirectUpdateHandler2 deleteAll
>> INFO: [] REMOVING ALL DOCUMENTS FROM INDEX
>> Sep 30, 2008 11:41:19 AM org.apache.solr.handler.dataimport.DataImporter doFullImport
>> SEVERE: Full Import failed
>> java.util.ConcurrentModificationException
>>        at java.util.AbstractList$Itr.checkForComodification(AbstractList.java:372)
>>        at java.util.AbstractList$Itr.next(AbstractList.java:343)
>>        at org.apache.solr.handler.dataimport.DocBuilder.addFieldValue(DocBuilder.java:402)
>>        at org.apache.solr.handler.dataimport.DocBuilder.addFields(DocBuilder.java:373)
>>        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:304)
>>        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
>>        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:136)
>>        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
>>        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
>>        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>>
>> This mass of exceptions DOES NOT occur when I perform the same full-import
>> with sdc.xml. As far as I can tell, the only difference between the two
>> files is the number of fields they contain.
>>
>> Any guidance or information would be greatly appreciated.
>> Kyle
>>
>>
>> PS The schema.xml in use specifies almost all fields as multivalued, and has
>> a copyField for almost every field. I can fix this if it is causing my
>> problem, but I would prefer not to.
>> --
>> View this message in context:
>> http://www.nabble.com/Indexing-Large-Files-with-Large-DataImport%3A-Problems-tp19746831p19746831.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Indexing-Large-Files-with-Large-DataImport%3A-Problems-tp19746831p19749039.html
Sent from the Solr - User mailing list archive at Nabble.com.
