Re: Indexing large text documents

2010-01-05 Thread Grant Ingersoll
I haven't tried it, but you might be able to use either of these (and this is just me thinking aloud):
- DataImportHandler with the FileEntityProcessor
- Remote Streaming (you might have to write out Solr XML or do something else)

-Grant

On Jan 5, 2010, at 4:05 AM, Mark N wrote:
> SolrInputDocument doc1
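One way to read the "write out Solr XML" remark is to stream the text file into a Solr XML add message on disk, so the content never has to sit in a single String. The following is only a sketch under that assumption: the class name SolrXmlWriter and the method writeAddDoc are made up for illustration, only the standard <add><doc><field> update format is taken from Solr, and characters that are illegal in XML (most control characters) are not filtered here.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

// Hypothetical helper, not part of Solr or SolrJ.
final class SolrXmlWriter {

    // Copies everything from 'in' into one <field> of a Solr XML add message,
    // escaping XML special characters on the fly so memory use stays constant.
    static void writeAddDoc(String fieldName, Reader in, Writer out) throws IOException {
        out.write("<add><doc><field name=\"");
        for (int i = 0; i < fieldName.length(); i++) {
            writeEscaped(fieldName.charAt(i), out);
        }
        out.write("\">");
        char[] buf = new char[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            for (int i = 0; i < n; i++) {
                writeEscaped(buf[i], out);
            }
        }
        out.write("</field></doc></add>");
    }

    // Escapes the characters that must not appear raw in XML text or attributes.
    private static void writeEscaped(char c, Writer out) throws IOException {
        switch (c) {
            case '&': out.write("&amp;"); break;
            case '<': out.write("&lt;"); break;
            case '>': out.write("&gt;"); break;
            case '"': out.write("&quot;"); break;
            default: out.write(c);
        }
    }
}
```

In practice the Reader would wrap the text file and the Writer would wrap a buffered file output; the resulting XML file is what remote streaming would then be pointed at. Whether Solr copes well with a single ~2 GB field value is a separate question that this sketch does not answer.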

Re: Indexing large text documents

2010-01-05 Thread Glen Newton
(In Lucene) I break the document into smaller pieces, then add each piece to the Document's field in a loop. This seems to work better, but it interferes with analysis details such as term offsets. This should work in your example. In Lucene, you can also add the field using a Reader to the file in question
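The "break it into smaller pieces" idea could look roughly like the sketch below. It uses only the JDK: the class name ChunkedText, the method forEachChunk and the chunk size are all assumptions made for illustration, and the Lucene call that would add each piece to the Document is left as a comment rather than spelled out. Cutting at whitespace keeps words whole, but offsets and positions across pieces can still differ from indexing the text in one go, as noted above.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.function.Consumer;

// Hypothetical helper, not part of Lucene.
final class ChunkedText {

    // Reads from 'reader' and hands pieces of at most maxChars characters to 'sink'.
    // A full buffer is cut after its last whitespace character when there is one,
    // so that a word is not split across two pieces.
    static void forEachChunk(Reader reader, int maxChars, Consumer<String> sink) throws IOException {
        if (maxChars <= 0) {
            throw new IllegalArgumentException("maxChars must be positive");
        }
        char[] buf = new char[maxChars];
        int filled = 0; // number of valid characters currently in buf
        while (true) {
            int n = reader.read(buf, filled, maxChars - filled);
            if (n == -1) {
                break; // end of input
            }
            filled += n;
            if (filled < maxChars) {
                continue; // keep reading until the buffer is full
            }
            int cut = filled; // default: emit the whole buffer
            for (int i = filled - 1; i > 0; i--) {
                if (Character.isWhitespace(buf[i])) {
                    cut = i + 1;
                    break;
                }
            }
            // With Lucene, this is where each piece would be added to the
            // same field of one Document.
            sink.accept(new String(buf, 0, cut));
            // Move the unfinished word to the front of the buffer.
            System.arraycopy(buf, cut, buf, 0, filled - cut);
            filled -= cut;
        }
        if (filled > 0) {
            sink.accept(new String(buf, 0, filled));
        }
    }
}
```

A caller would typically wrap the file in a BufferedReader and pass a chunk size of a few megabytes; the tiny sizes used when testing only make the splitting behaviour easy to see.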

Indexing large text documents

2010-01-05 Thread Mark N
SolrInputDocument doc1 = new SolrInputDocument();
doc1.addField("Fulltext", strContent);

strContent is a String variable that holds the contents of a text file (assume the file is located at c:\files\abc.txt). In my case abc.txt (the text file) could be very large, ~2 GB, so it is not a