On Sat, Jan 24, 2009 at 6:30 PM, Paul Libbrecht <p...@activemath.org> wrote:
> Is it good practice to post large Solr update documents
> (e.g. 100kb-2mb)?
> Will Solr do the necessary tricks to make the field use a Reader instead of
> strings?

Solr will stream a *document* at a time from the input stream fine, but it can't stream a *field* at a time. The reason is that Lucene's Document class needs all the fields at once when indexing, and there isn't a way to make multiple Readers from a single InputStream (the HTTP POST) without bringing it into memory anyway.

I have thought about being able to specify single fields as different streams though (to be handled via a Reader and never brought entirely into memory)... perhaps something like a &fieldstream.myfieldname=URL_to_resource

-Yonik
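
To make the per-field stream idea above concrete, here is a minimal sketch in plain Java. It is not Solr code, and Solr does not ship such a parameter as far as this thread shows. The class `FieldStreams`, its methods, and the handling of the `fieldstream.` prefix are all hypothetical names invented for illustration. The sketch only shows that each field can get its own Reader from its own URL and be consumed in fixed-size chunks, so no field value is ever held in memory as a single String.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr or Lucene.
final class FieldStreams {
    // Assumed parameter prefix, taken from the "fieldstream.myfieldname" idea.
    static final String PREFIX = "fieldstream.";

    // Picks out "fieldstream.<name>=<url>" parameters and returns name -> url.
    static Map<String, String> fieldUrls(Map<String, String> params) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : params.entrySet()) {
            String key = e.getKey();
            if (key.startsWith(PREFIX) && key.length() > PREFIX.length()) {
                out.put(key.substring(PREFIX.length()), e.getValue());
            }
        }
        return out;
    }

    // Opens an independent Reader for one field. Each field has its own
    // underlying stream, so several Readers can exist at the same time.
    // UTF-8 is an assumption of this sketch.
    static Reader open(String url) throws IOException {
        return new BufferedReader(new InputStreamReader(
                URI.create(url).toURL().openStream(), StandardCharsets.UTF_8));
    }

    // Stands in for an indexer: consumes the Reader in 8K-char chunks
    // without ever materializing the whole field value.
    static long countChars(Reader reader) throws IOException {
        char[] buf = new char[8192];
        long total = 0;
        int n;
        while ((n = reader.read(buf)) != -1) {
            total += n;
        }
        return total;
    }
}
```

In a real integration the Reader would presumably be handed to a Lucene field that accepts a Reader rather than counted, but that wiring is left out here because it depends on the Lucene version in use.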