If I understand correctly, you are sending the file to Solr, which then uses the Tika library to do the preprocessing/extraction and stores the results in the defined fields.
If you don't want Solr to do the storing and want to change the extracted fields, just use the Tika library in your client and work with the returned document yourself. This also reduces network load, since you send only the extracted content rather than the whole file over the wire.

Regards,
   Alex.

Personal blog: http://blog.outerthoughts.com/
LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
- Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)

On Fri, Jan 11, 2013 at 3:55 PM, uwe72 <uwe.clem...@exxcellent.de> wrote:
> i have a bit strange usecase.
>
> when i index a pdf to solr i use ContentStreamUpdateRequest.
> The lucene document then contains in the "text" field all containing items
> (the parsed items of the physical pdf).
>
> i also need to add these parsed items to another lucene document.
>
> is there a way, to receive/parse these items just in memory, without
> committing them to lucene?
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/SolrJ-ContentStreamUpdateRequest-Accessing-parsed-items-without-committing-to-solr-tp4032636.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
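
For illustration, the client-side approach suggested in the reply could look roughly like the sketch below. It assumes Apache Tika (tika-core plus the parser modules) is on the client's classpath. The class name `ClientSideExtraction`, the method `extract`, and the use of "text" as the key for the body content are made up for this example and are not part of Solr or Tika. The file is parsed in memory only, and nothing is sent to or committed in Solr.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.sax.BodyContentHandler;

// Hypothetical helper: extracts text and metadata on the client with Tika,
// without involving Solr at all.
public class ClientSideExtraction {

    // Parses the file in memory and returns the body text (under the
    // illustrative key "text") together with the metadata Tika reports.
    static Map<String, String> extract(Path file) throws Exception {
        AutoDetectParser parser = new AutoDetectParser();
        // -1 disables the handler's default limit on the amount of extracted text.
        BodyContentHandler handler = new BodyContentHandler(-1);
        Metadata metadata = new Metadata();

        try (InputStream in = Files.newInputStream(file)) {
            parser.parse(in, handler, metadata, new ParseContext());
        }

        Map<String, String> fields = new LinkedHashMap<>();
        for (String name : metadata.names()) {
            // For multi-valued metadata this keeps only the first value.
            fields.put(name, metadata.get(name));
        }
        fields.put("text", handler.toString());
        return fields;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: ClientSideExtraction <file>");
            return;
        }
        Map<String, String> fields = extract(Paths.get(args[0]));
        for (Map.Entry<String, String> e : fields.entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

The returned map can then be copied into as many documents as needed (for example, several `SolrInputDocument` instances) before anything is sent to Solr.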