In a word, no. If you don't store the data it is completely gone with no chance of retrieval.
There are a couple of things to think about, though:

1> The original doc must exist somewhere. Store some kind of URI in Solr that you can use to retrieve the original doc on demand.

2> Go ahead and store the data. Disk space is cheap, and the stored data goes in special files (*.fdt) that have very little impact on either search speed or memory requirements. And the memory requirements can be controlled somewhat with the documentCache, assuming you don't have gigantic docs.

This kind of sidesteps the question of re-extracting the document in Solr on demand and returning the text (which I think is what you're asking). I would definitely avoid doing this even if I knew how. The problem here is that you're making Solr do quite intensive work (Tika extraction) while at the same time serving queries, which has negative performance implications.

If it turns out that you have to do this, consider running Tika in the app layer and doing the extraction on demand there. It's not very hard, see:
https://lucidworks.com/blog/indexing-with-solrj/
and ignore the db bits.

Best,
Erick

On Thu, Jul 9, 2015 at 7:53 PM, trung.ht <trung...@anlab.vn> wrote:
> Hi everyone,
>
> I use solr to index and search in office file (docx, pptx, ...). To reduce
> the size of solr index, I do not store the content of the file on solr,
> however now my customer want to preview the content of the file.
>
> I have read the document of ExtractingRequestHandler, but it seems that to
> return content in the response from solr, the only option is to
> set extractOnly=true, but in that case, solr would not index the file.
>
> My question is: is there anyway for solr to extract the content from tika,
> index the content (without storing it) and then give me the content in the
> response?
>
> Thanks in advanced and sorry because my explanation is confusing.
>
> Trung.
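
For what it's worth, here is a rough sketch of the "store a URI, extract on demand in the app layer" idea from the reply. It is not Tika: as a stand-in it uses only Python's standard library and handles .docx files only (a .docx is a zip archive whose body text lives in word/document.xml). The names `extract_docx_text` and `preview` are made up for illustration, and the sketch assumes the URI stored in Solr is a local file path. A real setup would call Tika for the other formats (pptx, ...).

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx file.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_docx_text(data: bytes) -> str:
    """Return the plain text of a .docx file, one line per paragraph."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph keeps its text in one or more <w:t> elements.
        text = "".join(node.text or "" for node in para.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return "\n".join(paragraphs)


def preview(doc_uri: str, max_chars: int = 500) -> str:
    """Hypothetical preview helper; doc_uri is the value stored in Solr.

    Assumption: the URI is a local file path. Replace the open() call
    with whatever fetches the original document in your environment.
    """
    with open(doc_uri, "rb") as handle:
        return extract_docx_text(handle.read())[:max_chars]
```

The search request then only needs to return the stored URI field, and the preview is built outside Solr, so the extraction cost never lands on the query-serving nodes.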