Re: Stored vs non-stored very large text fields

2014-05-05 Thread Jochen Barth
I found out that "storing" documents as separate docs+id does not help either. You must have a completely separate collection/core to get things to work fast. Kind regards, Jochen Quoting Jochen Barth: Ok, https://wiki.apache.org/solr/SolrPerformanceFactors states that: "Retrieving the
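A minimal sketch of the two-core lookup described above, assuming one core that is only searched and a second core that holds the large stored text. The base URL, the core names `search` and `fulltext`, and the field names `id` and `ocr` are hypothetical and not taken from the thread; only the standard `select` parameters `q`, `fl`, `rows` and `wt` are used. The code builds the request URLs and does not contact a server.

```python
from urllib.parse import urlencode

SOLR = "http://localhost:8983/solr"  # assumed base URL
SEARCH_CORE = "search"               # hypothetical core: indexed text, only small stored fields
FULLTEXT_CORE = "fulltext"           # hypothetical core: the large stored OCR text


def search_url(query, rows=10):
    """URL for the search step: ask the search core for matching ids only."""
    params = {"q": query, "fl": "id", "rows": rows, "wt": "json"}
    return "%s/%s/select?%s" % (SOLR, SEARCH_CORE, urlencode(params))


def fulltext_url(doc_id):
    """URL for the fetch step: load the stored text of one document by id."""
    # Note: doc_id is not escaped for the query syntax here; ids containing
    # double quotes or backslashes would need escaping first.
    params = {"q": 'id:"%s"' % doc_id, "fl": "id,ocr", "rows": 1, "wt": "json"}
    return "%s/%s/select?%s" % (SOLR, FULLTEXT_CORE, urlencode(params))


print(search_url("ocr:gutenberg"))
print(fulltext_url("doc-42"))
```

With this split, the search request never touches the stored-fields file that contains the large text; the second request is issued only for the documents that are actually displayed.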

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
Ok, https://wiki.apache.org/solr/SolrPerformanceFactors states that: "Retrieving the stored fields of a query result can be a significant expense. This cost is affected largely by the number of bytes stored per document--the higher byte count, the sparser the documents will be distributed o

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
Something is really strange here: even when configuring fields id + sort_... to docValues="true" -- so there's nothing to get from "stored documents file" -- performance is still terrible with ocr stored=true _even_ with my patch which stores uncompressed like solr4.0.0 (checked with string

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
Dear Shawn, see attachment for my first "brute force" no-compression attempt. Kind regards, Jochen Quoting Shawn Heisey: On 4/29/2014 4:20 AM, Jochen Barth wrote: BTW: stored field compression: are all "stored fields" within a document put into one compressed chunk, or by per-field b

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Shawn Heisey
On 4/29/2014 4:20 AM, Jochen Barth wrote: > BTW: stored field compression: > are all "stored fields" within a document put into one compressed chunk, > or on a per-field basis? Here's the issue that added the compression to Lucene: https://issues.apache.org/jira/browse/LUCENE-4226 It was made
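To illustrate the question above, here is a toy sketch of chunk-wise compression, on the assumption that all stored fields of a document are serialized together and compressed in a shared chunk. It is not Lucene's code: `zlib` stands in for the real compressor, JSON stands in for the binary stored-fields format, and the field names `id` and `ocr` are made up for the example.

```python
import json
import zlib


def compress_chunk(docs):
    """Serialize every stored field of every document into one compressed buffer."""
    raw = json.dumps(docs).encode("utf-8")
    return zlib.compress(raw)


def read_field(chunk, doc_index, field):
    """Return one field value and the number of bytes decompressed to reach it."""
    raw = zlib.decompress(chunk)  # the whole chunk is inflated, not just one field
    docs = json.loads(raw.decode("utf-8"))
    return docs[doc_index][field], len(raw)


# Four toy documents: a tiny id next to a large OCR text field.
docs = [{"id": "doc-%d" % i, "ocr": "lorem ipsum " * 1000} for i in range(4)]
chunk = compress_chunk(docs)

value, bytes_touched = read_field(chunk, 2, "id")
print(value, bytes_touched)
```

Under this assumption, reading a short `id` still pays for inflating the large neighbouring text, which is one possible reason why a large stored field slows down requests that never return it.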

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
BTW: stored field compression: are all "stored fields" within a document put into one compressed chunk, or on a per-field basis? Kind regards, J. Barth > > Regards, > Alex. > Personal website: http://www.outerthoughts.com/ > Current project: http://www.solr-start.com/ - Accelerating your

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Jochen Barth
On 29.04.2014 11:19, Alexandre Rafalovitch wrote: > A couple of random thoughts: > 1) The latest (4.8) Solr has support for nested documents, as well as > for expand components. Maybe that will let you have a more efficient > architecture: http://heliosearch.org/expand-block-join/ Yes, I've seen thi

Re: Stored vs non-stored very large text fields

2014-04-29 Thread Alexandre Rafalovitch
A couple of random thoughts: 1) The latest (4.8) Solr has support for nested documents, as well as for expand components. Maybe that will let you have a more efficient architecture: http://heliosearch.org/expand-block-join/ 2) Do you return OCR text to the client? Or just search it? If just search it,