Thanks Erick, that's great advice as always, and it's very much appreciated. I've never seen an example of that pattern (stored=false, indexed=false, useDocValuesAsStored=true) on any of the fantastic Solr blogs I've read, and I've read a lot of them many times (all of your excellent Lucidworks posts, Yonik's personal blog, Rafal's Sematext posts, Tote from the RDL, and most of the Lucene/Solr Revolution YouTube clips, etc.). I found the relevant section in the Solr documentation, and it's really a little gem of a pattern. I did not know docValues could provide that feature; I've always believed I needed stored=true if I required the fields returned.
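
A minimal sketch of that pattern, applied to the export copies from the original question quoted below. The field names (First_Names_Str, Last_Names_Str) are reused from that question; treating them as plain string fields with no searching over them is an assumption taken from the question, not something confirmed later in the thread. The text_general fields used for highlighting are left as they were, since analysed text fields of that type do not carry docValues.

```xml
<!-- Sketch only: export copies served from docValues instead of stored fields. -->
<!-- Not indexed and not stored; values are returned from docValues. -->
<field name="First_Names_Str" type="string" indexed="false" stored="false"
       docValues="true" useDocValuesAsStored="true" multiValued="true"/>
<field name="Last_Names_Str" type="string" indexed="false" stored="false"
       docValues="true" useDocValuesAsStored="true" multiValued="true"/>

<!-- Populate the export copies from the highlightable source fields. -->
<copyField source="First_Names" dest="First_Names_Str"/>
<copyField source="Last_Names" dest="Last_Names_Str"/>
```

One thing to check before relying on this: multi-valued docValues come back sorted and de-duplicated rather than in the original insertion order, which may or may not matter for name fields.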
Many thanks once again for taking the time to respond,

Dwane

________________________________
From: Erick Erickson <erickerick...@gmail.com>
Sent: Thursday, 15 November 2018 1:55 PM
To: solr-user
Subject: Re: Exporting results and schema design

Well, docValues doesn't necessarily waste much index space if you don't store the field and useDocValuesAsStored. It also won't beat up your machine as badly if you fetch all your fields from DV fields. To fetch a stored field, you need to
> seek to the stored data on disk
> decompress a 16K block minimum
> fetch the stored fields.

So using docvalues rather than stored for "1000s" of rows will avoid that cycle.

You can use the cursorMark to page efficiently; your middleware would have to be in charge of that.

Best,
Erick

On Wed, Nov 14, 2018 at 6:35 PM Dwane Hall <dwaneh...@hotmail.com> wrote:
>
> Good afternoon Solr community,
>
> I have a situation where I require the following solr features.
>
> 1. Highlighting must be available for the matched search results
>
> 2. After a user performs a regular solr search (/select, rows=10) I require a drill down which has the potential to export a large number of results (1000s +).
>
> Here is how I’ve approached each of the two problems above
>
> 1. Highlighting must be available for the matched search results
>
> My implementation is in pattern with the recommended approach. A stored=false indexed=true copy field with the individual fields to highlight analysed, stored=true, indexed=false.
>
> <field name="First_Names" type="text_general" indexed="false" stored="true" multiValued="true"/>
> <field name="Last_Names" type="text_general" indexed="false" stored="true" multiValued="true"/>
> <copyField source="First_Names" dest="Full_Names"/>
> <copyField source="Last_Names" dest="Full_Names"/>
> <field name="Full_Names" type="text_general" indexed="true" stored="false" multiValued="true"/>
>
> 2. After a user performs a regular solr search (/select, rows=10) I require a drill down which has the potential to export a large number of results (1000s+ with no searching required over the fields)
>
> From all the documentation the recommended approach for returning large result sets is using the /export request handler. As none of my fields qualify for using the /export handler (i.e. docValues=true) is my only option to have additional duplicated fields mapped as strings so they can be used in the export process?
>
> i.e. using my example above my managed-schema now becomes
>
> <field name="First_Names" type="text_general" indexed="false" stored="true" multiValued="true"/>
> <field name="Last_Names" type="text_general" indexed="false" stored="true" multiValued="true"/>
> <copyField source="First_Names" dest="Full_Names"/>
> <copyField source="Last_Names" dest="Full_Names"/>
> <field name="Full_Names" type="text_general" indexed="true" stored="false" multiValued="true"/>
>
> <copyField source="First_Names" dest="First_Names_Str"/>
> <copyField source="Last_Names" dest="Last_Names_Str"/>
> <field name="First_Names_Str" type="string" indexed="false" stored="true" multiValued="true"/>
> <field name="Last_Names_Str" type="string" indexed="false" stored="true" multiValued="true"/>
>
> If I did not require highlighting I could change the initial mapped fields (First_Names, Last_Names) from type=text_general to type=string and save the additional storage in the index but in my current situation I can’t see a way around having to duplicate all the fields required for the /export handler as strings. Is this how people typically handle this problem or am I completely off the mark with my design?
>
> Any advice would be greatly appreciated,
>
> Thanks
>
> Dwane