NP, having something in the manual is A Good Thing, but it's very,
very easy to not find a paragraph in a 1,000+ page doc!

Oh, and FWIW, I recommend downloading the PDF version of the Solr Ref Guide
for your version of Solr so you have a locally searchable reference.

Best,
Erick
On Thu, Nov 15, 2018 at 1:26 AM Dwane Hall <dwaneh...@hotmail.com> wrote:
>
>
> Thanks Erick, that's great advice as always, and it's very much appreciated.
> I've never seen an example of that pattern
> (stored=false, indexed=false, useDocValuesAsStored=true) on any of the
> fantastic Solr blogs I've read, and I've read a lot of them many times (all
> of your excellent Lucidworks posts, Yonik's personal blog, Rafal's Sematext
> posts, Tote from the RDL, and most of the Lucene/Solr Revolution YouTube
> clips, etc.). I found the relevant section in the Solr doco. It's really a
> little gem of a pattern: I did not know docValues could provide that feature,
> and I've always believed I needed stored=true if I required the fields
> returned.
>
> Many thanks once again for taking the time to respond,
>
> Dwane
>
> ________________________________
> From: Erick Erickson <erickerick...@gmail.com>
> Sent: Thursday, 15 November 2018 1:55 PM
> To: solr-user
> Subject: Re: Exporting results and schema design
>
> Well, docValues doesn't necessarily waste much index space if you
> don't store the field and rely on useDocValuesAsStored instead. It also
> won't beat up your machine as badly if you fetch all your values from
> docValues fields. To fetch a stored field, you need to:
>
> - seek to the stored data on disk
> - decompress a 16K block minimum
> - fetch the stored fields.
>
> So using docValues rather than stored fields for "1000s" of rows will avoid
> that cycle.
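>
> As a rough sketch of that pattern, borrowing the First_Names_Str and
> Last_Names_Str names from the question below (treat the exact attributes as
> an assumption to check against the ref guide for your Solr version), the
> export copies could be declared docValues-only:
>
> ```xml
> <!-- Sketch only: export copies kept in docValues, neither indexed nor stored.
>      Field names come from the question below; adjust to your own schema. -->
> <field name="First_Names_Str" type="string" indexed="false" stored="false"
>        docValues="true" useDocValuesAsStored="true" multiValued="true"/>
> <field name="Last_Names_Str" type="string" indexed="false" stored="false"
>        docValues="true" useDocValuesAsStored="true" multiValued="true"/>
> ```
>
> One caveat: values of a multiValued docValues field come back sorted and
> de-duplicated rather than in their original insertion order.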
>
> You can use cursorMark to page efficiently; your middleware would
> have to be in charge of that.
>
> Best,
> Erick
> On Wed, Nov 14, 2018 at 6:35 PM Dwane Hall <dwaneh...@hotmail.com> wrote:
> >
> > Good afternoon Solr community,
> >
> > I have a situation where I require the following solr features.
> >
> > 1. Highlighting must be available for the matched search results.
> >
> > 2. After a user performs a regular Solr search (/select, rows=10), I
> > require a drill-down which has the potential to export a large number of
> > results (1000s+).
> >
> >
> > Here is how I’ve approached each of the two problems above:
> >
> >
> > 1. Highlighting must be available for the matched search results
> >
> > My implementation follows the recommended approach: an analysed copy
> > field (stored=false, indexed=true), with the individual fields to
> > highlight set to stored=true, indexed=false.
> >
> > <field name="First_Names" type="text_general" indexed="false" stored="true" 
> > multiValued="true"/>
> >
> > <field name="Last_Names" type="text_general" indexed="false" stored="true" 
> > multiValued="true"/>
> >
> > <copyField source="First_Names" dest="Full_Names"/>
> >
> > <copyField source="Last_Names" dest="Full_Names"/>
> >
> > <field name="Full_Names" type="text_general" indexed="true" stored="false" 
> > multiValued="true"/>
> >
> >
> > 2. After a user performs a regular Solr search (/select, rows=10), I require
> > a drill-down which has the potential to export a large number of results
> > (1000s+, with no searching required over the fields)
> >
> >
> > According to the documentation, the recommended approach for returning
> > large result sets is the /export request handler. As none of my fields
> > qualify for the /export handler (i.e. docValues=true), is my only
> > option to have additional duplicated fields mapped as strings so they can
> > be used in the export process?
> >
> > i.e. using my example above, managed-schema now becomes:
> >
> > <field name="First_Names" type="text_general" indexed="false" stored="true" 
> > multiValued="true"/>
> >
> > <field name="Last_Names" type="text_general" indexed="false" stored="true" 
> > multiValued="true"/>
> >
> > <copyField source="First_Names" dest="Full_Names"/>
> >
> > <copyField source="Last_Names" dest="Full_Names"/>
> >
> > <field name="Full_Names" type="text_general" indexed="true" stored="false" 
> > multiValued="true"/>
> >
> >
> > <copyField source="First_Names" dest="First_Names_Str"/>
> >
> > <copyField source="Last_Names" dest="Last_Names_Str"/>
> >
> > <field name="First_Names_Str" type="string" indexed="false" stored="true" 
> > multiValued="true"/>
> >
> > <field name="Last_Names_Str" type="string" indexed="false" stored="true" 
> > multiValued="true"/>
> >
> >
> > If I did not require highlighting, I could change the initial mapped fields
> > (First_Names, Last_Names) from type=text_general to type=string and save
> > the additional storage in the index, but in my current situation I can’t
> > see a way around duplicating all the fields required for the /export
> > handler as strings. Is this how people typically handle this problem, or am
> > I completely off the mark with my design?
> >
> >
> > Any advice would be greatly appreciated,
> >
> >
> > Thanks
> >
> >
> > Dwane
