Oh, and to make matters even more "interesting": for
docValues=true fields there's no need to store anything
at all. You can return fields that are docValues=true,
stored=false just by listing them in the fl parameter.
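
A rough sketch of what such a field definition might look like in the
schema. The field name "event_ts" and the "long" type are made up for
illustration, and this assumes a Solr version that supports the
useDocValuesAsStored attribute (as far as I recall that arrived well
after 4.8.1, so check the Reference Guide for your version):

```xml
<!-- schema.xml sketch; "event_ts" and the "long" type are hypothetical -->
<!-- stored=false: no stored-field data is kept for this field;
     the value is read back from docValues instead -->
<field name="event_ts" type="long" indexed="false" stored="false"
       docValues="true" useDocValuesAsStored="true"/>
```

With a definition along those lines, a request such as fl=id,event_ts
should be able to return event_ts even though it is not stored.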
On Tue, Nov 15, 2016 at 1:53 AM, Prateek Jain J
<prateek.j.j...@ericsson.com> wrote:
>
> Thanks a lot Erick
>
>
> Regards,
> Prateek Jain
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: 14 November 2016 09:14 PM
> To: solr-user <solr-user@lucene.apache.org>
> Subject: Re: index and data directories
>
> Theoretically, perhaps. And it's quite true that stored data for fields 
> marked stored=true are just passed through verbatim and compressed on disk 
> while the data associated with indexed=true fields go through an analysis 
> chain and are stored in a much different format. However, these different data 
> are simply stored in files with different suffixes in a segment. So you might 
> have _0.fdx, _0.fdt, _0.tim, _0.tvx etc. that together form a single segment.
>
> This is done on a per-segment basis. Certain segment files, namely the 
> *.fdt and *.fdx files, contain the stored data, while the other extensions 
> hold the indexed data. See "File naming" here for a somewhat out-of-date 
> description of the format that is close enough for this discussion:
> https://lucene.apache.org/core/4_0_0/core/org/apache/lucene/codecs/lucene40/package-summary.html
> There's no option to store the *.fdt and *.fdx files independently from 
> the rest of the segment files.
>
> This statement: "I mean documents which are to be indexed" really doesn't 
> make sense. You send these things called Solr documents to be indexed, but 
> they are just a set of fields with values handled as their definitions 
> indicate (i.e. respecting stored=true|false, indexed=true|false, 
> docValues=true|false). The Solr document sent by SolrJ is simply thrown away 
> after processing into segment files.
>
> If you're sending semi-structured docs (say Word, PDF etc) to be indexed 
> through Tika they are simply transformed into a Solr doc (set of field/value 
> pairs) and the original document is thrown away as well. There's no option to 
> store the original semi-structured doc either.
>
>
> Best,
> Erick
>
> On Mon, Nov 14, 2016 at 12:35 PM, Prateek Jain J 
> <prateek.j.j...@ericsson.com> wrote:
>>
>> By data, I mean documents which are to be indexed. Some fields can be 
>> stored="true" but that doesn’t matter.
>>
>> For example: App1 creates an object (AppObj) to be indexed and sends it to 
>> Solr via SolrJ. Some of the attributes of this object can be declared as 
>> stored.
>>
>> Now, my understanding is data and indexes generated on data are two separate 
>> things. In my particular example, all fields have stored="true" but only 
>> selected fields have indexed="true". My expectation is, indexes are stored 
>> separately from data because indexes can be generated by different 
>> techniques/algorithms but data/documents remain unchanged. Please correct me 
>> if my understanding is not correct.
>>
>>
>> Regards,
>> Prateek Jain
>>
>> -----Original Message-----
>> From: Erick Erickson [mailto:erickerick...@gmail.com]
>> Sent: 14 November 2016 07:05 PM
>> To: solr-user <solr-user@lucene.apache.org>
>> Subject: Re: index and data directories
>>
>> The question is pretty opaque. What do you mean by "data" as opposed to 
>> "indexes"? Are you talking about where Lucene puts stored="true"
>> fields? If not, what do you mean by "data"?
>>
>> If you are talking about where Lucene puts the stored="true" bits, then no, 
>> there's no way to segregate that out from the other files that make up a 
>> segment.
>>
>> Best,
>> Erick
>>
>> On Mon, Nov 14, 2016 at 7:58 AM, Prateek Jain J 
>> <prateek.j.j...@ericsson.com> wrote:
>>>
>>> Hi Alex,
>>>
>>>  I don't quite follow. Is it possible to store indexes and data 
>>> separately?
>>>
>>>
>>> Regards,
>>> Prateek Jain
>>>
>>> -----Original Message-----
>>> From: Alexandre Rafalovitch [mailto:arafa...@gmail.com]
>>> Sent: 14 November 2016 03:53 PM
>>> To: solr-user <solr-user@lucene.apache.org>
>>> Subject: Re: index and data directories
>>>
>>> solr.xml also has a bunch of properties under the core tag:
>>>
>>>   <cores adminPath="/admin/cores">
>>>     <core name="core0" instanceDir="core0">
>>>       <property name="dataDir" value="/data/core0"/>
>>>     </core>
>>>     <core name="core1" instanceDir="core1"/>
>>>   </cores>
>>>
>>> You can get the Reference Guide for your specific version here:
>>> http://archive.apache.org/dist/lucene/solr/ref-guide/
>>>
>>> Regards,
>>>    Alex.
>>> ----
>>> Solr Example reading group is starting November 2016, join us at
>>> http://j.mp/SolrERG
>>> Newsletter and resources for Solr beginners and intermediates:
>>> http://www.solr-start.com/
>>>
>>>
>>> On 15 November 2016 at 02:37, Prateek Jain J <prateek.j.j...@ericsson.com> 
>>> wrote:
>>>>
>>>> Hi All,
>>>>
>>>> We are using Solr 4.8.1 and would like to know whether it is possible to
>>>> store data and indexes in separate directories. I know the following tag
>>>> exists in the solrconfig.xml file:
>>>>
>>>> <!-- Data Directory: Used to specify an alternate directory to hold all
>>>>      index data other than the default ./data under the Solr home. If
>>>>      replication is in use, this should match the replication
>>>>      configuration. -->
>>>> <dataDir>C:/del-it/solr/cm_events_nbi/data</dataDir>
>>>>
>>>>
>>>>
>>>> Regards,
>>>> Prateek Jain
