Stored data is compressed by default; anecdotally, the compression ratio is about 2:1.
But the _other_ reason not to store all the data is that it then gets
replicated. If you have master/slave or SolrCloud with replicas, you have
N copies of your index and each and every one of them has a copy of all
your stored data....

Best,
Erick

On Mon, May 9, 2016 at 6:14 AM, Ali Nazemian <alinazem...@gmail.com> wrote:
> Dear Erick,
> Hi,
> Thank you very much. About the storing part you are right, unless the
> primary datastore uses some kind of data compression which in my case it
> does (I am using Cassandra as a primary datastore), and I am not sure about
> Solr that it has any kind of compression or not.
> According to your reply, it seems that I have to do that in a hard way. I
> mean using the primary datastore to build the index from scratch.
>
> Sincerely,
>
> On Sun, May 8, 2016 at 11:07 PM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> bq: I would be grateful if somebody could introduce other way of
>> re-indexing the whole data without using another datastore
>>
>> Not possible currently. Consider what's _in_ the index when stored="false".
>> The actual terms are the output of the entire analysis chain, including
>> stemming, stopword removal, synonym substitution etc. Since the
>> indexing process is lossy, you simply cannot reconstruct the original
>> stream from the indexed terms.
>>
>> I suppose one _could_ do this in the case of docValues only index with
>> the new return-values-from-docvalues functionality, but even that's lossy
>> because the order of returned values may not be the original insertion
>> order. And if that suits your needs, a pretty simple driver program would
>> suffice.
>>
>> To do this from indexed-only terms you'd have to somehow store the
>> original version of each term or store some codes indicating exactly
>> how to reconstruct the original stream, which very possibly would take
>> up as much space as if you'd just stored the values anyway. _And_ it
>> would burden every one else who didn't want to do this with a bloated
>> index.
>>
>> Best,
>> Erick
>>
>> On Sun, May 8, 2016 at 4:25 AM, Ali Nazemian <alinazem...@gmail.com>
>> wrote:
>> > Dear all,
>> > Hi,
>> > I was wondering, is it possible to re-index Solr 6.0 data in case of
>> > store=false? I am using Solr as a secondary datastore, and for the sake
>> > of space efficiency all the fields (except id) are considered as
>> > store=false.
>> > Currently, due to some changes in application business, Solr schema
>> > should change, and in order to see the effect of changing schema on old
>> > data, I have to do the re-index process. I know that one way of
>> > re-indexing in Solr is reading data from one collection (core) and
>> > inserting that to another one, but this solution is not possible for
>> > store=false fields, and re-indexing the whole data through primary
>> > datastore is kind of costly, so I would be grateful if somebody could
>> > introduce other way of re-indexing the whole data without using another
>> > datastore.
>> >
>> > Sincerely,
>> >
>> > --
>> > A.Nazemian
>
> --
> A.Nazemian
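
For the docValues-only case mentioned in the thread, the "pretty simple driver program" might look roughly like the sketch below. It is not from the original posters and rests on several assumptions: every field you need is either stored or has docValues that Solr returns with fl=*, the uniqueKey field is named "id", and the collection names "old" and "new" and the base URL are placeholders. The function names are hypothetical. It pages through the source collection with cursorMark and posts each page to the target as JSON. As noted in the thread, multi-valued docValues fields may not come back in their original insertion order, so the copy is still lossy in that respect.

```python
import json
import urllib.parse
import urllib.request


def build_select_url(base_url, collection, cursor_mark, rows=500, unique_key="id"):
    """Build a /select URL for one page of a cursorMark deep-paging scan."""
    params = {
        "q": "*:*",
        "fl": "*",                    # assumes docValues fields are returned like stored ones
        "sort": f"{unique_key} asc",  # cursorMark requires a sort on the uniqueKey field
        "rows": rows,
        "cursorMark": cursor_mark,
        "wt": "json",
    }
    return f"{base_url}/{collection}/select?{urllib.parse.urlencode(params)}"


def strip_internal_fields(doc):
    """Drop _version_ so the target collection assigns its own version."""
    return {k: v for k, v in doc.items() if k != "_version_"}


def fetch_json(url):
    """GET a URL and decode the JSON response body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def post_docs(base_url, collection, docs):
    """POST a list of documents to the target collection's /update handler."""
    req = urllib.request.Request(
        f"{base_url}/{collection}/update",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        resp.read()


def copy_collection(base_url, source, target, fetch=fetch_json, post=post_docs):
    """Copy every document from source to target; returns the number copied.

    No commit is issued here: this assumes autoCommit is configured on the
    target, or that you send an explicit commit once the copy finishes.
    """
    cursor = "*"
    copied = 0
    while True:
        page = fetch(build_select_url(base_url, source, cursor))
        docs = [strip_internal_fields(d) for d in page["response"]["docs"]]
        if docs:
            post(base_url, target, docs)
            copied += len(docs)
        next_cursor = page["nextCursorMark"]
        if next_cursor == cursor:  # an unchanged cursor means the scan is done
            break
        cursor = next_cursor
    return copied
```

A call such as copy_collection("http://localhost:8983/solr", "old", "new") would then copy whatever Solr can return from "old" into "new", where the new schema's analysis chain is applied at index time. The fetch and post parameters are only there so the loop can be exercised without a running Solr instance.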