Well, I'd approach either of these use cases by simply performing my computations on the input and storing the result in another field (non-indexed, unless I wanted to search it). This wouldn't happen in the Analyzer, but in the code that populates the document fields.
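A minimal sketch of that approach, using the two use cases from the quoted message below (an md5 of the title and a cheap/expensive flag). The helper `build_document`, the field names `title_md5` and `price_group`, and the plain-dict document are assumptions for illustration only, not part of Solr. In practice the dict would be handed to whatever client code adds documents to the index, and the derived fields would be declared stored (and indexed only if they need to be searchable).

```python
import hashlib

# Threshold taken from the quoted use case: a price above 500 counts as expensive.
EXPENSIVE_THRESHOLD = 500


def build_document(title, label, owner, price):
    """Compute derived values in the indexing client, not in an Analyzer."""
    return {
        "title": title,
        "label": label,
        "owner": owner,
        "price": price,
        # Hypothetical derived field: computed once here, returned as-is in results.
        "title_md5": hashlib.md5(title.encode("utf-8")).hexdigest(),
        # Hypothetical derived field: category decided from the price.
        "price_group": "expensive" if price > EXPENSIVE_THRESHOLD else "cheap",
    }


doc = build_document("The good, the bad and the ugly", "western", "mitch", 650)
```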
That is a much cleaner solution IMO than creating some sort of "index this but store that" capability. The purpose of analysis is to produce *searchable* tokens, after all.

But we're getting into angels dancing on pins here. Do you actually have a use case you're trying to implement, or is this mostly theoretical?

Erick

On Thu, Jan 7, 2010 at 2:08 PM, MitchK <mitc...@web.de> wrote:

>
> The difference between stored and indexed is clear now.
>
> You are right, if you are responding only to "normal users".
>
> Use case:
> You have a stored field "The good, the bad and the ugly".
> And you have a really fantastic analyzer, which does some magic to this
> movie title.
> Let's say the analyzer translates the title into md5 or into another
> abstract expression.
> Instead of applying the same magical function on the client's side again
> and again, the client only needs to take the prepared data from your
> response.
>
> Another use case could be:
> Imagine you have two categories, cheap and expensive, and your document
> has a title, a label, an owner and a price field.
> Imagine you analyze, index and store them like you normally do, and
> afterwards you want to set whether the document belongs to the expensive
> item group or not.
> If the price for the item is higher than $500, it belongs to the expensive
> ones, otherwise not.
> I think this would be a job for a special analyzer, and this only makes
> sense if I also store the analyzed data.
>
> I think information retrieval is a really interesting use case.
>
>
> Erick Erickson wrote:
> >
> > What is your use case for "responding sometimes with the indexed value"?
> > Other than reconstructing a field that hasn't been stored, I can't think
> > of one.
> >
> > I still think you're missing the point. Indexing and storing are
> > orthogonal operations that have (almost) nothing to do with each
> > other, for all that they happen at the same time on the same field.
> >
> > You never search against the stored data in a field. You *always*
> > search against the indexed data.
> >
> > Contrariwise, you never display the indexed form to the user, you
> > *always* show the stored data (unless you come up with
> > a really interesting use case).
> >
> > Step back and consider what happens when you index data:
> > it gets broken up in all kinds of ways. Stop words are removed,
> > case may change, etc, etc, etc. It makes no sense to
> > then display this data to a user. Say you have a movie
> > title "The Good, The Bad, and The Ugly". Remove stopwords
> > and punctuation, lowercase the rest, and you index three
> > tokens: "good", "bad", "ugly".
> > Even if you reconstruct this field, the user would see
> > "good bad ugly". Bad, very bad.
> >
> > Yet I want to display the original title to the user in
> > response to searching on "ugly", so I need the
> > original, unanalyzed data.
> >
> > Perhaps it would help to think of it this way.
> > 1> take some data and index it in f1
> >    but do NOT store it in f1. Store it in f2
> >    but do NOT index it in f2.
> > 2> take that same data, index AND store
> >    it in f3.
> >
> > <1> is almost entirely equivalent to <2>
> > in terms of index resources.
> >
> > Practically though, <1> is harder to use,
> > because you have to remember
> > to use f1 for searching and f2 for getting
> > the raw data.
> >
> > HTH
> > Erick
> >
> > On Thu, Jan 7, 2010 at 12:11 PM, MitchK <mitc...@web.de> wrote:
> >
> >>
> >> Thank you, Ryan. I will have a look at Lucene's material and luke.
> >>
> >> I think I got it. :)
> >>
> >> Sometimes there will be a need to return both the stored value
> >> and the indexed version of the value.
> >> How can I fulfill such needs? Doing copyField on indexed-only fields?
> >>
> >>
> >> ryantxu wrote:
> >> >
> >> >
> >> > On Jan 7, 2010, at 10:50 AM, MitchK wrote:
> >> >
> >> >>
> >> >> Erick,
> >> >>
> >> >> you mean, everything is okay, but I do not see it?
> >> >>
> >> >>>> Internally for searching the analysis takes place and writes to the
> >> >>>> index in an inverted fashion, but the stored stuff is left alone.
> >> >>
> >> >> if I use an analyzer, Solr "stores" its output two ways?
> >> >> One public output, which is similar to the original input
> >> >> and one "hidden" or internal output, which is based on the
> >> >> analyzer's work?
> >> >> Did I understand that right?
> >> >
> >> > yes.
> >> >
> >> > indexed fields and stored fields are different.
> >> >
> >> > Solr results show stored fields in the results (however facets are
> >> > based on indexed fields)
> >> >
> >> > Take a look at Lucene in Action for a better description of what is
> >> > happening. The best tool to get your head around what is happening is
> >> > probably luke (http://www.getopt.org/luke/)
> >> >
> >> >
> >> >>
> >> >> If yes, I have got another problem:
> >> >> I don't want to waste any disk space.
> >> >
> >> > You have control over what is stored and what is indexed -- how that
> >> > is configured is up to you.
> >> >
> >> > ryan
> >> >
> >>
> >> --
> >> View this message in context:
> >> http://old.nabble.com/Custom-Analyzer-Tokenizer-works-but-results-were-not-saved-tp27026739p27063452.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >>