Hi FMC,

On 5/3/2011 at 12:37 PM, FatMan Corp wrote:
> Hi, I would like to get another's field information for the same document
> within a Tokenizer class.
> How can this be achieved?
Use <copyField>s in your schema <http://wiki.apache.org/solr/SchemaXml#Copy_Fields>, and associate a different analysis pipeline with each field. Each field's analysis pipeline will be fed the original raw text.

Presently Lucene's analysis pipeline is single-field only: you have to create a separate analysis pipeline for each field, with an extra pass over the original text for each field.

I personally think Lucene should provide multi-field analysis capabilities, but this would not be a simple change. Even if Lucene does eventually gain this capability, modifying Solr to expose it would be an added layer of complexity, and given that <copyField> already exists as a workaround, there may be little motivation to do so.

Some of the use cases that full multi-field analysis could serve are already handled in Lucene (but not yet in Solr) by TeeSinkTokenFilter <http://lucene.apache.org/java/3_1_0/api/core/org/apache/lucene/analysis/TeeSinkTokenFilter.html>. An enterprising Lucene user could write a single-pass tokenizer that emits tokens with one type per target field, then employ one TeeSinkTokenFilter per field to approximate full multi-field analysis. Adding TeeSinkTokenFilter support to Solr, though, would require substantial changes to Solr's code and schema format (schema schema?).

Steve

> -----Original Message-----
> From: FatMan Corp [mailto:fatmanc...@gmail.com]
> Sent: Tuesday, May 03, 2011 12:37 PM
> To: solr-user@lucene.apache.org
> Subject: Getting field information inside a Tokenizer
>
> Hi, I would like to get another's field information for the same document
> within a Tokenizer class.
> How can this be achieved?
>
> Thanks
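
As a rough illustration of the <copyField> suggestion in the reply above, here is a minimal schema.xml sketch. The field names (title, title_exact) and type names (text_ws, text_exact) are hypothetical and chosen only for illustration. The tokenizer and filter factories are stock Solr classes, but the right analysis chain will depend on your data.

```xml
<!-- Fragment of schema.xml; field and type names are hypothetical. -->
<types>
  <!-- Pipeline 1: split on whitespace, then lowercase each token. -->
  <fieldType name="text_ws" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

  <!-- Pipeline 2: keep the whole input as a single token. -->
  <fieldType name="text_exact" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
    </analyzer>
  </fieldType>
</types>

<fields>
  <field name="title" type="text_ws" indexed="true" stored="true"/>
  <field name="title_exact" type="text_exact" indexed="true" stored="false"/>
</fields>

<!-- Feed the raw, unanalyzed text of "title" to "title_exact" as well. -->
<copyField source="title" dest="title_exact"/>
```

With a setup along these lines, each field runs its own analysis pipeline over the same raw input, which is the extra pass per field mentioned above. Neither pipeline can see the other field's tokens.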