Re: Single multilingual field analyzed based on other field values

davetroiano Tue, 29 Oct 2013 09:27:13 -0700

Hi Trey,

I was reading v9 of the Solr in Action MEAP but browsing your GitHub repo,
so I think I'm looking at the latest stuff.


Agreed that the thread caching idea is dangerous.  Perhaps it would work
now, but it could easily break in a later version of Solr.

There's another reason I'd like to analyze based on other field values that
I didn't mention: I'd like to be able to run analyzers on sub-sections of
the MultiTextField.  For example, given a multilingual document, I'd run my
text_english analyzer on the first half and my text_french analyzer on the
second half.  Of course, I could extend the prepend approach to take start
and end offsets (e.g., <field
name="myField">[en_0_1000,fr_1001_2500|]blah, blah, ...</field>).  If it
were possible, though, I'd rather grab that data from another field and
simplify the tokenizer: less string manipulation, and no need to adjust
position offsets to ignore the prepended data (though you've already done
the tricky part).
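
To make the offset idea concrete, here is a minimal sketch of how a prefix
like [en_0_1000,fr_1001_2500|] might be parsed.  It is plain Java with no
Solr or Lucene classes, and everything in it is hypothetical: the names
(LangRangePrefix, Span, langAt, prefixLength) are mine, and I'm assuming that
the offsets are inclusive, that they are relative to the text following the
prefix, and that language codes contain no underscores.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: parses "[en_0_1000,fr_1001_2500|]text..." into spans.
final class LangRangePrefix {

    // One language span. Assumption: start and end are inclusive character
    // offsets into the text that follows the prefix.
    static final class Span {
        final String lang;
        final int start;
        final int end;

        Span(String lang, int start, int end) {
            this.lang = lang;
            this.start = start;
            this.end = end;
        }
    }

    final List<Span> spans = new ArrayList<>();

    // Number of leading characters occupied by the prefix (0 if none).
    // A tokenizer would subtract this from raw offsets to ignore the prefix.
    final int prefixLength;

    LangRangePrefix(String value) {
        int close = value.startsWith("[") ? value.indexOf("|]") : -1;
        if (close < 0) {
            prefixLength = 0; // no prefix: the whole value is content
            return;
        }
        for (String entry : value.substring(1, close).split(",")) {
            String[] parts = entry.trim().split("_");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Bad span: " + entry);
            }
            spans.add(new Span(parts[0],
                    Integer.parseInt(parts[1]),
                    Integer.parseInt(parts[2])));
        }
        prefixLength = close + 2; // skip past the closing "|]"
    }

    // Language covering the given content offset, or null if none does.
    String langAt(int offset) {
        for (Span s : spans) {
            if (offset >= s.start && offset <= s.end) {
                return s.lang;
            }
        }
        return null;
    }
}
```

A tokenizer could call langAt with each token's start offset to choose the
analyzer.  This is exactly the string handling I'd like to avoid by reading
the ranges from another field instead.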

Based on what I'm seeing on the message boards and JIRA (e.g., SOLR-1536 /
SOLR-1327 not being fixed), it seems like there isn't a clean way to run
analyzers dynamically based on data in other field(s).  If I end up trying
the caching idea, I'll report my findings here.

Thanks,
Dave



