Re: Can I use per field analyzers and dynamic fields?

Paolo Castagna Wed, 12 May 2010 23:49:46 -0700

Chris Hostetter wrote:

: However, I'd like to hear a comment on the approach of doing the parsing
: using Lucene and then constructing a SolrQuery from a Lucene Query:
I believe you are asking about doing this in the client code? Using the Lucene QueryParser to parse a string using an analyzer, then toString'ing that and sending it across the wire to Solr?


Yes.

I would strongly advise against it.


Thank you.

Query.toString() is intended purely as a debugging tool, not as a serialization mechanism. It's very possible for the toString() value of a query to not be useful in attempting to recreate the query -- particularly if the analyzer being used by Solr for the "re-parse" doesn't know to expect terms that have already been stemmed, or modified in the various ways the client may have done so (and if you have to go to all that work to make Solr know about what you've pre-analyzed, why not just let Solr do it for you?)
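
To make the re-parse problem concrete, here is a minimal sketch in plain Java. It does not use the Lucene or Solr APIs: the `analyze` method is a hypothetical toy analyzer (lowercase, split on whitespace, strip one trailing "ing") standing in for whatever stemming the client and the server might each apply. It only illustrates that analysis is generally not idempotent, so already-analyzed terms can be changed again on the server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

final class DoubleAnalysisDemo {
    // Toy stand-in for an analyzer (an assumption, not Lucene code):
    // lowercases, splits on whitespace, strips a single trailing "ing".
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (raw.isEmpty()) {
                continue; // empty input produces one empty chunk; skip it
            }
            String token = raw.toLowerCase(Locale.ROOT);
            if (token.endsWith("ing") && token.length() > 3) {
                token = token.substring(0, token.length() - 3);
            }
            tokens.add(token);
        }
        return tokens;
    }

    public static void main(String[] args) {
        // First pass: what a client would produce before sending the query.
        List<String> once = analyze("Singing birds");
        // Second pass: what a server would do if it re-analyzed those terms.
        List<String> twice = analyze(String.join(" ", once));
        System.out.println(once);  // [sing, birds]
        System.out.println(twice); // [s, birds]
    }
}
```

The term "sing" produced by the first pass is stemmed again to "s" by the second, so the re-parsed query no longer matches what was indexed.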


Is there a (better) way to construct a Solr SolrQuery object from a
Lucene Query object?

: Similarly, at indexing time:
        ...
: What are the drawbacks of this approach?
Hmmm... well besides the drawback of doing all the hard work Solr will do for you, I suppose that as long as you are extremely careful to manage both the indexing side and the query side externally from Solr then there is nothing wrong with this approach -- you would essentially just have a single field type in your schema.xml that would use a whitespace tokenizer -- but again, this would make you lose out on a lot of Solr's features (notably: the stored values in your index would be the post-analysis tokens, you would be forced to trust the clients 100% to send you clean data at index and query time instead of being able to configure it centrally, etc...)


The rationale for wanting to do all the analysis (both query time and
indexing time) client side is that I have an application which is using
Lucene and it is already doing that and it has some "unusual"
requirements (i.e. almost all fields are dynamicFields with
custom/configurable analyzers per field).

I completely agree with everything you said and with the "dangers" of
doing the analysis client side and then letting Solr re-analyze it again
server side. However, as you suggested, a simple whitespace tokenizer
on Solr should be relatively safe.

Definitely, your previous suggestion of using dynamicFields for each
of the possible analyzer configurations and transparently mapping field
names with "prefixes"|"postfixes" to select the right dynamicField
"type" is a better option.
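
A minimal sketch of that field-name mapping, in plain Java with no Solr dependency. Everything named here is an assumption for illustration: the class `DynamicFieldNames`, the analyzer names, and the suffixes `_ws` and `_text_en`, which would have to match dynamicField patterns (for example `*_ws`, `*_text_en`) declared in schema.xml.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class DynamicFieldNames {
    // Hypothetical registry: analyzer configuration name -> dynamicField suffix.
    private final Map<String, String> suffixByAnalyzer = new LinkedHashMap<>();

    DynamicFieldNames register(String analyzerName, String suffix) {
        suffixByAnalyzer.put(analyzerName, suffix);
        return this;
    }

    // Maps an application field name to the Solr field whose dynamicField
    // pattern selects the wanted analyzer configuration.
    String toSolrField(String logicalName, String analyzerName) {
        String suffix = suffixByAnalyzer.get(analyzerName);
        if (suffix == null) {
            throw new IllegalArgumentException(
                "No dynamicField suffix registered for analyzer: " + analyzerName);
        }
        return logicalName + suffix;
    }

    // Strips the longest registered suffix so results can be shown under the
    // application's own field names; unknown names are returned unchanged.
    String toLogicalName(String solrField) {
        String best = "";
        for (String suffix : suffixByAnalyzer.values()) {
            if (solrField.endsWith(suffix)
                    && solrField.length() > suffix.length()
                    && suffix.length() > best.length()) {
                best = suffix;
            }
        }
        return solrField.substring(0, solrField.length() - best.length());
    }

    public static void main(String[] args) {
        DynamicFieldNames names = new DynamicFieldNames()
            .register("whitespace", "_ws")
            .register("english", "_text_en");
        System.out.println(names.toSolrField("title", "english"));
        System.out.println(names.toLogicalName("title_text_en"));
    }
}
```

With a mapping like this the client keeps using its own field names, while the analyzer selection stays configured centrally in Solr.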

In short: I don't see any advantages, but I see a lot of room for error.


Yep. Got it.

Paolo


-Hoss
