: However, I'd like to hear a comment on the approach of doing the parsing
: using Lucene and then constructing a SolrQuery from a Lucene Query:
I believe you are asking about doing this in the client code? Using the Lucene QueryParser to parse a string with an analyzer, then toString()'ing the result and sending it across the wire to Solr? I would strongly advise against it.

Query.toString() is intended purely as a debugging tool, not as a serialization mechanism. It is entirely possible for the toString() value of a query to be useless for recreating the query -- particularly if the analyzer used by Solr for the "re-parse" doesn't know to expect terms that have already been stemmed, or modified in the various ways the client may have done so. (And if you have to go to all that work to make Solr aware of what you've pre-analyzed, why not just let Solr do it for you?)

: Similarly, at indexing time: ...

: What are the drawbacks of this approach?

Hmmm... well, besides the drawback of doing all the hard work Solr would do for you, I suppose that as long as you are extremely careful to manage both the indexing side and the query side externally from Solr, there is nothing wrong with this approach -- you would essentially just have a single field type in your schema.xml that uses a whitespace tokenizer. But again, this would make you lose out on a lot of Solr's features (notably: the stored values in your index would be the post-analysis tokens, you would be forced to trust the clients 100% to send you clean data at index and query time instead of being able to configure it centrally, etc...).

In short: I don't see any advantages, but I see a lot of room for error.

-Hoss
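(As a minimal sketch of the round-trip problem above -- this assumes Lucene's classic QueryParser module with Lucene 5+ constructor signatures, and the field name `body` is just an illustrative choice:)

```java
import org.apache.lucene.analysis.en.EnglishAnalyzer;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;

public class ToStringRoundTrip {
    public static void main(String[] args) throws Exception {
        // Client-side parse with a stemming analyzer: "running shoes"
        // is analyzed down to the stems "run" and "shoe".
        QueryParser parser = new QueryParser("body", new EnglishAnalyzer());
        Query q = parser.parse("running shoes");

        // toString() renders the *post-analysis* terms, e.g. "body:run body:shoe".
        // If that string is sent across the wire and re-parsed by Solr, Solr's
        // analyzer sees already-stemmed tokens and may alter them again --
        // toString() is a debugging aid, not a serialization format.
        System.out.println(q.toString());
    }
}
```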
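(For reference, the "single whitespace-tokenizer field type" setup described above would look something like this in schema.xml -- the type name `text_preanalyzed` is just an illustrative choice:)

```xml
<!-- A field type that does no analysis beyond splitting on whitespace:
     Solr trusts the client to have already stemmed/lowercased/etc. -->
<fieldType name="text_preanalyzed" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>
```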