Hi Solr users,
I'm investigating indexers for a project. I've played a bit with both Solr and Nutch recently, and the Solr "RESTful indexing component" concept fits our needs quite well.
Before I dig too deep, are there any known limitations w.r.t. indexing of non-English text?
I know Lucene fully supports multi-language indexing, and I've seen the cool language identifiers and analysis factories in Nutch, but there's little information about multi-language indexing in Solr - hence my question.
The project that I'm looking at is currently single-language (French), which I assume can be handled by static configuration of the appropriate analyzers.
But we might have to make sure we can handle multiple languages cleanly in a single index before making a final decision on which indexer to use, as here in Switzerland we very often have to handle multiple languages.
Thanks for any insights on this subject!
-Bertrand
(brief introduction: I'm a committer on the Cocoon project, independent consultant, helping teams build webapps using Cocoon and other mostly Java-based technologies, more info at http://www.codeconsult.ch)