Hi Solr users,

I'm investigating indexers for a project and have played a bit with both Solr and Nutch recently. The Solr "RESTful indexing component" concept fits our needs quite well.

Before I dig too deep, are there any known limitations w.r.t. indexing non-English text?

I know Lucene fully supports multi-language indexing, and I've seen the cool language identifiers and analysis factories in Nutch, but there's little information about multi-language indexing in Solr - hence my question.

The project that I'm looking at is currently single-language (French), which I assume can be handled by static configuration of the appropriate analyzers.

However, before making a final decision on which indexer to use, we may need to make sure we can handle multiple languages cleanly in a single index, as here in Switzerland we very often have to deal with several languages.

Thanks for any insights on this subject!

-Bertrand

(brief introduction: I'm a committer on the Cocoon project and an independent consultant, helping teams build webapps using Cocoon and other mostly Java-based technologies; more info at http://www.codeconsult.ch)
