See below:

On Mon, Jan 30, 2012 at 10:16 PM, bing <nibing_...@hotmail.com> wrote:
> Hi, Erick Erickson,
>
> Your suggestions are sound.
>
> For (1), if I use SolrJ as the client to access Solr, then java coding
> becomes the most challenging part. Technically, I want to achieve the same
> effect with highlighting, faceting search, language detection, etc. Do you
> know some example SC that I can refer to?
>

It's actually surprisingly easy. You want to use either the
CommonsHttpSolrServer or the StreamingUpdateSolrServer to connect to a
Solr instance. From there, you assemble a list of SolrInputDocument
objects and call server.add(list). The basic bits are about 25 lines of
code. Adding Tika is almost as easy.

I don't know of any canned code lying around, though.

Best
Erick

> For (2), I agree with you on the difficulty in detecting language from just
> a few words. Thus, alternatively I can suggest a set of results and let
> users to decide.
> You also mentioned score. Say, I have not so many cores, and so for every
> query I direct it to all the cores, returned with a set of scores. Is it
> confident to conclude that the highest score gives the most confidence of
> the results?
>

Absolutely not, I misled you a bit in my original suggestion. The cores
all have independent statistics, so the scores are not comparable. Sorry
about that!

This is not as bad a problem if you simply have different *fields* per
language in a single core, but it is still a concern.

> Thanks.
>
> Best Regards,
> Ni Bing
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Multilingual-search-in-multicore-solr-tp3698969p3702041.html
> Sent from the Solr - User mailing list archive at Nabble.com.
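
To make the point about independent per-core statistics concrete, here is a rough sketch, not actual Solr code. It assumes a classic TF-IDF-style inverse document frequency of the form 1 + ln(numDocs / (docFreq + 1)); real scoring has more factors than this. The class name CoreScoreDemo and all the document counts are made up for illustration.

```java
// Illustration only: a simplified, classic TF-IDF-style IDF term.
// CoreScoreDemo and the numbers below are hypothetical, not taken from Solr.
final class CoreScoreDemo {

    // Assumed formula: idf = 1 + ln(numDocs / (docFreq + 1)).
    static double idf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (double) (docFreq + 1));
    }

    public static void main(String[] args) {
        // The same term matches 5,000 documents in each core,
        // but the cores differ greatly in total size.
        double largeCore = idf(5_000, 1_000_000); // hypothetical large core
        double smallCore = idf(5_000, 10_000);    // hypothetical small core

        System.out.println("idf in large core: " + largeCore);
        System.out.println("idf in small core: " + smallCore);
        // The term weighs far more in the large core, so a higher score there
        // says nothing about which core holds the better match.
    }
}
```

The same term with the same number of matching documents gets a very different weight in each core, which is why picking the core with the highest top score is not a reliable way to guess the language.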