Hi guys, I've enabled language detection in solrconfig.xml:

    <updateRequestProcessorChain name="langid">
      <processor class="org.apache.solr.update.processor.TikaLanguageIdentifierUpdateProcessorFactory">
        <lst name="defaults">
          <str name="langid.fl">content,title</str>
          <str name="langid.fallback">en</str>
          <str name="langid.langField">language_s</str>
          <str name="langid.lcmap">en_GB:en en_US:en</str>
          <str name="langid.map.lcmap">en_GB:en en_US:en</str>
        </lst>
      </processor>
    </updateRequestProcessorChain>

Then I have:

    <requestHandler name="/update" class="solr.UpdateRequestHandler">
      <!-- See below for information on defining
           updateRequestProcessorChains that can be used by name
           on each Update Request -->
      <lst name="defaults">
        <str name="update.chain">langid</str>
      </lst>
    </requestHandler>

When I try to index a document, it's not added to the Solr index. If I remove the above configuration, everything works fine.

Do I need to make any specific changes to schema.xml? Here is an excerpt of it:

    <field name="title" type="string" indexed="true" stored="true" required="false" multiValued="false" />
    <field name="title_en" type="string" indexed="true" stored="true" required="false" multiValued="false" />
    <field name="content" type="multilang_text_exact" indexed="true" stored="true"/>

    <fieldType name="multilang_text_exact" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
      <analyzer type="index">
        <tokenizer class="solr.LetterTokenizerFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.LetterTokenizerFactory"/>
      </analyzer>
    </fieldType>

I don't get any errors in the Solr console output.

Do I need to add _en and _<LANG ID> suffixes to all fields in my schema for the above to work? I mean, do I need to have title, title_en, title_jp, and so on, all manually defined in the schema?

I still don't understand why a document isn't added at all, without any error being thrown.

Thank you,
Angel