Hi, I am new to Japanese support in Solr. I added Japanese support to schema.xml. How can I insert Japanese text into that field, either through a Solr client (Java / PHP / Ruby) or through curl?
schema.xml
====================================
<field name="username" type="string" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
<field name="timestamp" type="date" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />
<field name="jtxt" type="text_ja" indexed="true" stored="true" multiValued="true" omitNorms="true" termVectors="true" />

<fieldType name="text_ja" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="false">
  <analyzer>
    <tokenizer class="solr.JapaneseTokenizerFactory" mode="search"/>
    <!--<tokenizer class="solr.JapaneseTokenizerFactory" mode="search" userDictionary="lang/userdict_ja.txt"/>-->
    <!-- Reduces inflected verbs and adjectives to their base/dictionary forms (辞書形) -->
    <filter class="solr.JapaneseBaseFormFilterFactory"/>
    <!-- Removes tokens with certain part-of-speech tags -->
    <filter class="solr.JapanesePartOfSpeechStopFilterFactory" tags="lang/stoptags_ja.txt" />
    <!-- Normalizes full-width romaji to half-width and half-width kana to full-width (Unicode NFKC subset) -->
    <filter class="solr.CJKWidthFilterFactory"/>
    <!-- Removes common tokens typically not useful for search, but have a negative effect on ranking -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_ja.txt" />
    <!-- Normalizes common katakana spelling variations by removing any last long sound character (U+30FC) -->
    <filter class="solr.JapaneseKatakanaStemFilterFactory" minimumLength="4"/>
    <!-- Lower-cases romaji characters -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
====================================
my insert.csv file

"id","username","timestamp","content","jtxt"
"999999999","xxxxx","2013-12-26T10:14:26Z","Hello ","マイ ドキュメント"
=========================
I am trying to insert it through curl, but it gives me an error:

curl "http://localhost:8983/solr/collection1/update/csv?separator=,&commit=true" -H "Content-Type: text/plain; charset=utf-8" --data-binary @insert.csv

ERROR
----------------------------
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">400</int><int name="QTime">23</int></lst><lst name="error"><str name="msg">Document is missing mandatory uniqueKey field: id</str><int name="code">400</int></lst>
</response>

I know I should not use "Content-Type: text/plain".
=========================
Thanks
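
For comparison, here is a rough sketch of how the same document might be sent as JSON instead of CSV, so that the Japanese text is explicitly encoded as UTF-8. It is not a confirmed fix for the error above. It assumes that the `/update` handler of this Solr version accepts a JSON array of documents, and that `id` and `content` are defined elsewhere in schema.xml. The names `SOLR_UPDATE_URL`, `build_update_request` and `send` are made up for illustration.

```python
import json
import urllib.request

# Assumption: same host and core as in the curl command above, but the
# generic /update handler with a JSON body instead of /update/csv.
SOLR_UPDATE_URL = "http://localhost:8983/solr/collection1/update?commit=true"


def build_update_request(docs, url=SOLR_UPDATE_URL):
    """Build a POST request whose body is the documents as UTF-8 JSON."""
    # ensure_ascii=False keeps the Japanese characters as real UTF-8 bytes
    # instead of \uXXXX escapes (both forms are valid JSON).
    body = json.dumps(docs, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def send(request):
    """Send the request; this needs a running Solr instance."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")


# Same values as the row in insert.csv.
doc = {
    "id": "999999999",
    "username": "xxxxx",
    "timestamp": "2013-12-26T10:14:26Z",
    "content": "Hello ",
    "jtxt": "マイ ドキュメント",
}
request = build_update_request([doc])

# With Solr running, the document could then be sent with:
# print(send(request))
```

Building the request separately from sending it makes it possible to inspect the exact bytes and the Content-Type header before anything reaches Solr.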