: > When i index a text field which has arabic and English like this tweet
: > “@anaga3an: هو سعد الحريري بيعمل ايه غير تحديد الدوجلاس ويختار الكرافته ؟؟”
: > #gcc #ksa #lebanon #syria #kuwait #egypt #سوريا
: > with field_type as 'text_ar' and when i try to see the same field again in
: > solr, it is shown as below.
: > RT @AhmedWagih: لو معملناش ØØ§Ø¬Ø© Ù???ÙŠ الزيادة
: > السكانية Ù???ÙŠ مصر، هنتØÙˆÙ„ لدولة Ù???قيرة
: > كثيÙ???Ø© السكان زي بنجلادش #Egypt #EgyEconomy
:
: The encoding of your input text is being mangled at some point.
: Presuming that your original encoding is UTF-8, I would look at
: how you are indexing into Solr, and the encoding settings on the
: Java container. Solr itself handles UTF-8 perfectly fine, as do
: most Java containers if configured properly, so my first suspicion
: would be the indexing code.
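
The garbled sample above looks like what you get when UTF-8 bytes are decoded with a single-byte Western charset. A minimal sketch of that failure mode follows, assuming the wrong charset is windows-1252 (a guess based on the "ÙŠ" pairs in the sample, not something confirmed in this thread). The helper names `mangle` and `unmangle` are made up for illustration.

```python
def mangle(text: str, wrong_charset: str = "cp1252") -> str:
    """Encode as UTF-8, then decode with the wrong charset,
    the way a misconfigured client or container might."""
    # errors="replace" stands in for bytes the wrong charset cannot map.
    return text.encode("utf-8").decode(wrong_charset, errors="replace")


def unmangle(garbled: str, wrong_charset: str = "cp1252") -> str:
    """Reverse the damage. Only works if no byte was lost or replaced."""
    return garbled.encode(wrong_charset).decode("utf-8")


if __name__ == "__main__":
    word = "حاجة"  # an Arabic word; each letter is two bytes in UTF-8
    garbled = mangle(word)
    print(repr(garbled))             # Latin letters and symbols, not Arabic
    print(unmangle(garbled) == word)
```

If some layer has already substituted "?" for unmappable bytes, as in the sample, the round trip is no longer reversible, so the fix has to happen where the data is first decoded.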
right -- the key thing is to narrow down whether the charset of your data is getting mangled between the db -> solr or between solr -> your eyes.

I would suggest you start by looking at some of the sample documents that come with solr which include non-ASCII characters, and indexing those using the post.jar that is provided. If those show up fine for you in solr, then your servlet container probably isn't doing the munging -- there is also a "test_utf8.sh" in the exampledocs directory that can help you verify if your servlet container is working properly.

If you rule that out, then the next step is to look at your database, and the way your JDBC driver (what DIH uses to talk to your database) is working. Some databases have the concept of a "default charset" but then individual columns can override that with some other charset, and database-specific command-line tools might know about those (so your data looks fine when you run SQL statements directly) but external clients have no way of knowing unless specially configured.

For example: the MySQL JDBC driver has some special options you can use to force it to use unicode and to specify which charset to use when returning data...

https://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-configuration-properties.html

-Hoss
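
To make the MySQL suggestion concrete, here is a rough sketch that builds a JDBC URL carrying the Connector/J `useUnicode` and `characterEncoding` properties and escapes it for use as an XML attribute in a DIH config. The host, database name, and the helper `mysql_jdbc_url` are placeholders invented for illustration; check the option names against the driver version you actually run.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import quoteattr


def mysql_jdbc_url(host: str, database: str, charset: str = "UTF-8") -> str:
    """Build a MySQL JDBC URL that asks the driver to use Unicode.

    host and database are placeholders; substitute your own values.
    """
    params = {"useUnicode": "true", "characterEncoding": charset}
    return f"jdbc:mysql://{host}/{database}?{urlencode(params)}"


if __name__ == "__main__":
    url = mysql_jdbc_url("localhost:3306", "tweets")  # hypothetical host/db
    print(url)
    # Inside an XML config file the "&" must be written as "&amp;";
    # quoteattr() does that escaping and adds the surrounding quotes.
    print(f'<dataSource driver="com.mysql.jdbc.Driver" url={quoteattr(url)}/>')
```

The unescaped "&" is an easy thing to miss when pasting such a URL into an XML file, which is why the sketch escapes it explicitly.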