Hi, Thank you so much for replying.
The MySQL database server is running on a Fedora Core 12 machine with Hindi language support enabled. Details of the database are: ENGINE=MyISAM and DEFAULT CHARSET=utf8.

Data is imported using the Solr DataImportHandler (MySQL JDBC driver).
In the schema.xml file the title field is defined as:
<field name="title" type="text_general" indexed="true" stored="true"/>

I tried saving the query results directly to a text file from the MySQL command prompt, but the results are not stored correctly. The file contains the following characters:

à ¤¸à ¥Åà ¤° à ¤Šà ¤°à ¥<8d>à ¤Åà ¤¾ Saur oorja

The first line of data-config.xml is:
<?xml version="1.0" encoding="UTF-8"?>

Please suggest what I have to do to solve this issue.

Regards,
Sanjailal KP

On 5/21/12, Jack Krupansky <j...@basetechnology.com> wrote:
> Is it possible that your text editor/display does not support UTF-8
> encoding?
>
> Assuming the data is properly encoded, do you have the encoding="UTF-8"
> attribute in your DIH dataSource tag?
>
> -- Jack Krupansky
>
> -----Original Message-----
> From: KP Sanjailal
> Sent: Monday, May 21, 2012 7:37 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Indexing & Searching MySQL table with Hindi and English data
>
> Hi,
>
> Thank you so much for replying.
>
> The MySQL database server is running on a Fedora Core 12 Machine with Hindi
> Language Support enabled. Details of the database are - ENGINE=MyISAM and
> DEFAULT CHARSET=utf8
>
> Data is imported using the Solr DataImportHandler (mysql jdbc driver).
> In the schema.xml file the title field is defined as:
> <field name="title" type="text_general" indexed="true" stored="true"/>
>
> I tried saving the query results directly to a text file from the MySQL
> command prompt but it is not storing the results correctly. The file
> contains the following characters.
>
> à ¤¸à ¥Åà ¤° à ¤Šà ¤°à ¥<8d>à ¤Åà ¤¾ Saur oorja
>
> Please suggest what I have to do to solve this issue.
>
> Regards,
>
> Sanjailal KP
> --
>
> On Sun, May 20, 2012 at 6:59 AM, Lance Norskog <goks...@gmail.com> wrote:
>
>> Also, try saving data from a query into a file and verify that it is
>> UTF-8 and the characters are correct.
>>
>> On Fri, May 18, 2012 at 7:54 AM, Jack Krupansky <j...@basetechnology.com>
>> wrote:
>> > Check the analyzers for the field types containing Hindi text to be sure
>> > that they are not using a character mapping or "folding" filter that might
>> > mangle the Hindi characters. Post the field type, say for the "title" field.
>> >
>> > Also, try manually (using curl or the post jar) adding a single document
>> > that has Hindi data and see if that works.
>> >
>> > -- Jack Krupansky
>> >
>> > -----Original Message----- From: KP Sanjailal
>> > Sent: Thursday, May 17, 2012 5:55 AM
>> > To: solr-user@lucene.apache.org
>> > Subject: Indexing & Searching MySQL table with Hindi and English data
>> >
>> >
>> > Hi,
>> >
>> > I tried to set up indexing of MySQL tables in Apache Solr 3.6.
>> >
>> > Everything works fine, but text in Hindi script (only some 10% of total
>> > records) is not getting indexed properly.
>> >
>> > A search with a keyword in Hindi retrieves an empty result set. Also, a
>> > retrieved Hindi record displays junk characters.
>> >
>> > The database tables contain bibliographical details of books such as
>> > title, author, publisher, isbn, publishing place, series etc., and out of
>> > the total records about 10% contain text in Hindi in the title,
>> > author, and publisher fields.
>> >
>> > Example:
>> >
>> > *Search Results from MySQL using PHP*
>> >
>> > 1.
>> > <http://192.168.0.132/shared/biblio_view.php?bibid=26913&tab=opac>
>> > *Title:* सौर ऊर्जा Saur
>> > oorja<http://192.168.0.132/shared/biblio_view.php?bibid=26913&tab=opac>
>> > *Author(s):* विनोद कुमार मिश्र MISHRA (VK)
>> > *Material:* Books
>> >
>> > *Search Results from Apache Solr (searched using keyword in English)*
>> >
>> > 1.
>> > <http://192.168.0.132/test/biblio_view.php?bibid=26913&tab=opac>
>> > *Title:* सौर ऊरॠजा Saur
>> > oorja<http://192.168.0.132/test/biblio_view.php?bibid=26913&tab=opac>
>> > *Author(s):* विनोद कॠमार मिशॠर MISHRA (VK)
>> > *Material:* Books
>> >
>> >
>> > How do I go about solving this language problem?
>> >
>> > Thanks in advance.
>> >
>> > K. P. Sanjailal
>> > --
>> >
>>
>> --
>> Lance Norskog
>> goks...@gmail.com
>
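A note on the characters reported in the exported file: sequences such as "à ¤¸" are what correctly encoded UTF-8 Devanagari tends to look like when a viewer decodes the bytes as a single-byte Western encoding, which is the situation Jack's question about the editor/display points at. The Python sketch below reproduces that effect and reverses it. It is only an illustration under assumptions: the function names are hypothetical, and Latin-1 stands in for whatever encoding the viewer actually used (it was chosen because it maps every byte value, so the round trip cannot fail).

```python
# Sample title taken from the thread, as correct Unicode text.
title = "सौर ऊर्जा Saur oorja"


def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as Latin-1.

    This mimics a viewer that does not know the file is UTF-8.
    (Hypothetical helper, for illustration only.)
    """
    return text.encode("utf-8").decode("latin-1")


def repair_from_latin1(garbled: str) -> str:
    """Reverse the misreading: recover the raw bytes, then decode as UTF-8.

    Raises UnicodeDecodeError if the recovered bytes are not valid UTF-8,
    which would suggest the data was damaged in some other way.
    """
    return garbled.encode("latin-1").decode("utf-8")


garbled = misread_as_latin1(title)
restored = repair_from_latin1(garbled)

print(repr(garbled))       # the mojibake form, with escapes for control bytes
print(restored == title)   # whether the round trip recovered the original
```

If text copied from the exported file repairs cleanly this way, the file itself is probably valid UTF-8 and only the display is at fault; if it does not, the corruption likely happened earlier, for example on the connection between the client and MySQL.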