DataImportHandler: UTF-8 and MySQL

gwk Mon, 12 Jan 2009 02:18:48 -0800

Hello,

First of all, thanks to Jacob Singh for his reply to my mail last week; I completely forgot to reply. Multicore is perfect for my needs. I've got Solr running now with my new schema partially implemented, and I've started to test importing data with DIH. I've run into a number of issues though, and I hope someone here can help:


  1. Posting UTF-8 data through the example post script works, and I get
     the proper results back when I query using the admin page.
     However, when I import data through the DataImportHandler from a
     MySQL database, I get "Ã³" instead of "ó". The database itself
     contains correct data: it's a copy of a production db, and
     selecting through the client gives the correct characters. I've
     tried several combinations of arguments to my datasource url
     (useUnicode=true&characterEncoding=UTF-8) but they do not seem to
     help. How do I get this to work correctly?
  2. On the wiki page for DataImportHandler, the deletedPkQuery has no
     real description. Am I correct in assuming it should contain a
     query which returns the ids of items which should be removed from
     the index?
  3. Another question concerning the DataImportHandler wiki page: I'm
     not sure about the exact way the field tag works. From the first
     data-config.xml example for the full-import I can infer that the
     "column" attribute represents the column from the SQL query and
     the "name" attribute represents the name of the field in the
     schema the column should map to. However, further on in the
     RegexTransformer section there are "column" attributes which do
     not correspond to the SQL query result set, and it's the
     "sourceColName" attribute which actually represents that data. I
     understand that the value comes from the RegexTransformer, but why
     is the "column" attribute used there instead of the "name"
     attribute? This has confused me somewhat; any clarification would
     be greatly appreciated.
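
For what it's worth, the symptom in item 1 can be reproduced outside Solr. The sketch below assumes that somewhere between MySQL and the index the UTF-8 bytes are being decoded as ISO-8859-1 (Latin-1). That is only my guess at the cause, not something I have confirmed, and the function names are made up for illustration:

```python
def misdecode(text: str, wrong_encoding: str = "latin-1") -> str:
    """Encode text as UTF-8, then decode those bytes with the wrong charset.

    Assumption: this mimics a connection or driver that treats UTF-8
    bytes as Latin-1. It is not taken from the DataImportHandler code.
    """
    return text.encode("utf-8").decode(wrong_encoding)


def repair(garbled: str, wrong_encoding: str = "latin-1") -> str:
    """Reverse the mistake: recover the original bytes, decode as UTF-8."""
    return garbled.encode(wrong_encoding).decode("utf-8")


if __name__ == "__main__":
    garbled = misdecode("ó")   # "ó" is the two bytes C3 B3 in UTF-8
    print(garbled)             # the garbled form seen in the index
    print(repair(garbled))     # the original character again
```

If the guess is right, the data in the database is fine and the wrong decoding happens at read time, which would match what I see with the client.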

Regards,

gwk
