I guess you can set the JdbcDataSource property
characterEncoding="UTF8", and that should help.
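
For illustration, here is a minimal Java sketch of what is probably happening, assuming the UTF-8 bytes from the database are being decoded as ISO-8859-1 somewhere between MySQL and Solr. The class name CharsetRepairSketch and the helpers garble and repair are hypothetical names made up for this example. The transformRow(Map<String, Object>) method follows the plain-class transformer convention of DataImportHandler as I understand it; please check it against your Solr build before relying on it.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: reproduces and undoes the "Sol\u00c3\u00a8ne" garbling.
public class CharsetRepairSketch {

    // Simulates the suspected fault: UTF-8 bytes decoded as ISO-8859-1.
    public static String garble(String correct) {
        byte[] utf8 = correct.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Undoes the fault: recover the raw bytes via ISO-8859-1, then decode as UTF-8.
    // Only safe on strings that were garbled this way; a string that is already
    // correct would be damaged by this round trip.
    public static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Assumed DataImportHandler hook: repairs every String value in the row.
    public Object transformRow(Map<String, Object> row) {
        for (Map.Entry<String, Object> entry : row.entrySet()) {
            if (entry.getValue() instanceof String) {
                entry.setValue(repair((String) entry.getValue()));
            }
        }
        return row;
    }

    public static void main(String[] args) {
        String correct = "Sol\u00e8ne";        // solène with a real e-grave
        String garbled = "Sol\u00c3\u00a8ne";  // what the thread reports seeing

        System.out.println(garble(correct).equals(garbled));
        System.out.println(repair(garbled).equals(correct));

        Map<String, Object> row = new HashMap<>();
        row.put("name", garbled);
        row.put("id", 42);
        new CharsetRepairSketch().transformRow(row);
        System.out.println(row.get("name").equals(correct));
    }
}
```

A transformer like this only hides the symptom. If the connection encoding is fixed at the data source, the strings should arrive correctly and the repair step must be removed, otherwise it would corrupt the already-correct values.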

On Sat, Mar 21, 2009 at 10:58 AM, aerox7 <amyne.berr...@me.com> wrote:
>
> Hi,
> I've checked the MySQL configuration with "mysql> SHOW VARIABLES LIKE 'character_set%';":
> all the character_set variables are set to UTF-8.
>
> I think the data importer gets the data as ISO, so I just wrote a custom
> transformer to change the row's charset from ISO to UTF, and now it works.
>
> --> Noble Paul: I use the Solr 1.4 nightly build from 2009-03-18. Do I have to
> download the latest one to apply your patch?
>
>
> Noble Paul നോബിള്‍  नोब्ळ् wrote:
>>
>> Maybe there is an issue with the recent changes in SOLR-973.
>> I have posted a new patch on SOLR-973.
>> aerox, is it possible to confirm whether that is the problem?
>>
>>
>> On Fri, Mar 20, 2009 at 6:52 PM, Grant Ingersoll <gsing...@apache.org>
>> wrote:
>>> Usually, when I see characters like this, it means you aren't
>>> viewing/handling the UTF-8 correctly when bringing it into Java.  I would
>>> first check that your DB or JDBC driver is getting the chars out right.
>>> It may even be the case that they did not go into the DB correctly in the
>>> first place.
>>>
>>> On Mar 20, 2009, at 4:36 AM, aerox7 wrote:
>>>
>>>>
>>>> ==> Where are you seeing it as "Solène" as opposed to the
>>>> correct form, solène?
>>>>
>>>> I have "Solène" in my MySQL database! So I don't know whether this is
>>>> correct or not. I guess that "Solène" is solène in UTF-8?!
>>>>
>>>> I've tried the analysis page at http://localhost:8983/solr/admin/analysis.jsp.
>>>> When I try with solène everything is OK, but when I try with Solène
>>>> (like what I have in the DB) the analysis converts Ã to A and deletes ¨,
>>>> so I get SolAne!
>>>>
>>>> I think that ISOLatin1AccentFilterFactory only accepts strings in the
>>>> ISO-8859-1 charset.
>>>>
>>>> So is there any way to transform my string to ISO-8859-1 before the
>>>> indexing process? Maybe by creating a transformer in DataImportHandler?
>>>> (I have never coded in Java :( )
>>>>
>>>> Thank you all.
>>>>
>>>>
>>>> Koji Sekiguchi-2 wrote:
>>>>>
>>>>> aerox7 wrote:
>>>>>>
>>>>>> Hi,
>>>>>> I have a MySQL database in UTF-8. I have a row with "Solène" (solène).
>>>>>> I want to transform this to solene, so I use Solr's
>>>>>> ISOLatin1AccentFilterFactory to perform this task, but it doesn't work?!
>>>>>>
>>>>>> I guess that "Solène" is "solène" in UTF-8?! I also set Tomcat to UTF-8,
>>>>>> so normally ISOLatin1AccentFilterFactory should replace the accent...
>>>>>>
>>>>>> Any ideas?
>>>>>>
>>>>>> I use DataImportHandler.
>>>>>>
>>>>>
>>>>> If the mapping rule "è" to "e" always holds for your field, you can try
>>>>> using MappingCharFilter instead of ISOLatin1AccentFilter. Add the
>>>>> following line to mapping-ISOLatin1Accent.txt:
>>>>>
>>>>> "è" => "e"
>>>>>
>>>>> and add the following fieldType:
>>>>>
>>>>> <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
>>>>>   <analyzer>
>>>>>     <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>>>>>     <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
>>>>>   </analyzer>
>>>>> </fieldType>
>>>>>
>>>>> MappingCharFilter and mapping-ISOLatin1Accent.txt are in the nightly build.
>>>>>
>>>>> Koji
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22616220.html
>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>
>>>
>>>
>>
>>
>>
>> --
>> --Noble Paul
>>
>>
>
> --
> View this message in context: 
> http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22633051.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>



-- 
--Noble Paul
