Re: Unicode Character Problem

Furkan KAMACI Mon, 12 Dec 2016 07:55:14 -0800

Hi Ahmet,

I don't see any weird character when I manually copy it into a text editor.
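
A text editor can hide or substitute characters when it renders them, so it may be worth dumping the code points of the indexed value as well. Below is a minimal sketch in Python, assuming the field value has been copied out of the index as a string. `find_suspects` is a hypothetical helper name, not part of Solr or Tika, and the U+FFFD in the sample is an assumption based on how the character displays in this thread; the real index may hold a different code point.

```python
import unicodedata


def find_suspects(text):
    """Return (index, code point, name) for characters that usually signal
    extraction damage: U+FFFD plus control, format, private-use, and
    unassigned characters. Ordinary whitespace is ignored."""
    suspects = []
    for i, ch in enumerate(text):
        if ch.isspace():
            continue
        category = unicodedata.category(ch)
        if ch == "\uFFFD" or category in ("Cc", "Cf", "Co", "Cn"):
            name = unicodedata.name(ch, "UNKNOWN")
            suspects.append((i, f"U+{ord(ch):04X}", name))
    return suspects


# The two forms quoted in this thread (first one assumed to hold U+FFFD).
broken = "a\u00e7 \ufffdklama"
clean = "a\u00e7\u0131klama"

print(find_suspects(broken))
print(find_suspects(clean))
```

If the broken form reports a suspect character while the copy-pasted text from the PDF does not, that would suggest the damage happens during extraction rather than in the analyzer.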


On Sat, Dec 10, 2016 at 6:19 PM, Ahmet Arslan <iori...@yahoo.com.invalid>
wrote:

> Hi Furkan,
>
> I am pretty sure this is a PDF extraction issue.
> Turkish characters have caused us trouble in the past when extracting text
> from PDF files.
> You can confirm by manually copy-pasting from the original PDF file.
>
> Ahmet
>
>
> On Friday, December 9, 2016 8:44 PM, Furkan KAMACI <furkankam...@gmail.com>
> wrote:
> Hi,
>
> I'm trying to index Turkish characters. This is what I see in my index
> (both forms appear in different places in my content):
>
> aç �klama
> açıklama
>
> These are the same word but indexed differently (note the weird character
> in the first one). The weird character is not there when I check the
> original PDF file.
>
> What do you think about it? Is it related to Solr or Tika?
>
> PS: I use text_general as the analyzer for the content field.
>
> Kind Regards,
> Furkan KAMACI
>
