Hello Arno,

If the function is ready, I would like to test it in our special unit for HTML folder listings. Can you post it here or send it to me privately?
Best Regards,
SZ

On Fri, Oct 10, 2008 at 4:41 PM, Arno Garrels <[EMAIL PROTECTED]> wrote:
> Arno Garrels wrote:
> > Francois PIETTE wrote:
> >>> But 3 bytes looks like UTF-8 ?
> >>
> >> I don't know. You said it was UTF-16 if not encoded.
> >
> > I installed IIS 7 on my Vista box and I found that IIS 7
> > uses UTF-7 in directory listings.
>
> Arrgh, typo above, IIS v7 uses UTF-8 of course!
>
> > The HTTP header contains
> > the "charset=UTF-8" content-type extension.
> >
> >
> > However I think the ICS server should continue to use HTML
> > entities.
> > HTML entities represent both iso-8859-1 (Latin1) and Unicode
> > character numbers (in Unicode the first 256 chars are the same as
> > Latin1). So in order to create a _valid_ mapping an AnsiString MUST be
> > converted with the current ANSI code page to a UnicodeString/WideString
> > first! This can be achieved easily in TextToHtmlText() by a local
> > WideString variable that is assigned parameter Src : String.
> > Characters above #255 must then be represented as numerical HTML
> > entities (&#nnnn;). That's all, fully backwards compatible and
> > works in D2009 as well :)
> >
> > --
> > Arno Garrels
> >
> >
> >>
> >> ----- Original Message -----
> >> From: "Arno Garrels" <[EMAIL PROTECTED]>
> >> To: "ICS support mailing" <[email protected]>
> >> Sent: Thursday, October 09, 2008 7:03 PM
> >> Subject: Re: [twsocket] HTML encoding in HttpSrv func.
> >> TextToHtmlText()
> >>
> >>
> >>> Francois PIETTE wrote:
> >>>>> The twothird character is not 'encoded' either as "&#8532;"
> >>>>> (decimal) or as "&#x2154;" (hex)? If so, IIS sends plain UTF-16!
> >>>>
> >>>> Yes, no encoding at all. Just the 3 bytes. So UTF-16.
> >>>
> >>> But 3 bytes looks like UTF-8 ?
> >>>
> >>> --
> >>> Arno Garrels
> >>>
> >>>>
> >>>> --
> >>>> [EMAIL PROTECTED]
> >>>> http://www.overbyte.be
> >>>>
> >>>>
> >>>> ----- Original Message -----
> >>>> From: "Arno Garrels" <[EMAIL PROTECTED]>
> >>>> To: "ICS support mailing" <[email protected]>
> >>>> Sent: Thursday, October 09, 2008 5:26 PM
> >>>> Subject: Re: [twsocket] HTML encoding in HttpSrv func.
> >>>> TextToHtmlText()
> >>>>
> >>>>
> >>>>> Francois Piette wrote:
> >>>>>>> Yes, if someone has Apache or a newer IIS installed he could
> >>>>>>> help. Create a file name with characters not in current ANSI
> >>>>>>> code page by copying those characters from the Windows application
> >>>>>>> charmap.exe. Then start a packet sniffer and log a directory
> >>>>>>> listing.
> >>>>>>
> >>>>>> Using IIS6 on W2K3.
> >>>>>
> >>>>> Thanks!
> >>>>>
> >>>>>> The twothird character (U+2154) is sent in the dirlist as 3
> >>>>>> characters : 0xE2 0x85 0x94. In the href link, the 3 characters
> >>>>>> are expressed as %e2%85%94
> >>>>>
> >>>>> That's UTF-8 URL-encoded.
> >>>>>
> >>>>>> while they are binary in the text itself.
> >>>>>
> >>>>> The twothird character is not 'encoded' either as "&#8532;"
> >>>>> (decimal) or as "&#x2154;" (hex)? If so, IIS sends plain UTF-16!
> >>>>>
> >>>>>> There is nothing in the html header to tell which code page or
> >>>>>> charset is used.
> >>>>>> --
> >>>>>
> >>>>> Browsers seem to be very good in detecting the correct character
> >>>>> set nowadays.
> >>>>>
> >>>>> --
> >>>>> Arno Garrels
> --
> To unsubscribe or change your settings for TWSocket mailing list
> please goto http://lists.elists.org/cgi-bin/mailman/listinfo/twsocket
> Visit our website at http://www.overbyte.be
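
For anyone who wants to check the byte values discussed in the thread, here is a rough Python sketch. It is not the ICS Delphi code. The function name `text_to_html_text` is only a hypothetical stand-in for TextToHtmlText(), and the choice to pass characters #128..#255 through unchanged is an assumption. It shows that U+2154 is three bytes in UTF-8, how that looks URL-encoded, and how characters above #255 could be written as numeric entities once the ANSI bytes have been decoded with their code page.

```python
from urllib.parse import quote


def text_to_html_text(src: str) -> str:
    """Hypothetical sketch: escape HTML specials and emit every
    character above U+00FF as a decimal numeric entity (&#nnnn;)."""
    out = []
    for ch in src:
        cp = ord(ch)
        if ch == "&":
            out.append("&amp;")
        elif ch == "<":
            out.append("&lt;")
        elif ch == ">":
            out.append("&gt;")
        elif ch == '"':
            out.append("&quot;")
        elif cp > 255:
            # Outside Latin1: use the Unicode character number.
            out.append(f"&#{cp};")
        else:
            # Assumption: Latin1 range is passed through unchanged.
            out.append(ch)
    return "".join(out)


two_thirds = "\u2154"

# UTF-8 form of U+2154: the three bytes seen in the IIS listing.
print(two_thirds.encode("utf-8").hex(" "))   # e2 85 94

# The same bytes percent-encoded, as in the href link.
print(quote(two_thirds))                     # %E2%85%94

# Numeric entity form (0x2154 = 8532).
print(text_to_html_text("2" + two_thirds))   # 2&#8532;

# ANSI bytes must first be decoded with their code page; 0x80 in
# Windows-1252 is the euro sign U+20AC, not character number 128.
print(text_to_html_text(b"\x80".decode("cp1252")))   # &#8364;
```

The last line is the reason for converting to a wide string first: taking the raw ANSI byte value as the entity number would give the wrong character for anything the code page maps outside Latin1.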
