"Costin Manolache" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
> On 3/17/06, Jean-frederic Clere <[EMAIL PROTECTED]> wrote:
>>
>> Costin Manolache wrote:
>>
>> >Sorry, I forgot there are 2 meanings of 'xml syntax' :-), I was thinking
>> >if the output is an xml file - with encoding in declaration, but in
>> >regular jsp. (well, the patch is not dealing with jspx anyway )
>> >I was referring to the fact that <?xml encoding="iso-8859-2"?> is treated
>> >as template text, and pageEncoding (or web.xml ) takes precedence.
>> >In jsp-xml ( jspx ) it seems we report an error if the web.xml encoding
>> >doesn't match the <?xml?> encoding. I can't see many use cases for having
>> >an explicit encoding in the xml header, and yet the file read with a
>> >different encoding.
>> >
>> >
>> In my case the xml header is:
>> <?xml version="1.0" encoding="OSD_EBCDIC_DF04_1"?> (In EBCDIC...)
>> Reading the file with ISO-8859-1 encoding gives only garbage.
>>
>> But the patch prevents reading the <%@page pageEncoding="bla" %>, so it
>> is bad.
>
>
>
> Yes, the patch is bad - but what would be a good patch ?
>
> - if pageEncoding is not specified but document starts with <?xml
> encoding=...?> - use xml encoding

It would be Tomcat-specific, but +1, since it's in the spirit of using the
<%@page contentType="text/xml; charset=OSD_EBCDIC_DF04_1" %> as the default
if pageEncoding isn't specified.
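
For illustration only, here is a minimal sketch of how the encoding could be picked out of a leading <?xml ... ?> declaration. The class and method names are hypothetical and this is not Jasper's code. It also assumes the first bytes of the file were already decoded with a compatible encoding, which is exactly the hard part for an EBCDIC source.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of Jasper): reads the encoding of a leading XML declaration. */
final class XmlPrologSniffer {
    // Matches an XML declaration at the very start of the text and captures
    // the value of its encoding pseudo-attribute (single or double quoted).
    private static final Pattern XML_DECL = Pattern.compile(
        "\\A<\\?xml\\s[^>]*?\\bencoding\\s*=\\s*([\"'])([A-Za-z][A-Za-z0-9._-]*)\\1[^>]*\\?>");

    /** Returns the declared encoding, or empty if the text has no leading declaration with one. */
    static Optional<String> declaredEncoding(String head) {
        Matcher m = XML_DECL.matcher(head);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }
}
```

A declaration that is not the very first thing in the file is deliberately ignored here, in line with the XML rule that the declaration must come first.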

> - if pageEncoding is specified and so is <?xml encoding?> - report an error
> ( like jspx does ) or a warning or choose the xml encoding

-1, since the <?xml encoding?> in JSP syntax corresponds to a 'charset=' on 
the Content-Type header.  i.e. it's an output encoding on the page, not a 
way to read the input document.
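
To make the input/output distinction concrete, here is a small sketch (the class and method names are made up for this mail) in which the encoding used to read the source bytes and the charset used to write the response are two independent parameters:

```java
import java.nio.charset.Charset;

/** Hypothetical illustration: input (page) encoding and output charset are independent. */
final class TranscodeSketch {
    /**
     * Decodes the source bytes with the page encoding, then encodes the
     * resulting text with the output charset (the Content-Type 'charset=').
     */
    static byte[] transcode(byte[] source, Charset pageEncoding, Charset outputCharset) {
        String text = new String(source, pageEncoding); // how the file is read
        return text.getBytes(outputCharset);            // how the response is written
    }
}
```

Nothing in the second step tells you what the first step should have been, which is why an output-side declaration is a poor authority for reading the file.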

> - leave current behavior - use default 8859-1 or pageEncoding only.
>

Of course, this is what RIs, like GlassFish, are required to do :).  For
Tomcat, I'm perfectly happy to have smart guessing as long as it doesn't
override the declared <%@page pageEncoding="bla" %>.
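
Roughly, the precedence I have in mind could be sketched like this. The names are hypothetical, it is not the ParserController code, and it leaves out the <page-encoding /> setting from web.xml for brevity:

```java
import java.util.Optional;

/** Hypothetical sketch of the precedence discussed in this thread; not Jasper's actual code. */
final class PageEncodingChooser {
    static final String SPEC_DEFAULT = "ISO-8859-1";

    /**
     * @param declaredPageEncoding value of the page directive's pageEncoding, if present
     * @param detectedEncoding     encoding guessed from the file (e.g. an XML prolog), if any
     * @param allowGuessing        false for strict spec behavior, true for the Tomcat-specific fallback
     */
    static String choose(Optional<String> declaredPageEncoding,
                         Optional<String> detectedEncoding,
                         boolean allowGuessing) {
        // A declared pageEncoding always wins; guessing must never override it.
        if (declaredPageEncoding.isPresent()) {
            return declaredPageEncoding.get();
        }
        // Only when nothing is declared may the detected encoding be used.
        if (allowGuessing && detectedEncoding.isPresent()) {
            return detectedEncoding.get();
        }
        return SPEC_DEFAULT;
    }
}
```

With allowGuessing set to false this collapses to the third option in the list (pageEncoding or the 8859-1 default only).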

> <?xml encoding?> is probably more used and supported ( i.e. more 'standard'
> :-) than jsp pageEncoding.
> The jsp spec is clear that the last option should be used - but having 2
> conflicting encodings is a source of problems,
> and if we can't follow the 'higher' standard, we can at least warn.
>
> Well - not a big deal, but encodings tend to be a headache area for many
> people, in particular when different parts of the system have different
> 'standards' and defaults plus autodetections ( on browser, http, html,
> xml, or jsp ).
>
> Costin
>
>
>> The old code should be improved to use the sourceEnc when the
>> pageEncoding is not specified, and ISO-8859-1 if neither is specified.
>>
>> Cheers
>>
>> Jean-Frederic
>>
>> >
>> >Costin
>> >
>> >
>> >On 3/17/06, Bill Barker <[EMAIL PROTECTED]> wrote:
>> >
>> >
>> >>
>> >>
>> >>
>> >>>-----Original Message-----
>> >>>From: Costin Manolache [mailto:[EMAIL PROTECTED]
>> >>>Sent: Friday, March 17, 2006 11:57 AM
>> >>>To: Tomcat Developers List
>> >>>Subject: Re: svn commit: r386315 -
>> >>>/tomcat/jasper/tc5.5.x/src/share/org/apache/jasper/compiler/ParserController.java
>> >>>
>> >>>In his example ( where both XML and JSP declare encodings ) - which one
>> >>>would win ?
>> >>>
>> >>>
>> >>The patch only affects pages in JSP syntax, so the <?xml ... ?> is just
>> >>another piece of template text :).
>> >>
>> >>
>> >>
>> >>>IMO the XML encoding should win i.e. if the file uses xml syntax and
>> >>>starts with <?xml version="1.0" encoding="iso-8859-2" ?>, then jsp
>> >>>pageEncoding should be ignored.
>> >>>If a jsp is written using the XML syntax - it is supposed to follow the
>> >>>XML rules - there is no exception in the XML spec for jsps specifying
>> >>>their different syntax for encoding.
>> >>>
>> >>>
>> >>>
>> >>The JSP expert group agrees with you :).  In XML syntax, the XML encoding
>> >>should win out over <jsp:directive.page pageEncoding="..." />.
>> >>
>> >>
>> >>
>> >>>For non-XML jsps - I think respecting pageEncoding is a must; the jsp
>> >>>reader must scan the file to find the pageEncoding string - which is not
>> >>>trivial ( there is a reason why XML requires the encoding to be the first
>> >>>thing in the file, at the top; I wouldn't bet on jasper implementing it
>> >>>correctly :-) )
>> >>>
>> >>>
>> >>>
>> >>In JSP syntax, the spec (Appendix D) says that pageEncoding should win (at
>> >>least when there is no matching <page-encoding /> in web.xml :).  What the
>> >>patch breaks is that with it Jasper won't even look for the pageEncoding
>> >>most of the time.
>> >>
>> >>Jasper looks like it does a pretty good job of guessing to set up the
>> >>Reader that scans for the pageEncoding directive.  And JFC seems to agree,
>> >>since the patch is to use the guessed encoding rather than the one that
>> >>was specified :).
>> >>
>> >>
>> >>
>> >>>Costin
>> >>>
>> >>>On 3/17/06, Bill Barker <[EMAIL PROTECTED]> wrote:
>> >>>
>> >>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>>-----Original Message-----
>> >>>>>From: Jean-frederic Clere [mailto:[EMAIL PROTECTED]
>> >>>>>Sent: Friday, March 17, 2006 4:13 AM
>> >>>>>To: Tomcat Developers List
>> >>>>>Subject: Re: svn commit: r386315 -
>> >>>>>/tomcat/jasper/tc5.5.x/src/share/org/apache/jasper/compiler/ParserController.java
>> >>>>>
>> >>>>>Bill Barker wrote:
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>>-----Original Message-----
>> >>>>>>>From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
>> >>>>>>>Sent: Thursday, March 16, 2006 3:55 AM
>> >>>>>>>To: tomcat-dev@jakarta.apache.org
>> >>>>>>>Subject: svn commit: r386315 -
>> >>>>>>>/tomcat/jasper/tc5.5.x/src/share/org/apache/jasper/compiler/ParserController.java
>> >>>>>>>
>> >>>>>>>Author: jfclere
>> >>>>>>>Date: Thu Mar 16 03:54:29 2006
>> >>>>>>>New Revision: 386315
>> >>>>>>>
>> >>>>>>>URL: http://svn.apache.org/viewcvs?rev=386315&view=rev
>> >>>>>>>Log:
>> >>>>>>>If the encoding is not specified use the detected one.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>-1.
>> >>>>>>If it gets to this point, the detected encoding is *wrong* (e.g. <?xml
>> >>>>>>version="1.0" encoding="iso-8859-2" ?> in JSP syntax).
>> >>>>>>
>> >>>>>Why wrong?
>> >>>>>
>> >>>>Because the right encoding is the one specified in the <%@page
>> >>>>pageEncoding="utf8"%>.
>> >>>>
>> >>>>
>> >>>>
>> >>>>>+++
>> >>>>>Connected to localhost.
>> >>>>>Escape character is '^]'.
>> >>>>>GET /try1.jsp
>> >>>>><?xml version="1.0" encoding="ISO-8859-2"?>
>> >>>>><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
>> >>>>>   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
>> >>>>>+++
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>This is about pageEncoding, so I don't see the relevance.
>> >>>>
>> >>>>
>> >>>>
>> >>>>>>I don't have access to an EBCDIC machine to know what the problem is, but
>> >>>>>>this isn't the fix.  Possibly a better way to guess the encoding of the
>> >>>>>>Reader?
>> >>>>>>
>> >>>>>Thinking about it, the patch is not perfect, but the old code is worse: we
>> >>>>>have a piece of code that correctly detects the source encoding and
>> >>>>>destroys it...
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>However, the old code adheres to the JSP spec, whereas your patch breaks
>> >>>>the JSP spec (Appendix D).  That automatically makes the old code better
>> >>>>than your patch.
>> >>>>
>> >>>>
>> >>>>
>> >>>>>In doParse() in ParserController.java the following happens:
>> >>>>>parse() is called with pageEnc = sourceEnc
>> >>>>>jspConfigPageEnc = null
>> >>>>>isDefaultPageEncoding = false.
>> >>>>>But on the line before, the jspReader uses the sourceEnc to create the
>> >>>>>InputStreamReader, so the content of the file is translated to utf-8
>> >>>>>when reading it.
>> >>>>>In validator.java the charset will be set to the detected encoding... In
>> >>>>>the example above iso-8859-2. Bad; for me that will be
>> >>>>>OSD_EBCDIC_DF04_1.
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>The only issue is why Jasper can't recognize your <%@page
>> >>>>pageEncoding="OSD_EBCDIC_DF04_1" %> statement.  That's the part that I
>> >>>>can't figure out (and your patch is masking :).
>> >>>>
>> >>>>
>> >>>>
>> >>>>>Cheers
>> >>>>>
>> >>>>>Jean-Frederic
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>This message is intended only for the use of the person(s) listed above
>> >>>>>>as the intended recipient(s), and may contain information that is
>> >>>>>>PRIVILEGED and CONFIDENTIAL.  If you are not an intended recipient, you
>> >>>>>>may not read, copy, or distribute this message or any attachment. If you
>> >>>>>>received this communication in error, please notify us immediately by
>> >>>>>>e-mail and then delete all copies of this message and any attachments.
>> >>>>>>
>> >>>>>>In addition you should be aware that ordinary (unencrypted) e-mail sent
>> >>>>>>through the Internet is not secure. Do not send confidential or
>> >>>>>>sensitive information, such as social security numbers, account numbers,
>> >>>>>>personal identification numbers and passwords, to us via ordinary
>> >>>>>>(unencrypted) e-mail.
>> >>>>>
>> >>>>>
>> >>>>>>
>> >>>>>>
>> >>>>---------------------------------------------------------------------
>> >>>>
>> >>>>
>> >>>>>>To unsubscribe, e-mail: [EMAIL PROTECTED]
>> >>>>>>For additional commands, e-mail: [EMAIL PROTECTED]
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>
>> >>
>> >>
>> >>
>> >>
>>
>>
>>
>>
> 



