2010/1/8 Mark Thomas <ma...@apache.org>:
> I think I have got to the bottom of why the EL has been so fragile and
> why the fixes to the various edge case bug fixes have invariably created
> new edge cases. It is probably easier to explain by considering an
> example. Using the echo tag from the newly added EL test cases consider
> the following JSP extract:
>
> <tags:echo echo="${1+1}" />
> <tags:echo echo="\${1+1}" />
> <tags:echo echo="\\${1+1}" />
>
> The JSP and EL specs are a little fuzzy/inconsistent but the guidance we
> received from the expert group was:
> - the whole attribute should be unquoted as per the rules for quoting
> JSP attributes
> - everything outside an actual expression after unquoting should be
> treated as a literal
>
> The issue is that we performed JSP attribute unquoting and EL parsing in
> two independent steps. This creates an ambiguity. If the attribute
> values above are unquoted then we get the following:
>
> ${1+1}
> ${1+1}
> \${1+1}
>
> The first is an EL expression and will be treated as such.
> The second is a literal. However, the EL parser will treat it as an
> expression.
> The third is a literal '\' followed by an expression but the EL parser
> will see an escaped literal.
>
> My proposed solution is to modify the output of the attribute unquoting
> to ensure the EL Parser correctly interprets the result. ie:
> ${1+1}
> ${'$'}{1+1}
> ${'\'}${1+1}
>

Some more notes regarding this, as I am reviewing docs and
implementation in view of issues reported for 6.0.24.

Especially, I was looking at chapter JSP.10.1.11 (XML View /
"Request-Time Attribute Expressions") and especially the table Table
JSP.10-2 (XML View of an Escaped EL Expression) in that chapter of JSP
2.1 specification. The question was how it avoids the pitfall of bug
48627 [1].  More on that below.

[1] https://issues.apache.org/bugzilla/show_bug.cgi?id=48627


First, for reference: r899148 is the revision when the new
implementation was applied to the 6.0 branch [2].

[2] http://svn.apache.org/viewvc?view=revision&revision=899148

Highlights for that implementation:
#1. It fixes several EL parsing issues.
#2. It implements '\' -> ${'\\'} or #{'\\'} (depending on the
expression type) replacement

#1. is mainly implemented by the changes in ELParser.jjt file
#2. is mainly implemented by the new AttributeParser.java class

Especially #1. fixes the issue, where '\' in a literal EL expression
(ch. 1.2.2 of EL 2.1 spec) was interpreted as an escape character
(actually, it is not an escape character there, only \${ and \#{ ).

E.g., evaluation of "\\\\" as an EL expression should give "\\\\" and
not "\\". It gave '\\' before r899148. I mentioned that issue in [3].

The catch point here is that if that EL parsing issue is fixed in #1.,
then total unconditional replacement '\' -> ${'\\'} or #{'\\'} of #2.
is not needed. That is, with #1 being fixed a single '\' is correctly
evaluated as '\' by EL and does not need escaping.

Look at Table JSP.10-2 inchapter JSP.10.1.11 of JSP 2.1 spec, that I
mentioned. The "XML View" column there is always a valid EL
expression, with all necessary EL escaping. The rule there is that if
the resulting expression is a literal (all ${foo} are escaped as
\${foo}), then the '\'s are remaining as they are. If the resulting
expression is a composite EL expression (there is an unescaped ${foo}
there), only in that case '\'s are replaced with ${'\\'}.

That is, in that table the only lines where ${'\\'} is there are those
where the result is an evaluatable EL expression.

That avoids the pitfall of BZ 48627 (where "\\" value of an attribute
became an EL expression in a tag where dynamic expressions where not
allowed): all non-dynamic attributes remain as non-dynamic.

That also avoids the uncertainty of whether '\' should be escaped as
${'\\'} or as #{'\\'}. As eval EL-expression is always there in those
cases, we always know what type of expression ($ or #) it is.

[3] http://marc.info/?t=126261733500006&r=1&w=2


Regarding the implementation, AttributeParser.java class. I think
that, based on the above, we can fix it to solve bug 48627. Other
parts of the new implementation will remain unchanged.


I have several comments on the AttributeParser class.

1. In JSP 2.1 spec there is an option to selectively disable '#'
expressions when '$' ones are still enabled. The name of that option
is "deferredSyntaxAllowedAsLiteral".

As of now, AttributeParser takes care of isELIgnored option, but does
not know about deferredSyntaxAllowedAsLiteral one.


2. There are several places in AttributeParser#parseLiteral() where

                if (type == 0) {
                    type = '$';
                }

That is where the bug 48627 lies. The above code turns non-dynamic
attribute in a dynamic one.


3. EL spec (ch.1.2.3 of EL 2.1 spec) says that "It is illegal to mix
${} and #{} constructs in a composite expression." though followed by
"This restriction may be lifted in future versions".

AttributeParser#parseLiteral() has the following clause:

} else if (ch == type){

I think it has to process '#' and '$' expressions in the same way, and
the "mix ${} and #{}" rule should be checked either explicitly here,
or elsewhere. I have not researched the question where it is actually
checked.


4. I have not researched this question, and I *can be wrong* in this
point, but I think that when EL is evaluated, the engine does not know
about  isELIgnored and deferredSyntaxAllowedAsLiteral options.

If that is the truth, then in AttributeParser#parseLiteral()
if (isELIgnored)  ${ should be escaped as \${
if (isELIgnored || deferredSyntaxAllowedAsLiteral)   #{ should be escaped as \#{

It will result in expression that can be directly evaluated where all
those unsupported expressions are escaped.  Are we already doing the
same escaping elsewhere when those options are in effect, or do we
carry those isELIgnored, deferredSyntaxAllowedAsLiteral options all
along with unescaped expressions until evaluation time?



Lastly,
when Mark was testing TC7 with JSP 2.2 TCK, he caught several minor EL
evaluation issues. Those are fixed in TC7, and I think some of them
have to be backported to TC6. Though, I have not reviewed them in
detail yet.  For reference, here are those revision numbers:
899635  <- I've already proposed this one for backport
899653
899769
899770
899783
899788
899792
899916
899918
899919
899935
899949

Best regards,
Konstantin Kolinko

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@tomcat.apache.org
For additional commands, e-mail: dev-h...@tomcat.apache.org

Reply via email to