Hi Alberto,

Thank you for your quick reply ! I'll apply the same patch in order to
spare the same confusion from other people in the future might they bump
into the same code :)

Cheers,
Amit

On Fri, Jul 20, 2012 at 10:15 PM, Alberto Massari <
[email protected]> wrote:

> Hi Amit,
> thanks for spotting this; even if the code was not going to crash (as the
> first character after the authority is still in the allocated memory and
> must be a / or a NULL, and both would have failed the isHex test, avoiding
> that the second isHex would access memory after the end of the string) I
> added the extra check to make it clear.
>
> Thanks,
> Alberto
>
> Il 19/07/2012 23:00, Amit K ha scritto:
>
>> Hi,
>>
>>
>> I am using the C++ distribution of xerces 2.7, the implementation of
>> the function in question in it is:
>>
>>
>> bool XMLUri::**isValidRegistryBasedAuthority(**const XMLCh* const
>> authority,
>>
>>                                             const int authLen)
>> {
>>      // check authority    int index = 0;
>>      while (index < authLen)
>>      {
>>          if (isUnreservedCharacter(**authority[index]) ||
>>              (XMLString::indexOf(REG_NAME_**CHARACTERS,
>> authority[index]) != -1))
>>          {
>>              index++;
>>          }
>>          else if (authority[index] == chPercent)               // '%'
>>    {
>>              *if (XMLString::isHex(authority[**index+1]) &&     // 1st
>>
>> hex                XMLString::isHex(authority[**index+2])  )     // 2nd
>> hex                index +=3;*
>>
>>              else
>>                  return false;
>>          }
>>          else
>>              return false;
>>      } //while
>>      return true;
>> }
>>
>>
>> I've boldened the lines which I want to further discuss here.
>>
>> and also note that I've seen that the implementation is the same in
>> the latest version of xerces.
>>
>> However, I noticed that the Java implementation is different in
>> relation to the same lines:
>>
>> /**
>>    +   * Determines whether the given string is a registry based
>> authority.
>>    +   *
>>    +   * @param authority the authority component of a URI
>>    +   *
>>    +   * @return true if the given string is a registry based authority
>>    +   */
>>    +  private boolean isValidRegistryBasedAuthority(**String authority) {
>>    +    int index = 0;
>>    +    int end = authority.length();
>>    +    char testChar;
>>    +
>>    +    while (index < end) {
>>    +      testChar = authority.charAt(index);
>>    +
>>    +      // check for valid escape sequence
>>    +      if (testChar == '%') {
>>    +        *if (index+2 >= end ||
>>    +            !isHex(authority.charAt(index+**1)) ||
>>    +            !isHex(authority.charAt(index+**2))) {*
>>
>>    +            return false;
>>    +        }
>>    +        index += 2;
>>    +      }
>>    +      // can check against path characters because the set
>>    +      // is the same except for '/' which we've already excluded.
>>    +      else if (!isPathCharacter(testChar)) {
>>    +        return false;
>>    +      }
>>    +      ++index;
>>    +    }
>>    +    return true;
>>    +  }
>>
>>
>> The important difference is of course, the bounds check on the string
>> / character array. Why is it omitted in the C++ version ?
>>
>> I thought it might be because the string can be assume to be null
>> terminated in the C++ version.. but I can't be sure whether it's just
>> a bug or not.
>>
>> I would thank your reply
>>
>>
>> Sincerely,
>>
>> Amit
>> .
>>
>>
>

Reply via email to