Mark,

On 7/7/16 11:53 AM, Mark Thomas wrote:
> On 06/07/2016 22:55, Christopher Schultz wrote:
>> Mark,
>>
>> On 7/3/16 4:24 PM, ma...@apache.org wrote:
>>> Author: markt
>>> Date: Sun Jul  3 20:24:18 2016
>>> New Revision: 1751173
>>>
>>> URL: http://svn.apache.org/viewvc?rev=1751173&view=rev
>>> Log:
>>> The original request for regular expression support would be too expensive 
>>> to implement.
>>> This commit adds support for wild card host names.
>>> It adds overhead for requests where the Host header is not an exact match 
>>> since the code now has to convert the name in the header to the wild card 
>>> form and then search for that.
>>> However, this overhead is offset by caching the default host so it is not 
>>> necessary to do a look up for the default host.
>>> I've also expanded the performance tests. On my laptop the before and after 
>>> results are broadly similar with some small improvements and some small 
>>> increases.
> 
> <snip/>
> 
> 
>>> @@ -720,12 +747,24 @@ public final class Mapper {
>>>          MappedHost[] hosts = this.hosts;
>>>          MappedHost mappedHost = exactFindIgnoreCase(hosts, host);
>>>          if (mappedHost == null) {
>>> -            if (defaultHostName == null) {
>>> -                return;
>>> +            // Note: Internally, the Mapper does not use the leading * on a
>>> +            //       wildcard host. This is to allow this shortcut.
>>> +            int firstDot = host.indexOf('.');
>>> +            if (firstDot > -1) {
>>> +                int offset = host.getOffset();
>>> +                try {
>>> +                    host.setOffset(firstDot + offset);
>>> +                    mappedHost = exactFindIgnoreCase(hosts, host);
>>> +                } finally {
>>> +                    // Make absolutely sure this gets reset
>>> +                    host.setOffset(offset);
>>> +                }
>>>              }
>>> -            mappedHost = exactFind(hosts, defaultHostName);
>>>              if (mappedHost == null) {
>>> -                return;
>>> +                mappedHost = defaultHost;
>>> +                if (mappedHost == null) {
>>> +                    return;
>>> +                }
>>>              }
>>>          }
>>>          mappingData.host = mappedHost.object;
>>> @@ -1497,6 +1536,22 @@ public final class Mapper {
>>>      }
>>>  
>>>  
>>> +    /*
>>> +     * To simplify the mapping process, wild card hosts take the form
>>> +     * ".apache.org" rather than "*.apache.org" internally. However, for ease
>>> +     * of use the external form remains "*.apache.org". Any host name passed
>>> +     * into this class needs to be passed through this method to rename and
>>> +     * wild card host names from the external to internal form.
>>> +     */
>>> +    private static String renameWildcardHost(String hostName) {
>>> +        if (hostName.startsWith("*.")) {
>>> +            return hostName.substring(1);
>>> +        } else {
>>> +            return hostName;
>>> +        }
>>> +    }
>>> +
>>> +
>>>      // ------------------------------------------------- MapElement Inner Class
>>>  
>>>  
>>>
> 
>> It's tough to tell from this diff... does the server take a performance
>> hit of wildcard-matching if no wildcards are in use?
> 
> I've removed the noise and left the key parts above.
> 
> Marginally.
> 
> The difference comes if no exact match to the requested host is found.
> 
> The old code then searched for the default host.
> 
> The new code searches for a wildcard host and, if no match is found,
> returns the cached default host.
> 
> The new code is slightly slower in some cases and faster in others but
> the performance difference was marginal. It was of the order of 10
> nanoseconds per request.
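
As a rough illustration of the lookup order described above, here is a
minimal sketch. It is not the Mapper code: the class name, the use of a
HashMap instead of the sorted MappedHost[] array, and the String targets
are all assumptions made for the example. Only renameWildcardHost and the
"strip everything before the first dot" step are taken from the diff.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for the host lookup; not the Tomcat Mapper.
class HostLookupSketch {
    // Keys are lower-case. Wildcard hosts are stored without the leading
    // '*', e.g. ".apache.org", mirroring the internal form in the diff.
    private final Map<String, String> hosts = new HashMap<>();
    // Cached so that falling back to the default needs no lookup.
    private String defaultHost;

    static String renameWildcardHost(String hostName) {
        return hostName.startsWith("*.") ? hostName.substring(1) : hostName;
    }

    void addHost(String name, String target) {
        hosts.put(renameWildcardHost(name).toLowerCase(Locale.ENGLISH), target);
    }

    void setDefaultHost(String target) {
        defaultHost = target;
    }

    String map(String requestedHost) {
        String host = requestedHost.toLowerCase(Locale.ENGLISH);
        // 1. Exact match.
        String found = hosts.get(host);
        if (found == null) {
            // 2. Wildcard form: drop the first label, keep the dot.
            int firstDot = host.indexOf('.');
            if (firstDot > -1) {
                found = hosts.get(host.substring(firstDot));
            }
        }
        // 3. Cached default host (may be null if none is configured).
        return found != null ? found : defaultHost;
    }

    public static void main(String[] args) {
        HostLookupSketch mapper = new HostLookupSketch();
        mapper.setDefaultHost("default-host");
        mapper.addHost("www.apache.org", "exact-host");
        mapper.addHost("*.apache.org", "wildcard-host");
        System.out.println(mapper.map("www.apache.org"));
        System.out.println(mapper.map("tomcat.apache.org"));
        System.out.println(mapper.map("example.com"));
    }
}
```

Note that only the first label is stripped, so in this sketch
"a.b.apache.org" does not match "*.apache.org" and falls through to the
default host.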
> 
>> My (long overdue) plan for doing regular-expression matching for things
>> like this was going to be to encapsulate the searching algorithm into a
>> few classes that would behave differently depending upon what needed to
>> be searched. For example, if all hostnames were explicitly-named, then
>> the existing binary search algorithm could be used, or if there was only
>> the one default host, it would always return that default host. For
>> regular expressions, the entire algorithm would change.
>>
>> I can't see from the diff where the change to the map() function is, so
>> I suspect it's quite a small change indeed, which is why I was thinking
>> that it might affect all requests even when wildcards weren't in use.
> 
> A 'wild cards in use' flag could be added that would improve things
> considerably since it would save an entire lookup. Any patch along those
> lines would need to be careful to keep track of whether or not wild
> cards were being used.

Yes, that was why I was thinking of having a separate class for each
search-type... basically, once a wildcard host had been added, we'd just
convert from the "simple" (and fast) search algorithm to the
wildcard-aware one. Switching back once all wildcard mappings had been
removed (e.g. via host-manager) might be tricky, but it should be possible.
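
A minimal sketch of that idea, under stated assumptions: the names
SwitchingHostRegistry, Search, SIMPLE and WILDCARD_AWARE are hypothetical,
a HashMap stands in for the Mapper's host array, host names are assumed to
be lower-case already, and plain synchronized methods stand in for
whatever concurrency scheme the real class would need. A counter of
wildcard hosts decides which search is active, which also covers the
"switch back" case.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names exist in Tomcat.
class SwitchingHostRegistry {
    interface Search {
        String find(Map<String, String> hosts, String host);
    }

    // Fast path: a single exact lookup.
    static final Search SIMPLE = (hosts, host) -> hosts.get(host);

    // Exact lookup, then a second lookup on the wildcard form.
    static final Search WILDCARD_AWARE = (hosts, host) -> {
        String found = hosts.get(host);
        if (found == null) {
            int firstDot = host.indexOf('.');
            if (firstDot > -1) {
                found = hosts.get(host.substring(firstDot));
            }
        }
        return found;
    };

    private final Map<String, String> hosts = new HashMap<>();
    private int wildcardCount;
    private Search search = SIMPLE;
    private String defaultHost;

    private static String internalName(String name) {
        return name.startsWith("*.") ? name.substring(1) : name;
    }

    synchronized void setDefaultHost(String target) {
        defaultHost = target;
    }

    synchronized void addHost(String name, String target) {
        String key = internalName(name);
        boolean isNew = !hosts.containsKey(key);
        hosts.put(key, target);
        // Count each wildcard host once, even if it is re-added.
        if (isNew && key.startsWith(".")) {
            wildcardCount++;
            search = WILDCARD_AWARE;
        }
    }

    synchronized void removeHost(String name) {
        String key = internalName(name);
        if (hosts.containsKey(key)) {
            hosts.remove(key);
            // Fall back to the simple search once the last wildcard is gone.
            if (key.startsWith(".") && --wildcardCount == 0) {
                search = SIMPLE;
            }
        }
    }

    synchronized String map(String host) {
        String found = search.find(hosts, host);
        return found != null ? found : defaultHost;
    }

    synchronized boolean isWildcardAware() {
        return search == WILDCARD_AWARE;
    }
}
```

With this shape, installations that never configure a wildcard host stay
on the single-lookup path, and the bookkeeping cost is paid only when
hosts are added or removed.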

-chris

