[ 
https://issues.apache.org/jira/browse/COUCHDB-855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988056#action_12988056
 ] 

Filipe Manana commented on COUCHDB-855:
---------------------------------------

Benoît,

how did you measure those times? Are they averages? Were they measured in a 
concurrent or a non-concurrent scenario?

I tested with a single machine, running both CouchDB and the relaximation test, 
with the existing gen_server approach and your latest patch. The test was done 
like this:

$ node tests/test_writes.js --clients 500 --url http://localhost:5984/ --duration 90

For each of the alternatives (gen_server vs no gen_server), I added an 
io:format call in couch_httpd.erl, like this:

+    T0 = erlang:now(),
+    MochiReq1 = couch_httpd_vhost:dispatch_host(MochiReq),
+    io:format("vhost lookup took ~p~n", [timer:now_diff(erlang:now(), T0)]),


Now, the basic analysis I did:


>>> without the gen_server (latest patch)

http://graphs.mikeal.couchone.com/#/graph/0379dbdaef29b1c0fbf0342154019ed3

$ egrep 'vhost lookup' log_vhosts_no_server | wc -l
64251

20 most frequent times

$ egrep 'vhost lookup' log_vhosts_no_server | cut -d ' ' -f 4 | sort -n | uniq -c | sort -nr | head -20
    201 33
    140 34
    121 36
    115 35
    108 32
    103 37
     97 43
     92 41
     86 42
     82 44
     79 40
     77 45
     76 39
     73 38
     68 47
     67 48
     58 46
     57 50
     54 52
     54 49

20 least frequent times

$ egrep 'vhost lookup' log_vhosts_no_server | cut -d ' ' -f 4 | sort -n | uniq -c | sort -nr | tail -20
      1 10046
      1 10043
      1 10042
      1 10037
      1 10036
      1 10035
      1 10032
      1 10029
      1 10028
      1 10022
      1 10018
      1 10014
      1 10013
      1 10011
      1 10009
      1 10008
      1 10005
      1 10003
      1 10002
      1 10000

number of lookups that took less than 50us

$ egrep 'vhost lookup' log_vhosts_no_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 50' | wc -l
1704

(about 2,7%)

number of lookups that took less than 100us

$ egrep 'vhost lookup' log_vhosts_no_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 100' | wc -l
3374

(about 5,3%)

number of lookups that took less than 200us

$ egrep 'vhost lookup' log_vhosts_no_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 200' | wc -l
5921

(about 9,2%)

number of lookups that took less than 500us

$ egrep 'vhost lookup' log_vhosts_no_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 500' | wc -l
14341

(about 22,3%)

number of lookups that took less than 1000us

$ egrep 'vhost lookup' log_vhosts_no_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 1000' | wc -l
24259

(about 37,8%)

number of lookups that took less than 2000us

$ egrep 'vhost lookup' log_vhosts_no_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 2000' | wc -l
35153

(about 54,7%)

number of lookups that took less than 3000us

$ egrep 'vhost lookup' log_vhosts_no_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 3000' | wc -l
41630

(about 64,8%)

number of lookups that took less than 4000us

$ egrep 'vhost lookup' log_vhosts_no_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 4000' | wc -l
46452

(about 72,3%)

number of lookups that took less than 5000us

$ egrep 'vhost lookup' log_vhosts_no_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 5000' | wc -l
49603

(about 77,2%)



>>> with the gen_server (current trunk)


http://graphs.mikeal.couchone.com/#/graph/0379dbdaef29b1c0fbf034215401b7d4

$ egrep 'vhost lookup' log_vhosts_gen_server | wc -l
62736

20 most frequent response times

$ egrep 'vhost lookup' log_vhosts_gen_server | cut -d ' ' -f 4 | sort -n | uniq -c | sort -nr | head -20
    198 33
    156 34
    118 32
    116 35
    115 36
    110 37
    109 42
    104 44
     94 41
     92 39
     91 43
     87 45
     86 40
     78 46
     66 51
     65 38
     62 48
     61 49
     61 47
     59 62


20 least frequent response times

$ egrep 'vhost lookup' log_vhosts_gen_server | cut -d ' ' -f 4 | sort -n | uniq -c | sort -nr | tail -20
      1 10047
      1 10045
      1 10042
      1 10041
      1 10033
      1 10032
      1 10030
      1 10028
      1 10026
      1 10024
      1 10020
      1 10019
      1 10016
      1 10013
      1 10010
      1 10009
      1 10008
      1 10007
      1 10003
      1 10000


number of lookups that took less than 50us

$ egrep 'vhost lookup' log_vhosts_gen_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 50' | wc -l
1818

(about 2,8%)

number of lookups that took less than 100us

$ egrep 'vhost lookup' log_vhosts_gen_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 100' | wc -l
3443

(about 5,4%)

number of lookups that took less than 200us

$ egrep 'vhost lookup' log_vhosts_gen_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 200' | wc -l
6150

(about 9,8%)

number of lookups that took less than 500us

$ egrep 'vhost lookup' log_vhosts_gen_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 500' | wc -l
14307

(about 22,9%)

number of lookups that took less than 1000us

$ egrep 'vhost lookup' log_vhosts_gen_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 1000' | wc -l
23908

(about 38,1%)

number of lookups that took less than 2000us

$ egrep 'vhost lookup' log_vhosts_gen_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 2000' | wc -l
34275

(about 54,6%)

number of lookups that took less than 3000us

$ egrep 'vhost lookup' log_vhosts_gen_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 3000' | wc -l
40457

(about 64,5%)

number of lookups that took less than 4000us

$ egrep 'vhost lookup' log_vhosts_gen_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 4000' | wc -l
44858

(about 71,5%)

number of lookups that took less than 5000us

$ egrep 'vhost lookup' log_vhosts_gen_server | cut -d ' ' -f 4 | perl -ne 'print $_ if $_ < 5000' | wc -l
47775

(about 76,1%)
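
The per-threshold counts above can also be gathered in one pass over a log. What follows is only a sketch, not the commands that produced the numbers in this comment: the function name vhost_cdf is made up for illustration, and it assumes every matching line has the form "vhost lookup took <N>", with the time in microseconds as the 4th space-separated field (the same field the cut commands above select).

```shell
# Hypothetical helper: for each threshold (in microseconds), print how many
# lookups were faster than it and what share of all lookups that is.
# Assumes log lines of the form "vhost lookup took <N>".
vhost_cdf() {
    grep 'vhost lookup' "$1" | awk '
        BEGIN { n = split("50 100 200 500 1000 2000 3000 4000 5000", limit, " ") }
        {
            total++
            # Force numeric comparison of the 4th field against each threshold.
            for (i = 1; i <= n; i++)
                if ($4 + 0 < limit[i] + 0) below[i]++
        }
        END {
            if (total == 0) exit
            for (i = 1; i <= n; i++)
                printf "< %dus: %d (%.1f%%)\n", limit[i], below[i], 100 * below[i] / total
        }'
}
```

Calling it as `vhost_cdf log_vhosts_no_server` or `vhost_cdf log_vhosts_gen_server` should give the same counts as the individual perl filters, with the percentages computed from the same total.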



I made a few runs for each approach, and the total number of requests served by 
the gen_server approach was always smaller (about 1000 to 2000 fewer) than with 
the non-gen_server approach (perhaps it's a coincidence).

I tend to favour the non-gen_server approach. A more complete and realistic 
test would consist of having several machines doing thousands of requests in 
parallel against the Couch server. Unfortunately I only have one right now and 
will be out for the next 2 weeks.

Let's see what others think of this. Paul?

> New host manager
> ----------------
>
>                 Key: COUCHDB-855
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-855
>             Project: CouchDB
>          Issue Type: Improvement
>    Affects Versions: 1.1
>            Reporter: Benoit Chesneau
>            Assignee: Benoit Chesneau
>            Priority: Blocker
>             Fix For: 1.1
>
>         Attachments: 
> 0001-New-vhost-manager.-allows-dynamic-add-of-vhosts-with.patch, 
> 0001-Squashed-commit-of-the-following.patch, COUCHDB-855-1.patch, 
> COUCHDB-855-2.patch
>
>
> New vhost manager. Allows dynamic addition of vhosts without restart, wildcards 
> in vhosts, and specific functions in Erlang per kind of domain. It also fixes 
> an issue in etap test (160).
> Find attached to this ticket the patch. It is also available in my github 
> repo :
> http://github.com/benoitc/couchdb/commit/435c756cc6e687886cc7055302963f422cf0e161
> more details :
> This gen_server keeps the state of vhosts added to the ini and tries to
> match the Host header (or forwarded) against rules built from the
> vhost list. 
> Declaration of vhosts take place in the configuration file :
> [vhosts]
> example.com = /example
> *.example.com = /example
> The first line will rewrite the request to display the content of the
> example database. This rule works only if the Host header is
> 'example.com' and won't work for CNAMEs. The second rule, on the other
> hand, matches all CNAMEs to the example db. So www.example.com or
> db.example.com will work.
> The wildcard ('*') should always be the last in the cnames:
>      "*.db.example.com = /"  will match all cnames on top of db
> examples to the root of the machine. (for now no rewriting is
> possible).
> By default vhost redirection is handled by the function
> couch_httpd_vhost:redirect_to_vhost, but you could pass per vhost a
> specific function :
>      "*.domain.com = {Module, Func}"
> The function takes the Mochiweb Request object and should return a new
> Mochiweb Request object.
> You could also change the default function to handle request by
> changing the setting `redirect_vhost_handler` in `httpd` section of
> the Ini:
>       [httpd]
>       redirect_vhost_handler = {Module, Fun}
> The function takes 2 args : the mochiweb request object and the target
> path. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.