[ 
https://issues.apache.org/jira/browse/GEODE-6536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17046225#comment-17046225
 ] 

ASF subversion and git services commented on GEODE-6536:
--------------------------------------------------------

Commit 9da2cd49e2e04564b446eaad579b51e986bc2179 in geode's branch 
refs/heads/develop from Mario Ivanac
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=9da2cd4 ]

GEODE-6536: Added retry in borrowConnection/single hop (#4719)

* GEODE-6536: Added retry in borrowConnection/single hop

* GEODE-6536: bug fix

* GEODE-6536: update after comments


> client pr single hop will do multiple hops if things get busy
> -------------------------------------------------------------
>
>                 Key: GEODE-6536
>                 URL: https://issues.apache.org/jira/browse/GEODE-6536
>             Project: Geode
>          Issue Type: Improvement
>          Components: client/server
>            Reporter: Darrel Schneider
>            Assignee: Mario Ivanac
>            Priority: Major
>              Labels: needs-review, performance, pull-request-available
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> I was wondering why ConnectionManagerImpl.borrowConnection that targets a 
> particular server will not wait for a connection to be available. This is the 
> borrow used by pr single hop. It seems like if all the cnxs to that server 
> are busy and we are at max that it should wait. But instead it does a single 
> scan of the pool and if it does not find an idle connection it either throws 
> AllConnectionsInUse or it creates one. The caller sets the onlyUseExistingCnx 
> parameter based on if the pool is at max cnxs or not (seems like this should 
> have been done in borrowConnection instead of the caller). If it throws 
> AllConnectionsInUse the caller catches it and ignores it and then falls into 
> doing a non-targeted borrow that will wait for a connection but to ANY 
> server. So our single hop optimization goes away on a busy client. Since it 
> can’t immediately get a connection to server A, it says just give me a 
> connection to any server sends the op to that other server and that server 
> ends up having to forward it to server A. You would only see this problem if 
> you set a max on your client connection pool. The default of -1 just says to 
> always create another connection. The only workaround I know of is to not 
> limit how many connections your client can create but that could cause other 
> problem (i.e. too many file descriptors).
> The questionable code is in the following classes:
> org.apache.geode.cache.client.internal.GetOp
> org.apache.geode.cache.client.internal.PutOp
> org.apache.geode.cache.client.internal.DestroyOp
> The all have something like this:
>  
> {code:java}
> boolean onlyUseExistingCnx = ((poolImpl.getMaxConnections() != -1
>  && poolImpl.getConnectionCount() >= poolImpl.getMaxConnections()) ? true : 
> false);
>  op.setAllowDuplicateMetadataRefresh(!onlyUseExistingCnx);
>  return pool.executeOn(new ServerLocation(server.getHostName(), 
> server.getPort()), op,
>  true, onlyUseExistingCnx);
> {code}
> executeOn eventually calls ConnectionManagerImpl.borrowConnection 
> It is unclear to me why they need allow duplicate metadata refresh if they 
> permit borrowConnection to create a new connection to the specified server. 
> They all catch and ignore 
> AllConnectionsInUseException falling through and doing non-targetted 
> pool.execute.
> The concern might have been that the pool would be full of connections to 
> other servers preventing us from connecting to the server we need to 
> communicate with. If we have existing connections to the server we could wait 
> for one of those. If we don't and we are at max then we should consider 
> closing a connection to some other server and create a connection to the 
> server we know we need for this operation. As the server count goes up and 
> the max connection count on the pool goes down I'm sure we are probably 
> better off not even trying to do single hop since we might not even be able 
> to have a connection to each server in the pool. But if we have a reasonable 
> max connection count then it seems like we could make more of an effort to 
> make sure a per server connection count is balanced and then wait for one of 
> those connection to be available.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to