Hi Everyone.

Let me revisit an old topic that was discussed with the Subversion team in 
2016. Now that our tool has finally added support for SVN 1.10, we have made a 
breakthrough on this issue after all these years, and I feel it is something 
the Subversion team should be aware of.

There are two topics I’d like to bring up here, as they are inter-connected:

1) Svnserve causing mutex lock contention in threaded mode

  *   Discussed in 
http://subversion.1072662.n5.nabble.com/Better-choice-for-Linux-semaphore-than-spinlock-td204915.html#a204989
  *   I believe this report originated with our enterprise customer, who has 
seen this behavior under high concurrent usage. When svnserve on Linux (RHEL 
7.6) is configured to run in threaded mode, we see the following pattern: all 
the CPU time is consumed under concurrent load. In the worst case that we have 
seen, almost all the CPU is consumed by system time, presumably related to the 
spinlock contention discussed in the thread above.
  *   According to our developer's analysis, svnserve behaves very differently 
depending on whether there are enough threads in the svnserve pool. 
Quoting our dev: “Svnserve waits on socket read if there is enough threads in 
pool. But it behaves a bit differently if more than half of threads from pool 
are occupied by work. In that case, it immediately returns thread after each 
command/operation back to the pool which is again trying to get out of pool as 
there is a lot of work to do – and this is point of locking. It does also 
processing using round-robin, which will intentionally prolong connection 
operations I think, trying to reduce load back to normal state. Otherwise, if 
there are enough threads in pool (lets say by default under 128 threads), these 
active threads are not returned back to pool after each command and just 
processing next commands in command queue for given connection and it is fully 
concurrent without lock.”
  *   Now that we have added support for SVN 1.10, with its configuration 
options for tuning the number of threads, we were able to run more experiments 
based on Stefan Fuhrmann’s recommendation 
(http://subversion.1072662.n5.nabble.com/Deadlock-like-behaviour-of-svnserve-in-multi-threaded-mode-T-tp196421p196500.html). 
When we apply the recommended tuning options --min-threads 64 --max-threads 
1024, the situation improves significantly. See the figures below.

Svnserve in threaded mode – no threads tuning
[cid:image002.jpg@01D61561.8913B730]

Svnserve in threaded mode – no threads tuning, worst case
[cid:image008.jpg@01D61561.8913B730]

Svnserve in threaded mode – tuned with --min-threads 64 --max-threads 1024
[cid:image009.jpg@01D61561.8913B730]


  *   The conclusion is that svnserve from version 1.10 can be configured to 
support the necessary concurrency, but there is a lack of guidance, and 
possibly of logging, that would lead admins to the proper configuration. I am 
bringing it up here for your consideration, in case you want to act on this 
feedback.

2) Deadlock-like behaviour of svnserve in multi-threaded mode (-T)

  *   The second problem is also related to threaded mode. We have now tackled 
it for the third time, because it was a significant robustness problem: it 
stalled our application with hundreds of concurrent users and was therefore 
escalated by our enterprise customers.
  *   Discussed in 
http://subversion.1072662.n5.nabble.com/Deadlock-like-behaviour-of-svnserve-in-multi-threaded-mode-T-tt196421.html#a196500
 and also tracked as https://issues.apache.org/jira/browse/SVN-4626

  *   Our scenario that leads to this problem is the following: our tool, 
Polarion ALM, at times performs a re-indexing operation in which it pulls a lot 
of data from SVN over parallel connections. It is also used by hundreds of 
concurrent users, and at times that concurrent usage and the resulting 
connection creation leads to the same problem. The newly created connections to 
svnserve stall completely and, due to internal locking in Polarion, all 
communication with our backend stops until the connection times out minutes 
later.
  *   The stalling occurs when multiple SVN connections are opened at the same 
time, and only when svnserve is running in threaded mode. Threaded mode is the 
default on Windows and can be enabled by configuration on Linux.
  *   We involved the SVNKit team in the latest analysis, and Alex Kitaev 
was very helpful. Let me again quote Alex: “I started to reduce number 
of parallel threads and when issue was reproducible with even two threads I've 
realized that the problem might be related to socket.connect call rate. 
Somehow, frequent connection establishment led to failures - connection state 
was displayed as Established, but no data was read from it. So, the workaround 
I've found so far, is to make sure SVNRepository instances are created 
subsequently along with "testConnection" call on the instance, with minimal 
delay between "testConnection" calls. In my pure socket test a delay of 10ms 
resolved the problem, with SVNRepository just subsequent calls to 
repository.testConnection was enough, due to testConnection call overhead. I 
didn't find any reference to this or similar issue on the internet, but I 
suspect that it might be either Windows configuration option, or APR used by 
Subversion that might use current time for some sort of socket/connection id 
and then mixes sockets up.”
  *   This analysis points to a potential problem on the SVN side that may 
cause the stalling. We were able to work around it by introducing “rate 
limiting” for the creation of new SVN connections, which prevents connections 
from being created concurrently. FYI, our app creates some hundreds of 
connections in enterprise environments.
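
The rate-limiting workaround described above can be sketched roughly as 
follows. This is only an illustration, not the actual Polarion or SVNKit code: 
the class name ConnectionRateLimiter, its open method and the 10 ms gap (taken 
from the delay reported for the pure socket test) are assumptions made for the 
example. The connector passed to open stands in for whatever really creates the 
connection, for instance creating an SVNRepository instance and calling 
testConnection on it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Hypothetical helper: lets only one thread at a time create a connection
 * and keeps a minimum gap between the end of one creation and the start
 * of the next one.
 */
final class ConnectionRateLimiter {
    // Fair lock, so waiting threads are served roughly in arrival order.
    private final ReentrantLock lock = new ReentrantLock(true);
    private final long minGapNanos;
    private long lastEndNanos;   // guarded by lock
    private boolean hasOpened;   // guarded by lock

    ConnectionRateLimiter(long minGapMillis) {
        if (minGapMillis < 0) {
            throw new IllegalArgumentException("minGapMillis must be >= 0");
        }
        this.minGapNanos = TimeUnit.MILLISECONDS.toNanos(minGapMillis);
    }

    /** Runs the connector while holding the lock, after the gap has elapsed. */
    <T> T open(Callable<T> connector) throws Exception {
        lock.lock();
        try {
            if (hasOpened) {
                long remaining;
                // Loop because a sleep is not guaranteed to be exact.
                while ((remaining = minGapNanos
                        - (System.nanoTime() - lastEndNanos)) > 0) {
                    TimeUnit.NANOSECONDS.sleep(remaining);
                }
            }
            hasOpened = true;
            try {
                return connector.call();
            } finally {
                lastEndNanos = System.nanoTime();
            }
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) throws Exception {
        // Assumed value: 10 ms, as in the pure socket test quoted above.
        ConnectionRateLimiter limiter = new ConnectionRateLimiter(10);
        for (int i = 1; i <= 3; i++) {
            final int id = i;
            // Placeholder connector; real code would open the SVN connection.
            String result = limiter.open(() -> "connection " + id + " opened");
            System.out.println(result);
        }
    }
}
```

Holding the lock for the whole creation step trades connection start-up 
throughput for the guarantee that no two connections are being established at 
the same moment. Once a connection is open it is used without the limiter.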

In conclusion, we are now able to overcome both of these long-standing 
problems after adding support for SVN 1.10. We just wanted to share our 
findings so that the SVN team is aware of them. We understand that our usage of 
SVN is a bit special, but I feel our findings may help make SVN a bit better 
and prevent problems for other users.

Thank you SVN team for your support!

Best regards,
Radek Krotil

Siemens Digital Industries Software
Polarion ALM Product Management
polarion.plm.automation.siemens.com<https://polarion.plm.automation.siemens.com/>


-----------------
Siemens Industry Software, s.r.o.
Praha 4, Doudlebská 1699/5, PSČ 140 00
Company ID (IČ) 256 51 897
Registered in the Commercial Register kept by the Municipal Court in Prague, 
Section C, File 58222

Important Note: This message is only of informative nature. The content of this 
message shall not be binding for the sender, and the sender does not intend to 
conclude a contract, accept an offer or confirm the conclusion of a contract by 
this message, nor does this message represent pre-contractual liability of the 
sender, unless the sender explicitly states otherwise in the message. The 
content of this message (including appendices) shall be confidential. Should 
you not be the intended recipient of this message, any access, copying, 
distribution or use of the content of this message is strictly prohibited and 
in such case, please immediately notify the sender and subsequently delete the 
entire message (including appendices) from your system.