Source: openssh
Version: 1:10.3p1-3

Hi,

I found SSH connections to a remote server to be consistently slow
(~15 seconds to get a shell) despite the server being responsive and
geographically close.

The user database for the system in question is on LDAP, and it is quite
large. `getent passwd` takes ~4-5 seconds. However, I expected PAM user
lookups to be targeted to a single user, and `getent passwd ema` is very
fast.
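To make the difference between the two kinds of lookup measurable, here is a rough sketch in Python. It assumes the `pwd` module goes through the same NSS backends as `getent`; the helper names `timed` and `compare_lookups` are made up for this illustration and are not part of sshd or PAM.

```python
import pwd
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def compare_lookups(username):
    """Time a targeted lookup against a full enumeration of the user database."""
    # Targeted lookup: a single query for one user, like `getent passwd ema`.
    try:
        _, single = timed(pwd.getpwnam, username)
    except KeyError:
        single = None  # the user does not exist on this system
    # Full enumeration: walks every entry, like plain `getent passwd`.
    entries, full = timed(pwd.getpwall)
    return single, full, len(entries)


if __name__ == "__main__":
    single, full, count = compare_lookups("ema")
    print(f"targeted lookup: {single} s")
    print(f"full enumeration: {full:.3f} s for {count} entries")
```

On a system with a large LDAP-backed user database, the second figure should be close to the time `getent passwd` takes, while the first stays small.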

It took a while to figure out that the full user database was being
dumped instead, multiple times. This is due to getpwent being called in
user-group-modes.patch. Even setting the highest possible sshd LogLevel
(DEBUG3) did not help much. The only clue in the log was that parsing
authorized_keys somehow took several seconds:

09:25:13 hostname sshd[3242229]: debug1: trying public key file /home/ema/.ssh/authorized_keys
09:25:13 hostname sshd[3242229]: debug1: fd 5 clearing O_NONBLOCK

And:

09:25:18 hostname sshd[3242229]: debug1: /home/ema/.ssh/authorized_keys:3: matching key found

After finding the code path in user-group-modes.patch that calls
getpwent, I changed the permissions of ~/.ssh to 700, and of all files
within it to 600. getpwent is no longer called, and I now get a shell
without the delay.
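For anyone who wants to check whether they might be affected, here is a minimal sketch that lists entries in ~/.ssh with looser permissions than 700/600. It assumes that any group or other permission bit is worth flagging; the patch itself may only care about a subset of those bits. The function name `loose_entries` is hypothetical.

```python
import os
import stat


def loose_entries(ssh_dir):
    """Return (path, mode) pairs for ssh_dir and its entries that have
    any group/other permission bits set."""
    loose = []
    names = sorted(os.listdir(ssh_dir))
    candidates = [ssh_dir] + [os.path.join(ssh_dir, n) for n in names]
    for path in candidates:
        try:
            mode = stat.S_IMODE(os.stat(path).st_mode)
        except OSError:
            continue  # e.g. a dangling symlink; nothing to report
        # 0o077 masks the group and other bits. This is an assumption
        # about what matters, not the exact check the patch performs.
        if mode & 0o077:
            loose.append((path, oct(mode)))
    return loose


if __name__ == "__main__":
    ssh_dir = os.path.expanduser("~/.ssh")
    if os.path.isdir(ssh_dir):
        for path, mode in loose_entries(ssh_dir):
            print(f"{mode} {path}")
```

An empty result corresponds to the state described above, with the directory at 700 and all files within it at 600.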

In the process of debugging this problem, I discovered that nscd does
not cache getpwent calls by design, exacerbating the issue:
https://sourceware.org/pipermail/libc-alpha/2006-August/020774.html

While I do understand the reason for deviating from upstream behavior, I
also wonder if there could be any way to either avoid potentially
expensive getpwent calls, or at least inform the user about what is
going on?

Thanks,
  ema
