Launchpad has imported 39 comments from the remote bug at https://bugzilla.mindrot.org/show_bug.cgi?id=1213.
If you reply to an imported comment from within Launchpad, your comment will be sent to the remote bug automatically. Read more about Launchpad's inter-bugtracker facilities at https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2006-07-26T04:17:58+00:00 Tryponraj wrote:

Hello All,

I'm using OpenSSH 4.3p2 and trying to scan a list of 40 machines in my network with the ssh-keyscan utility. I used the following command:

ssh-keyscan -t rsa -f hosts.txt

The man page says that this utility displays the host keys irrespective of whether ssh or the host is up or down, and it's working great. But if the scan stops at the 30th host due to some protocol problem, the utility exits and doesn't display the host keys for the remaining machines. I think this is expected behaviour, but it would be better to ignore that host and continue till the end, or at least this could be documented specifically in the man page.

I dug into this problem further; my findings are below.

ssh-keyscan ignores hosts that are not up, or on which sshd is not running, when used with the -f <file> option. But when it encounters any error while retrieving the host key from a machine that is up and has sshd running, it simply exits. This may happen due to the transport layer implementation in packet.c at packet_read_poll_seqnr(), which results in exiting. My guess is that, as packet.c is used by all OpenSSH utilities including ssh-keyscan, we can't make ssh-keyscan continue with the remaining hosts specified in -f <file> in case of an error. But I also vote for at least documenting this one.
Detailed debug traces are given below: -------------------------------------- # ssh-keyscan -vvv -t rsa host.server.com debug2: fd 3 setting O_NONBLOCK debug1: no match: mpSSH_0.1.0 # host.server.com SSH-2.0-mpSSH_0.1.0 debug1: Enabling compatibility mode for protocol 2.0 debug3: RNG is ready, skipping seeding debug1: SSH2_MSG_KEXINIT sent Received disconnect from 16.245.97.226: 11: SSH Disabled # ssh -vvv host.server.com OpenSSH_4.3p2-hpn, OpenSSL 0.9.7i 14 Oct 2005 HP-UX Secure Shell-A.04.30.005, HP-UX Secure Shell version debug1: Reading configuration data /opt/ssh/etc/ssh_config debug3: RNG is ready, skipping seeding debug2: ssh_connect: needpriv 0 debug1: Connecting to host.server.com [16.245.97.226] port 22. debug1: Connection established. debug1: permanently_set_uid: 0/3 debug1: identity file /.ssh/identity type 0 debug3: Not a RSA1 key file /.ssh/id_rsa. debug2: key_type_from_name: unknown key type '-----BEGIN' debug3: key_read: missing keytype debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug3: key_read: missing whitespace debug2: key_type_from_name: unknown key type '-----END' debug3: key_read: missing keytype debug1: identity file /.ssh/id_rsa type 1 debug3: Not a RSA1 key file /.ssh/id_dsa. 
debug2: key_type_from_name: unknown key type '-----BEGIN'
debug3: key_read: missing keytype
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug2: key_type_from_name: unknown key type '-----END'
debug3: key_read: missing keytype
debug1: identity file /.ssh/id_dsa type 2
debug1: Remote protocol version 2.0, remote software version mpSSH_0.1.0
debug1: no match: mpSSH_0.1.0
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_4.3p2-hpn
debug2: fd 4 setting O_NONBLOCK
debug3: RNG is ready, skipping seeding
debug1: SSH2_MSG_KEXINIT sent
Received disconnect from 16.245.97.226: 11: SSH Disabled

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/0

------------------------------------------------------------------------
On 2006-10-01T20:39:30+00:00 Paul Wouters wrote:

I was going to open a new bug report, but I think I am reporting the same bug as this one.

ssh-keyscan aborts when it encounters glue without the proper authoritative data, e.g.:

hostname.domain.com IN NS hostname.domain.com
hostname.domain.com IN A 1.2.3.4

where hostname.domain.com is itself not running a nameserver. It is correct in not processing this entry, as the glue is non-authoritative data and cannot be confirmed by the nameserver of the child zone. However, ssh-keyscan should just skip this entry, not abort.

I noticed this when writing ftp://ftp.xelerance.com/sshfp/ which is a Python script that can use ssh-keyscan (or known_hosts files) to generate SSHFP records.
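
Until ssh-keyscan itself skips a misbehaving host, one possible workaround for the abort described in the comments above is to run a separate ssh-keyscan process per host, so that an exit code of 255 for one host cannot cut the rest of the list short. The following is only a minimal sketch under that assumption: the function names `run_keyscan` and `scan_hosts` are hypothetical, and only the `-t` (key type) and `-T` (timeout) options of ssh-keyscan are used.

```python
import subprocess
from typing import Callable, Iterable, List, Tuple

# (exit code, stdout, stderr) for one scan attempt
ScanResult = Tuple[int, str, str]


def run_keyscan(host: str, key_type: str = "rsa", timeout: int = 5) -> ScanResult:
    """Run one ssh-keyscan process for a single host.

    Assumes an `ssh-keyscan` binary is on PATH; raises FileNotFoundError otherwise.
    """
    proc = subprocess.run(
        ["ssh-keyscan", "-T", str(timeout), "-t", key_type, host],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


def scan_hosts(
    hosts: Iterable[str],
    run: Callable[[str], ScanResult] = run_keyscan,
) -> Tuple[List[str], List[str]]:
    """Scan hosts one at a time and return (key lines, hosts that yielded no key)."""
    keys: List[str] = []
    failed: List[str] = []
    for host in hosts:
        code, out, _err = run(host)
        # Keep only real key lines; skip blanks and '#' comment lines.
        lines = [ln for ln in out.splitlines() if ln and not ln.startswith("#")]
        if code == 0 and lines:
            keys.extend(lines)
        else:
            # A non-zero exit or empty output only affects this one host.
            failed.append(host)
    return keys, failed
```

This trades the parallelism of a single ssh-keyscan run for isolation, so it is likely much slower on lists of several thousand hosts; the `failed` list at least accounts for every host that produced no key.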
Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/1 ------------------------------------------------------------------------ On 2007-03-13T05:00:18+00:00 Senthilkumar-sen wrote: Is there any chance that this bug will get fixed for the next release? Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/2 ------------------------------------------------------------------------ On 2010-11-23T01:00:50+00:00 Aab wrote: Created attachment 1961 One attempt at getting the rsa key from a remote server that was having a number of problems. Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/4 ------------------------------------------------------------------------ On 2010-11-23T01:04:06+00:00 Aab wrote: I believe I've encountered the same or similar ssh-keyscan problem. local ssh - OpenSSH_5.1p1 Debian-5, OpenSSL 0.9.8g 19 Oct 2007 remote ssh - OpenSSH_4.3p2, OpenSSL 0.9.8e-fips-rhel5 01 Jul 2008 The remote server was having "problems": 1) no connection; 2) connection and key returned; or 3) connection but hanging until remote time out and disconnect. With the latter, ssh-keyscan aborted immediately with exit-code=255 (see attachment). I disagree with the original poster in that I think that ssh-keyscan should continue in all cases except for an internal error. In our case, ssh-keyscan is buried several layers deep in wrapper scripts where it is being fed (today) 3690+ host names. Per the man pages, I was expecting it to continue regardless of what the remote servers did or didn't do. Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/5 ------------------------------------------------------------------------ On 2010-12-03T00:19:46+00:00 Aab wrote: Created attachment 1969 Fix(?) for premature ssh-keyscan abort. This adds a local/static `cleanup_exit()' function to ssh-keyscan so that aborts in non-ssh-keyscan code can be converted to "continue"s while the `dispatch_run()' function is being executed. 
It mimics the already extant local/static `fatal()' function in using `exit()' instead of the `_exit()' used in the default cleanup.c. Two observations: 1) I also incremented the `howmany()' argument #1 count by 1. This is probably unnecessary but I note that all other occasions where `howmany()' is used do this (and I'm chicken ...). 2) The current local/static `fatal()' function could possibly be removed and the default one, defined in fatal.c, be used. Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/6 ------------------------------------------------------------------------ On 2011-02-17T15:36:54+00:00 Count-mindrot wrote: I'm running into the same problem on recent versions. Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/7 ------------------------------------------------------------------------ On 2011-02-17T18:31:44+00:00 Count-mindrot wrote: btw: I've elevated this to 'major', as it completely breaks the usefulness for ssh-keyscan in large networks, as the error condition (len == 0 in packet_read_seqnr() in packet.c; resulting in logit("Connection closed ... etc") and cleanup_exit(255);) is much easier to hit. On 10 runs of ssh-keyscan over ~3800 IPs I couldn't get a single complete run without hitting this. Please fix. Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/8 ------------------------------------------------------------------------ On 2011-02-17T23:58:23+00:00 Aab wrote: Mr. Kotes, I have a patch against openssh-5.[678]p1 for our problem that could be called a workaround or a fix depending on your way of looking at it. The probable reason that `packet_read_seqnr()' gets the len==0 is that one of the IPs from which your attempting to get a key has a bad `sshd' server that times out because of the "LoginGraceTime". This, in turn, causes almost all of the other servers that have open sockets at that time to "LoginGraceTime" out as well. 
To back up a bit, `packet_read_seqnr()' calls the vanilla `cleanup_exit()', which in the current ssh-keyscan aborts immediately rather than continuing like ssh-keyscan's `fatal()' call does. This is part 1 of the fix. The second part is to teach ssh-keyscan how to deal with the problem when a bad server times out. My patch does both, although the code seems a bit kludgy to me. Unfortunately, we haven't had a bad server recently so I can't completely test the patch (I'm using it in test mode now) and, until then, I don't want to send it to the OpenSSH folks.

FWIW - our host farm is 3500+ with an additional 1200+ to be online soon and probably more in the late summer.

In my opinion, this should be marked as a bug against the current openssh variant. How do I go about doing that?

If you'd like to have a copy of the current patch so you can test it, please tell me where to send it.

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/9

------------------------------------------------------------------------
On 2011-02-18T00:04:12+00:00 Aab wrote:

I've noted that this is a ssh-keyscan bug and I've attached it to the openssh-5.8p1 release.

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/10

------------------------------------------------------------------------
On 2011-02-18T00:07:01+00:00 Aab wrote:

Oops, can't read. ssh-keygen ain't ssh-keyscan. Changed the component back to Miscellaneous. Hey, isn't ssh-keyscan a component also?

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/11

------------------------------------------------------------------------
On 2011-02-23T15:55:54+00:00 Daniel Richard G. wrote:

I reported this a while ago on the Ubuntu Launchpad bug tracker:

https://bugs.launchpad.net/openssh/+bug/483928

I've also confirmed that the bug persists in OpenSSH 5.8p1, and I gave your patch a try to scan a corporate network of 6000+ hosts.
Most of the hosts don't appear to be running SSH, but I can't be sure if that's really the case, or if ssh-keyscan(1) is bugging out on many of the connections. It does run through to the end of the list, but with some anomalies, like "Connection closed by A.B.C.D" or "Received disconnect from A.B.C.D: 2: Client Disconnect" messages that crop up multiple times for the same IP address.

Is it possible that one bad connection can still take down active good connections, even with this patch?

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/13

------------------------------------------------------------------------
On 2011-02-23T17:32:02+00:00 Aab wrote:

Ummm. If you're referring to the "original" patch that I submitted, it's out-of-date. It was written before I had a complete(?) handle on what was going wrong. Included with this comment is an attachment with the newer patch against the openssh-5.8p1 source.

A bit of explanation. Some of the mods are for clarity. When you're working, as we are, with a large number of hosts, "socket" doesn't tell you very much as to where the problem is occurring. Same with "Bad hostkey alg". In the patch, I've attempted to allow `ssh-keyscan' to continue if the encountered problem is external in origin. Some of the items that you noticed are (I think) addressed by this patch.

NOTE - NOTE - NOTE - this patch has NOT been completely verified. The "closed by remote because of LoginGraceTime" outs need a bad remote server so that that can be done. Unfortunately, all of our servers are playing nice-nice at present. I did have an earlier buggy variant of the patch that "tried" to execute the patch code but I screwed up and generated an infinite loop instead.

The basic code is running as the `ssh-keyscan' of choice in our setup.
Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/14 ------------------------------------------------------------------------ On 2011-02-23T17:40:19+00:00 Aab wrote: Created attachment 2000 openssh-5.8p1 - patch for ssh-keyscan Is this comment different from the other one???? Later (better?) patch to fix `ssh-keyscan's premature aborting observed in large network scans. Hopefully, there are sufficient comments in the code to describe the fix. Please ask if you find something annoying. I also have patches for 5.6p1 and 5.7p1. Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/15 ------------------------------------------------------------------------ On 2011-02-24T15:50:54+00:00 Daniel Richard G. wrote: With this updated patch, I'm seeing at least twice as many host keys returned than before (up to ~2400, from ~1000), and the "multiple errors from the same IP" oddness is gone now. The more-specific error messages are very helpful. I do notice that hosts which are firewalled or otherwise fail to yield a server banner are not cited with an error message to stderr. I think this would be useful if it can be done, that every host listed in the input is spoken for one way or the other in the output, because that way you can be sure that no host is being silently dropped by the program. Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/16 ------------------------------------------------------------------------ On 2011-02-27T02:24:50+00:00 Aab wrote: Created attachment 2005 Upgraded(?) patch to include extra ssh-keyscan logging. Try this to log all attempt failures. I put it under control of a command line option, '-L'. One failure noted by ssh-keyscan is the ECONNREFUSED that I think should have caused a standard error message to be elided. Except for the ECONNREFUSED, all of the new messages are written by the `logit()' function. 
FWIW - this patch may or may not obsolete the patch supplied with attachment 2000 so I didn't check the obsolete:2000 box. I didn't test this patch out very thoroughly but what testing I did showed what I wanted. Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/17 ------------------------------------------------------------------------ On 2011-02-27T09:09:55+00:00 Daniel Richard G. wrote: aab, thanks for putting together this updated patch. I gave it a try, and whether due to the patch or another issue that I hadn't encountered before, it bombed out with this error: [...] # A.B.C.D SSH-2.0-dropbear_0.50 # W.X.Y.Z SSH-1.99-OpenSSH_3.9p1 # A.B.C.E SSH-2.0-dropbear_0.50 Connection closed by A.B.C.E conalloc: attempt to reuse fdno 47 make: *** [ssh_known_hosts.unx.new] Error 255 A couple of ancillary notes on the patch: 1. The old and new filenames both have the .orig extension! I had to edit one of each pair so that the patch could apply. 2. IMO, there isn't a need to add a new -L option... are "Connection closed" and e.g. "no 'blah' hostkey alg(s)" really categorically distinct to the end user? Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/18 ------------------------------------------------------------------------ On 2011-02-27T18:16:05+00:00 Aab wrote: ># A.B.C.D SSH-2.0-dropbear_0.50 ># W.X.Y.Z SSH-1.99-OpenSSH_3.9p1 ># A.B.C.E SSH-2.0-dropbear_0.50 >Connection closed by A.B.C.E >conalloc: attempt to reuse fdno 47 >make: *** [ssh_known_hosts.unx.new] Error 255 Oh boy, I missed something. Is this repeatable? I think I saw this myself somewhere along the line but I thought I had fixed the problem. Since my time is pretty much taken up for the next week or so, I don't know when I'll be able to check. >1. The old and new filenames both have the .orig extension! I had to >edit one of each pair so that the patch could apply. I just looked at the attachment. There are two ".orig"s per file. 
One is on the `diff' statement and is ignored (I hope) by `patch'. The second is one line down on the "old" file identifier (---) and `patch' does use that. Which one was your `patch' making complaints about? >2. IMO, there isn't a need to add a new -L option... are "Connection >closed" and e.g. "no 'blah' hostkey alg(s)" really categorically >distinct to the end user? STDERR is extremely noisy as it is. In my case, at this time, I think I'd add on the order of 7000+ extra lines when I use '-L' that I'd need to winnow to find any important data. Besides, you can't forget that god called "upward compatibility" you know (;-}). And yes, if you meant "Connection timed out", I think that they are distinct at least from a Systems Administrator (me) point of view. Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/19 ------------------------------------------------------------------------ On 2011-03-01T03:42:38+00:00 Daniel Richard G. wrote: (In reply to comment #17) > > Oh boy, I missed something. Is this repeatable? I think I saw this > myself somewhere along the line but I thought I had fixed the problem. > Since my time is pretty much taken up for the next week or so, I don't > know when I'll be able to check. Well, I tried it again, and it ran to completion. Must be a rare failure mode. > I just looked at the attachment. There are two ".orig"s per file. One > is on the `diff' statement and is ignored (I hope) by `patch'. The > second is one line down on the "old" file identifier (---) and `patch' > does use that. Which one was your `patch' making complaints about? Presumably the second one. It was looking for e.g. kex.c.orig rather than kex.c. > STDERR is extremely noisy as it is. In my case, at this time, I think > I'd add on the order of 7000+ extra lines when I use '-L' that I'd need > to winnow to find any important data. Besides, you can't forget that > god called "upward compatibility" you know (;-}). 
> > And yes, if you meant "Connection timed out", I think that they are > distinct at least from a Systems Administrator (me) point of view. *shrugs* I'd pretty much expect a flood of information anyway. Given a large network, you have to use grep(1) or the like to make any sense of it. Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/20 ------------------------------------------------------------------------ On 2011-03-02T02:29:26+00:00 Aab wrote: Created attachment 2008 patch - fixes bug in previous patch >> Oh boy, I missed something. Is this repeatable? I think I saw this >> myself somewhere along the line but I thought I had fixed the problem. >> Since my time is pretty much taken up for the next week or so, I don't >> know when I'll be able to check. > >Well, I tried it again, and it ran to completion. Must be a rare >failure mode. Yep, I missed something. The sockets associated with ALL connections processed by the `keygrab_ssh2()' function are closed twice. I missed the close in the `packet.c:packet_close()' function that's called at the bottom of the `keygrab_ssh2()' function. I had assumed (bad bad word) that the only close was in the `confree()' function. Work/not work is up to the gods and the relative connection timings I think. >> I just looked at the attachment. There are two ".orig"s per file. One >> is on the `diff' statement and is ignored (I hope) by `patch'. The >> second is one line down on the "old" file identifier (---) and `patch' >> does use that. Which one was your `patch' making complaints about? > >Presumably the second one. It was looking for e.g. kex.c.orig rather >than kex.c. The format of this patch is the same as before. If you are using the current GNU `patch', you should be able to `patch [-p0] < patch' in the "openssh-5.8p1" parent directory. If your in the "openssh-5.8p1" directory itself, you should be able to `patch -p1 <patch'. >> STDERR is extremely noisy as it is. 
In my case, at this time, I think >> I'd add on the order of 7000+ extra lines when I use '-L' that I'd need >> to winnow to find any important data. Besides, you can't forget that >> god called "upward compatibility" you know (;-}). >> >> And yes, if you meant "Connection timed out", I think that they are >> distinct at least from a Systems Administrator (me) point of view. > >*shrugs* I'd pretty much expect a flood of information anyway. Given a >large network, you have to use grep(1) or the like to make any sense of >it. I think that, if/when this patch is actually submitted to the OpenSSH folks, I'll let the mavins there decide whether or not to have a '-L' option. To satisfy my curiosity, did you observe any missing hosts when you use the '-L' option (and it actually completes)? Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/21 ------------------------------------------------------------------------ On 2011-03-02T08:23:42+00:00 Daniel Richard G. wrote: (In reply to comment #19) > > Yep, I missed something. The sockets associated with ALL connections > processed by the `keygrab_ssh2()' function are closed twice. I missed > the close in the `packet.c:packet_close()' function that's called at > the bottom of the `keygrab_ssh2()' function. I had assumed (bad bad > word) that the only close was in the `confree()' function. Work/not > work is up to the gods and the relative connection timings I think. I tried the new patch, and no errors. I'll give it a few more runs to see if anything breaks again. > The format of this patch is the same as before. If you are using the > current GNU `patch', you should be able to `patch [-p0] < patch' in the > "openssh-5.8p1" parent directory. If your in the "openssh-5.8p1" > directory itself, you should be able to `patch -p1 <patch'. Oh, I know about -p0 vs. -p1 and such. The problem is that the patch, as up currently, looks for foo.c.orig instead of foo.c. 
In other words,

--- dir/foo.c.orig
+++ dir/foo.c.orig (WRONG)

--- dir/foo.c.orig
+++ dir/foo.c (CORRECT)

> I think that, if/when this patch is actually submitted to the OpenSSH
> folks, I'll let the mavins there decide whether or not to have a '-L'
> option.

Fair enough, though I think there might be more value in just (unconditionally) printing a tally at the end of how many valid hosts were found, how many had no host algs, etc. (a bit like what "md5sum -c" does when it encounters errors).

> To satisfy my curiosity, did you observe any missing hosts when you use
> the '-L' option (and it actually completes)?

Ah, I forgot to report on this; my bad!

I do see a few hosts in the input list that are not mentioned anywhere in the stderr output. These appear to be strictly "alias" IP addresses, e.g. for an input line of

10.0.0.1,10.0.0.2,10.0.0.3 host.example.com,10.0.0.1,10.0.0.2,...
         ^^^^^^^^ ^^^^^^^^
              these

This is the correct behavior, I take it?

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/22

------------------------------------------------------------------------
On 2011-03-02T18:34:29+00:00 Aab wrote:

(In reply to comment #20)
> (In reply to comment #19)
>
>> The format of this patch is the same as before. If you are using the
>> current GNU `patch', you should be able to `patch [-p0] < patch' in the
>> "openssh-5.8p1" parent directory. If your in the "openssh-5.8p1"
>> directory itself, you should be able to `patch -p1 <patch'.
>
>Oh, I know about -p0 vs. -p1 and such. The problem is that the patch,
>as up currently, looks for foo.c.orig instead of foo.c. In other words,
>
> --- dir/foo.c.orig
> +++ dir/foo.c.orig (WRONG)
>
> --- dir/foo.c.orig
> +++ dir/foo.c (CORRECT)

Hmmm, but the patch doesn't have two consecutive lines with ".orig" as you describe above.
From observation, the first three lines for each modified file are similar to diff -u openssh-5.8p1/kex.c.orig openssh-5.8p1/kex.c --- openssh-5.8p1/kex.c.orig 2010-09-24 08:11:14.000000000 -0400 +++ openssh-5.8p1/kex.c 2011-02-11 18:14:03.396688000 -0500 Are you using the GNU patch? The attached patch text works for me with no changes whatsoever. Or to ask it somewhat differently, does your `patch' process WRONG even though the text is actually CORRECT? Is it possible that your`patch' is not ignoring the "diff" line? >> I think that, if/when this patch is actually submitted to the OpenSSH >> folks, I'll let the mavins there decide whether or not to have a '-L' >> option. > > Fair enough, though I think there might be more value in just > (unconditionally) printing a tally at the end of how many valid hosts > were found, how many had no host algs, etc. (a bit like what "md5sum > -c" does when it encounters errors). Actually, after I had sent the previous, I thought I should have added that the described approach is a cop out on my part (;-}). >> To satisfy my curiosity, did you observe any missing hosts when you use >> the '-L' option (and it actually completes)? > > Ah, I forgot to report on this; my bad! > > I do see a few hosts in the input list that are not mentioned anywhere > in the stderr output. These appear to be strictly "alias" IP addresses, > e.g. for an input line of > > 10.0.0.1,10.0.0.2,10.0.0.3 host.example.com,10.0.0.1,10.0.0.2,... > ^^^^^^^^ ^^^^^^^^ > these > > This is the correct behavior, I take it? I submit hosts, one per line, as the data to ssh-keyscan and am not familiar with the "alias" format. In fact, your comments clarified it somewhat for me. If you meant that "10.0.0.1" was seen in stderr and the others weren't, I believe that this is the "correct" behavior if ssh-keyscan had success with "10.0.0.1". I think the code tells me that it stops looking after the first IP/host with which it has success. 
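
The behavior described in the comment above (addresses in the first column are tried in order, and the scan stops at the first one that succeeds) can be illustrated with a small sketch. It models only what the thread describes; it is not ssh-keyscan's actual code. The names `parse_input_line`, `first_responding`, and the `probe` callback are hypothetical.

```python
from typing import Callable, List, Optional, Tuple


def parse_input_line(line: str) -> Tuple[List[str], str]:
    """Split an 'addr1,addr2 name1,name2,...' line into (addresses, name list).

    If there is only one column, it is used both as the address list
    and as the name list.
    """
    fields = line.split()
    addresses = fields[0].split(",")
    names = fields[1] if len(fields) > 1 else fields[0]
    return addresses, names


def first_responding(
    addresses: List[str],
    probe: Callable[[str], Optional[str]],
) -> Optional[Tuple[str, str]]:
    """Try each address in order; return (address, key) for the first success.

    `probe` stands in for the real key fetch and returns None on failure.
    Addresses after the first success are never contacted, which would
    explain why they are missing from the stderr output.
    """
    for addr in addresses:
        key = probe(addr)
        if key is not None:
            return addr, key
    return None
```

With the example input line from the thread, a probe that succeeds on 10.0.0.1 means 10.0.0.2 and 10.0.0.3 are never tried.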
Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/23

------------------------------------------------------------------------
On 2011-03-03T04:02:12+00:00 Daniel Richard G. wrote:

(In reply to comment #21)
>
> Hmmm, but the patch doesn't have two consecutive lines with ".orig" as
> you describe above. From observation, the first three lines for each
> modified file are similar to
>
> diff -u openssh-5.8p1/kex.c.orig openssh-5.8p1/kex.c
> --- openssh-5.8p1/kex.c.orig 2010-09-24 08:11:14.000000000 -0400
> +++ openssh-5.8p1/kex.c 2011-02-11 18:14:03.396688000 -0500

Um. Are we looking at the same file? Here are the first three lines of your most recent patch (attachment 2008, in comment #19):

--- openssh-5.8p1/kex.c.orig 2010-09-24 08:11:14.000000000 -0400
+++ openssh-5.8p1/kex.c.orig 2011-02-11 18:14:03.396688000 -0500
@@ -49,6 +49,7 @@

> Are you using the GNU patch? The attached patch text works for me with
> no changes whatsoever. Or to ask it somewhat differently, does your
> `patch' process WRONG even though the text is actually CORRECT? Is it
> possible that your`patch' is not ignoring the "diff" line?

This is on an Ubuntu Linux system:

host:/tmp/openssh-5.8p1$ patch -p1 --dry-run <aab-2008.patch
patching file kex.c.orig
Hunk #1 FAILED at 49.
Hunk #2 FAILED at 367.
2 out of 2 hunks FAILED -- saving rejects to file kex.c.orig.rej
patching file packet.c.orig
Hunk #1 FAILED at 1025.
Hunk #2 FAILED at 1035.
Hunk #3 FAILED at 1100.
3 out of 3 hunks FAILED -- saving rejects to file packet.c.orig.rej
[...]

If I edit each "+++" line in the patch, it applies cleanly.

> I submit hosts, one per line, as the data to ssh-keyscan and am not
> familiar with the "alias" format. In fact, your comments clarified it
> somewhat for me. If you meant that "10.0.0.1" was seen in stderr and
> the others weren't, I believe that this is the "correct" behavior if
> ssh-keyscan had success with "10.0.0.1".
I think the code tells me > that it stops looking after the first IP/host with which it has > success. Okay, that seems reasonable. (Yes, I only saw 10.0.0.1 and not the other two.) The sample "Input format" line in the ssh-keyscan man page has two IP addresses in the first column, though the semantics of this are left unexplained. My assumption is that it's meant for hosts with round- robined DNS names, where the SSH server at each address uses the same host keys. (Which would be consistent with what you describe.) Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/24 ------------------------------------------------------------------------ On 2011-03-03T05:03:01+00:00 Aab wrote: (In reply to comment #22) > (In reply to comment #21) >> >> Hmmm, but the patch doesn't have two consecutive lines with ".orig" as >> you describe above. From observation, the first three lines for each >> modified file are similar to >> >> diff -u openssh-5.8p1/kex.c.orig openssh-5.8p1/kex.c >> --- openssh-5.8p1/kex.c.orig 2010-09-24 08:11:14.000000000 -0400 >> +++ openssh-5.8p1/kex.c 2011-02-11 18:14:03.396688000 -0500 > > Um. Are we looking at the same file? Here are the first three lines of > your most recent patch (attachment 2008 [details], in comment #19): > > --- openssh-5.8p1/kex.c.orig 2010-09-24 08:11:14.000000000 -0400 > +++ openssh-5.8p1/kex.c.orig 2011-02-11 18:14:03.396688000 -0500 > @@ -49,6 +49,7 @@ Boy, I'm not sure that we are looking at the same file. I just did a wget -Ojunk https://bugzilla.mindrot.org/attachment.cgi?id=2008 and got my version. When I click on the attachment line near the top of the bug #1213 comments (this page - "patch - fixes bug ..."), I get my version. Clicking on the "details" button that you specified above, I get my version. Have we encountered a bug in yet another utility? Browser problem? I should have thanked you earlier for "testing" the patch so I'll do so now - THANKS. 
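
For anyone who ends up with a copy of the patch whose "+++" lines also carry the ".orig" suffix, the manual edit mentioned earlier in this thread (fixing each "+++" line so the patch applies) could be automated roughly as follows. This is a sketch, not part of any attachment on this bug: the function name `fix_new_file_headers` is hypothetical, and it assumes a unified diff in which each "+++" header directly follows its "---" header.

```python
import re

# Matches "+++ <path>.orig" where ".orig" is followed by whitespace or end of line.
_NEW_HEADER = re.compile(r"^(\+\+\+ \S+)\.orig(?=\s|$)")


def fix_new_file_headers(patch_text: str) -> str:
    """Drop a trailing '.orig' from the path on each '+++' header line.

    Only a '+++ ' line that immediately follows a '--- ' line is treated
    as a header, so ordinary added lines in hunks are left alone in the
    common case. The '---' (old file) headers are not modified.
    """
    out = []
    prev = ""
    for line in patch_text.splitlines(keepends=True):
        if prev.startswith("--- ") and line.startswith("+++ "):
            line = _NEW_HEADER.sub(r"\1", line)
        out.append(line)
        prev = line
    return "".join(out)
```

Running the result through `patch -p1 --dry-run` first, as in the transcript above, is a cheap way to confirm that the hunks now target kex.c rather than kex.c.orig.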
Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/25 ------------------------------------------------------------------------ On 2011-03-03T06:51:48+00:00 Daniel Richard G. wrote: Okay, I think I see what's going on here. When you click on the "attachment 2008" link, you're taken to a fancy side-by-side rendition of the diff. At the top, there are a series of links: View | Details | Raw Unified | Return to bug 1213 | Differences ... I was clicking on "Raw Unified," and got the broken patch. "View" goes to the URL you gave (which yields the correct patch). Confusing, isn't it? Anyway, I'm happy to test your patches, because that means I can get the company-wide ssh_known_hosts file I've been needing so much :-) Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/26 ------------------------------------------------------------------------ On 2011-03-04T01:07:11+00:00 Aab wrote: (In reply to comment #24) > Okay, I think I see what's going on here. > > When you click on the "attachment 2008 [details]" link, you're taken to a > fancy > side-by-side rendition of the diff. At the top, there are a series of > links: > > View | Details | Raw Unified | Return to bug 1213 | Differences ... > > I was clicking on "Raw Unified," and got the broken patch. "View" goes > to the URL you gave (which yields the correct patch). Confusing, isn't > it? Yes, it is indeed confusing. I've never used the exact path you used to get to the patch so I missed seeing the "bad" representation of it. One of the things that I've observed in generating the "ssh_known_hosts" file is that it can end up having a quite variable keyset as it depends on ALL of the hosts ALWAYS being up (don't we wish). It's probably overkill but we generate the "hosts" file once an hour via a set of wrapper scripts. Included within the scripts is a database that contains the current keys for all hosts that are currently supposed to be active (previously acquired via these same scripts). 
This allows us two capabilities: 1) if there is no key returned for some host, the database can supply the last one and 2) it allows us to see if there have been any changes in the keys that might signify a security break.

A second part is a condensation of the keys via globbing. This assumes that a number of the hosts have the same key. The cluster nodes on our private networks are basically all cloned so we do get considerable condensation. Right now, for 4700+ hosts, the "hosts" file has 334 entries.

The core script is a highly modified variant of the GNU licensed script, "make_ssh_known_hosts.pl", that was in "ssh-1.0.0" (circa 1998). Note that's "ssh" not "openssh". My original came from "ssh-1.2.26". For some reason, it disappeared when the OpenSSH folks took over. For Linux boxes, it's still dependent on my bind 9 hack of `nslookup' as I haven't had time to modify it to use the current GNU `host'.

Would you be interested in anything like this?

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/27

------------------------------------------------------------------------
On 2011-03-05T08:08:28+00:00 Daniel Richard G. wrote:

(In reply to comment #25)
>
> Yes, it is indeed confusing. I've never used the exact path you used
> to get to the patch so I missed seeing the "bad" representation of it.

Lord knows what the point of that link even is... I clicked on it only because "Raw" suggested that it would yield the "real" text/plain diff instead of a fancy HTML rendition.

> Would you be interested in anything like this?

I appreciate the offer, but a database would be overkill for my use case. I'm not in my company's IT department, and metamorphosing host keys on those 6000+ hosts are waaaay out of my purview. (I can't get too worked up over the security implications, either, since much worse than that is officially tolerated.)

If anything, the most I would do is put together a Perl script to merge an old and new known_hosts file, such that new entries override old ones, and old ones that don't have a newer replacement are kept.

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/28

------------------------------------------------------------------------
On 2011-03-05T19:04:20+00:00 Paul Wouters wrote:

(In reply to comment #26)
> (In reply to comment #25)
> If anything, the most I would do is put together a Perl script to merge
> an old and new known_hosts file, such that new entries override old
> ones, and old ones that don't have a newer replacement are kept.

You really want to look at SSHFP DNS records protected by DNSSEC, and setting VerifyHostKeyDNS ask in your /etc/ssh/ssh_config

you can use the "sshfp" tool for that, which is exactly why I was interested in this bug. sshfp can AXFR a zone, and use ssh-keyscan to connect to all A records in the zone and print the SSHFP record to add in your zones.

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/29

------------------------------------------------------------------------
On 2011-03-05T19:13:53+00:00 Daniel Richard G. wrote:

(In reply to comment #27)
>
> You really want to look at SSHFP DNS records protected by DNSSEC, and
> setting VerifyHostKeyDNS ask in your /etc/ssh/ssh_config

I would, if I were in my company's IT department :-)

(All I'm doing is generating an ssh_known_hosts file that is accessible to a handful of clients via a local fileserver. The network infrastructure beyond that is completely out of my hands.)

> you can use the "sshfp" tool for that, which is exactly why I was
> interested in this bug. sshfp can AXFR a zone, and use ssh-keyscan to
> connect to all A records in the zone and print the SSHFP record to add
> in your zones.

Hmm, that could be useful. While I couldn't do much with the SSHFP records, the AXFR->keyscan functionality would be useful.
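
For reference, the old/new known_hosts merge described earlier in this thread (new entries override old ones, and old ones without a newer replacement are kept) could look roughly like the sketch below. It is written in Python rather than Perl and rests on assumptions that are not from the thread: each entry is a single "hosts keytype key" line, an entry is identified by its (hosts, keytype) pair, and lines with markers or fewer than three fields are ignored. The names `parse_entries` and `merge_known_hosts` are hypothetical and are not part of OpenSSH.

```python
def parse_entries(text):
    """Map (hosts, keytype) -> full line for each usable known_hosts line.

    Assumption: plain "hosts keytype key [comment]" lines only; blank lines,
    comments and anything with fewer than three fields are skipped.
    """
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        entries[(fields[0], fields[1])] = line
    return entries


def merge_known_hosts(old_text, new_text):
    """Return merged file contents: new entries win, unmatched old ones stay."""
    merged = parse_entries(old_text)
    merged.update(parse_entries(new_text))  # newer scan overrides older keys
    return "\n".join(merged[key] for key in sorted(merged)) + "\n"


if __name__ == "__main__":
    old = "a.example ssh-rsa OLDKEY_A\nb.example ssh-rsa OLDKEY_B\n"
    new = "a.example ssh-rsa NEWKEY_A\nc.example ssh-rsa NEWKEY_C\n"
    print(merge_known_hosts(old, new), end="")
```

With the sample data, a.example picks up its new key, b.example (missing from the new scan, say because the host was down) keeps its old one, and c.example is added. Keying on the exact hosts field is a deliberate simplification: an entry whose host list changed between scans would be treated as a different entry.
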

(Right now, I'm doing the AXFR via host(1), and using a Perl script to reformat that into a hosts list for ssh-keyscan(1).)

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/30

------------------------------------------------------------------------
On 2011-03-08T07:03:53+00:00 Aab wrote:

Comment on attachment 1961
One attempt at getting the rsa key from a remote server that was having a number of problems.

This has been resolved with attachment 2008.

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/31

------------------------------------------------------------------------
On 2011-03-15T01:14:37+00:00 Aab wrote:

One of our `sshd' servers finally gave me sufficient problems to test the last of the patched code and, as far as I can tell, it worked.

Is there anybody out there that has any issues with the current patch? If not, I wonder if I can catch the attention of any of the OpenSSH folks. I note that this problem has yet to be assigned to anyone. Or is there another route that I should take for attention?

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/32

------------------------------------------------------------------------
On 2011-03-18T06:36:09+00:00 Aab wrote:

Created attachment 2016
Remove a bit of confusion from previous patch.

I guess I'm the one that has an issue with the previous patch. The hostkey alg error message always references the "other end" of the socket. On the server the message reads as if the client was the one that didn't have the necessary hostkey algorithms. The updated patch has modified verbiage for the server that attempts to distinguish the difference.

I have a general issue with this anyhow. Wouldn't it be possible to check the server algorithms BEFORE asking the server to return a key that it doesn't have?
If I read the code correctly, the debug2:kex_parse_init messages indicate that the code extracts the list of algorithms that the server supports from the SSH2_MSG_KEXINIT response. Isn't that before the request? Right now both the server and the client issue the same abort message and that seems a waste of time (and log file space (;-})).

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/33

------------------------------------------------------------------------
On 2011-03-19T05:38:46+00:00 Aab wrote:

Created attachment 2018
Add 'L' option to usage message

Another small issue. I forgot to add the new '-L' option to the usage message. Also modified some of the comments for clarity.

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/34

------------------------------------------------------------------------
On 2011-03-26T00:38:20+00:00 Aab wrote:

Created attachment 2021
Withdraw patch attachment #2018.

This missive just obsoletes (withdraws) the current variant of the patch. We just had a bad network glitch here and, because of it, ssh-keyscan called the `select()' function in the `packet_read_seqnr()' function with a NULL timeout value. Since the read was never going to receive any data because of the glitch, it occasionally did one of those hang forever thingys. The patch still works, albeit very crudely, if your network doesn't glitch like ours did.

It turns out that the original coders of ssh-keyscan missed (?) a call to the `packet_set_timeout()' function which in turn caused the above referenced NULL. I'm in the process of rewriting the patch to include a "set" call.

FWIW - bugzilla won't let me submit this without a non-null file. The new attachment is a NL.

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/35

------------------------------------------------------------------------
On 2011-06-14T04:43:42+00:00 Aab wrote:

Created attachment 2057
Fix for previous patch variant.
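
The hang described in the attachment 2021 comment comes down to how select() treats its timeout argument: with a NULL timeout it blocks until a descriptor becomes ready, while a finite timeout makes it return after that interval even if nothing arrives. The snippet below is only a minimal Python illustration of that behaviour, not the ssh-keyscan patch. `wait_readable` is a hypothetical helper, and a local socket pair stands in for a server that has gone silent.

```python
import select
import socket


def wait_readable(sock, timeout):
    """Return True if sock becomes readable within `timeout` seconds.

    Passing timeout=None makes select() block indefinitely, which is the
    Python counterpart of handing C select() a NULL timeout.
    """
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)


if __name__ == "__main__":
    ours, peer = socket.socketpair()
    try:
        # The peer sends nothing: a finite timeout lets the caller give up
        # and move on instead of waiting forever.
        print(wait_readable(ours, 0.2))  # False
        peer.sendall(b"SSH-2.0-demo\r\n")
        print(wait_readable(ours, 0.2))  # True
    finally:
        ours.close()
        peer.close()
```

Calling `wait_readable(ours, None)` before the peer has sent anything would never return, which is the behaviour reported above for a stalled connection.
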

For all those waiting breathlessly (ha) for a correction to the ssh-keyscan patch I submitted earlier, here it is. I apologize for not getting it here sooner.

This variant adds a call to the `packet_set_timeout()' function using the time value set or defaulted to on the command line by the '-T' option. The man page actually implies that this is the case but the code to implement it was never included. Part of the new code is a trap for the timeout condition and a resetting of the remaining active sockets' timeout values to compensate for the time used waiting for the slow/braindead server that caused the timeout.

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/36

------------------------------------------------------------------------
On 2011-06-14T04:52:46+00:00 Aab wrote:

Forgot to change the release to 5.8p2.

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/37

------------------------------------------------------------------------
On 2011-06-22T01:35:02+00:00 Aab wrote:

Change component from "miscellaneous" to the new "ssh-keyscan".

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/38

------------------------------------------------------------------------
On 2011-11-25T01:40:45+00:00 Daniel Richard G. wrote:

Yet another failure mode...

[...]
# XXX.YYY.ZZ.8 SSH-2.0-Sun_SSH_1.1.3
# XXX.YYY.ZZ.9 SSH-2.0-OpenSSH_3.8.1p1
# XXX.YYY.ZZ.14 SSH-2.0-OpenSSH_4.3
# 10.10.1.35 SSH-2.0-RomSShell_4.62
Received disconnect from 10.10.1.35: 2: Protocol Timeout
make: *** [ssh_known_hosts.unx.new] Error 255

This is with 5.8p1 still. aab@, I'll have to give your latest patch a try.

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/39

------------------------------------------------------------------------
On 2011-11-25T07:19:27+00:00 Aab wrote:

I haven't seen this one before. The text you included indicates that ssh-keyscan was processing a Protocol 2 key and it should be using the modified code to do it.
Is there any way that you could send me a traceback when the failure occurs?

FWIW - I think the " 2: Protocol Timeout" part of the message comes from the remote "SSH-2.0-RomSShell_4.62" server because I couldn't find that text in the OpenSSH source. What is "RomSShell"?

Reply at: https://bugs.launchpad.net/openssh/+bug/483928/comments/40

** Changed in: openssh
Status: Unknown => Confirmed

** Changed in: openssh
Importance: Unknown => High

--
https://bugs.launchpad.net/bugs/483928

Title:
ssh-keyscan(1) exits prematurely on some non-fatal errors