Public bug reported:

Hi,
this is hard to reproduce, but I've seen it twice in a week now, so it
is not a stray cosmic ray after all.

I've seen nmap hang even though it was already running with a timeout,
which made me wonder.

$ nmap -Pn 192.168.122.0/24 --host-timeout 10

In ps the output looks normal; the process is usually in a wchan with a timeout:

$ ps axlf
...
4     0   10155       0  20   0  25372 17320 poll_s S    ?          0:07 nmap -Pn 192.168.122.0/24 --host-timeout 10

Full wchan:
cat /proc/10155/wchan 
poll_schedule_timeout.constprop.0

I found that nmap isn't dead; it seems to be looping.
Unsurprisingly, attaching GDB always finds it waiting in the select timeout:

0x00007fe5b73a725a in __GI___select (nfds=nfds@entry=5, readfds=readfds@entry=0x7ffec6db1540, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7ffec6db1670) at ../sysdeps/unix/sysv/linux/select.c:41
41      ../sysdeps/unix/sysv/linux/select.c: No such file or directory.
(gdb) bt
#0  0x00007fe5b73a725a in __GI___select (nfds=nfds@entry=5, readfds=readfds@entry=0x7ffec6db1540, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7ffec6db1670) at ../sysdeps/unix/sysv/linux/select.c:41
#1  0x000055900ff4f838 in pcap_select (p=p@entry=0x55901131ec10, timeout=timeout@entry=0x7ffec6db1670) at netutil.cc:1012
#2  0x000055900ff4fa15 in pcap_select (usecs=999993, p=0x55901131ec10) at netutil.cc:1031
#3  read_reply_pcap (pd=0x55901131ec10, to_usec=999993, accept_callback=accept_callback@entry=0x55900ff4e530 <accept_arp(unsigned char const*, pcap_pkthdr const*, int, size_t)>, p=p@entry=0x7ffec6db16f0, head=head@entry=0x7ffec6db1700, rcvdtime=rcvdtime@entry=0x7ffec6db1770, datalink=0x7ffec6db16ec, offset=0x7ffec6db16f8) at netutil.cc:4234
#4  0x000055900ff56014 in read_arp_reply_pcap (pd=<optimized out>, sendermac=sendermac@entry=0x7ffec6db1792 "", senderIP=senderIP@entry=0x7ffec6db176c, to_usec=<optimized out>, rcvdtime=rcvdtime@entry=0x7ffec6db1770, trace_callback=0x55900ff0ef70 <PacketTrace::traceArp(int, unsigned char const*, unsigned int, timeval*)>) at netutil.cc:4324
#5  0x000055900fefce80 in get_arp_result (USI=0x7ffec6db1890, stime=0x7ffec6db1860) at scan_engine_raw.cc:1557
#6  0x000055900fef5e67 in ultra_scan(std::vector<Target*, std::allocator<Target*> >&, scan_lists*, stype, timeout_info*) () at scan_engine.cc:2546
#7  0x000055900ff0e1cc in arpping (hostbatch=<optimized out>, num_hosts=255) at targets.cc:187
#8  0x000055900ff0ed00 in refresh_hostbatch (hs=<optimized out>, exclude_group=0x559010b87780, ports=0x559010142980 <ports>, pingtype=1) at targets.cc:591
#9  0x000055900ff0ee5d in nexthost (hs=hs@entry=0x7ffec6db2010, exclude_group=exclude_group@entry=0x559010b87780, ports=ports@entry=0x559010142980 <ports>, pingtype=<optimized out>) at targets.cc:644
#10 0x000055900fec8ec3 in nmap_main (argc=<optimized out>, argv=0x7ffec6db3058) at nmap.cc:2063
#11 0x000055900fe9e215 in main (argc=5, argv=0x7ffec6db3058) at main.cc:237


But strace shows an ioctl failing with EBADF after every select timeout:
# strace -rT -p 10155
strace: Process 10155 attached
     0.000000 select(5, [4], NULL, NULL, {tv_sec=0, tv_usec=438201}) = 0 (Timeout) <0.438773>
     0.439032 ioctl(-1, TIOCGPGRP, 0x7ffec6db1784) = -1 EBADF (Bad file descriptor) <0.000028>
     0.000170 getpgrp()                 = 0 <0.000021>
     0.000092 select(5, [4], NULL, NULL, {tv_sec=0, tv_usec=999992}) = 0 (Timeout) <1.001123>
     1.001253 ioctl(-1, TIOCGPGRP, 0x7ffec6db1784) = -1 EBADF (Bad file descriptor) <0.000019>
     0.000075 getpgrp()                 = 0 <0.000018>
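
In the trace above, the ioctl(-1, TIOCGPGRP, ...) lines are syscalls of their own, issued after select has already returned. A minimal sketch, assuming Linux with glibc, for confirming what that ioctl reports when handed fd -1; probe_tiocgpgrp is a hypothetical helper and not part of nmap or glibc:

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <sys/types.h>

/* Hypothetical helper (not nmap code): issue TIOCGPGRP on fd.
 * Returns 0 on success, or the errno value if the ioctl fails. */
int probe_tiocgpgrp(int fd)
{
    pid_t pgrp = 0;

    errno = 0;
    if (ioctl(fd, TIOCGPGRP, &pgrp) == -1)
        return errno;   /* e.g. EBADF for an invalid descriptor */
    return 0;
}
```

Calling probe_tiocgpgrp(-1) should yield EBADF, matching the strace lines, and it does so without involving select at all.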


I expected nmap to give up if the select call failed, and indeed the code
would:
1008      FD_SET(fd, &rfds);
1009
1010      do {
1011        errno = 0;
1012        ret = select(fd + 1, &rfds, NULL, NULL, timeout);
1013        if (ret == -1) {
1014          if (errno == EINTR)
1015            netutil_error("%s: %s", __func__, strerror(errno));
1016          else
1017          netutil_fatal("Your system does not support select()ing on pcap devices (%s). PLEASE REPORT THIS ALONG WITH DETAILED SYSTEM INFORMATION TO THE nmap-dev MAILING LIST!", strerror(errno));

But glibc returns a good RC on this ioctl error:
1012        ret = select(fd + 1, &rfds, NULL, NULL, timeout);
(gdb) n
1013        if (ret == -1) {
(gdb) p ret
$1 = 1
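
For comparison, here is a minimal sketch of the three outcomes select() can report for a single descriptor: a positive count when it is readable, 0 on timeout, and -1 with errno set on error. wait_readable is a hypothetical helper written for this report, not the nmap function quoted above, and it assumes 0 <= fd < FD_SETSIZE:

```c
#include <errno.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper (not nmap code): wait up to usec microseconds
 * for fd to become readable. Returns select()'s own result:
 *   > 0  fd is readable
 *   0    timeout expired
 *   -1   error, errno set (EINTR is retried with a fresh timeout)
 * The caller must pass 0 <= fd < FD_SETSIZE. */
int wait_readable(int fd, long usec)
{
    int ret;

    do {
        fd_set rfds;
        struct timeval tv;

        /* Rebuild the set and timeout each round; select may modify both. */
        FD_ZERO(&rfds);
        FD_SET(fd, &rfds);
        tv.tv_sec = usec / 1000000;
        tv.tv_usec = usec % 1000000;

        errno = 0;
        ret = select(fd + 1, &rfds, NULL, NULL, &tv);
    } while (ret == -1 && errno == EINTR);

    return ret;
}
```

On an empty pipe this should return 0 after the timeout, 1 once a byte has been written to the other end, and -1 with errno set to EBADF if the descriptor has been closed. That makes it easy to check whether a given return value comes from select itself or from a later, unrelated call.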


There isn't much more glibc code visible in GDB:
Breakpoint 1, pcap_select (p=p@entry=0x55901131ec10, timeout=timeout@entry=0x7ffec6db1670) at netutil.cc:976
976     in netutil.cc
(gdb) c
Continuing.

Breakpoint 2, __GI___select (nfds=nfds@entry=5, readfds=readfds@entry=0x7ffec6db1540, writefds=writefds@entry=0x0, exceptfds=exceptfds@entry=0x0, timeout=timeout@entry=0x7ffec6db1670) at ../sysdeps/unix/sysv/linux/select.c:39
39      {
(gdb) n
41        return SYSCALL_CANCEL (select, nfds, readfds, writefds, exceptfds,
(gdb) s


Is this a bug in glibc? Should select return -1 in that case?

Note: it is also a bug in nmap to loop on it, but they could say "but the
answer from the socket call is wrong", so let's sort that out first.
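
As an illustration of the nmap side of this, a loop that keeps re-issuing short waits can still respect an overall limit if each wait is clamped to the time left before a deadline. This is a sketch under assumptions, not how nmap implements --host-timeout; now_usec and next_wait_usec are hypothetical names:

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Current monotonic time in microseconds. */
long long now_usec(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000000LL + ts.tv_nsec / 1000;
}

/* Hypothetical helper: how long the next wait may last.
 * Returns 0 once the deadline has passed, otherwise the smaller of
 * the remaining budget and the per-iteration slice. */
long long next_wait_usec(long long deadline_usec, long long now,
                         long long slice_usec)
{
    long long left = deadline_usec - now;

    if (left <= 0)
        return 0;               /* budget exhausted: caller should stop */
    return left < slice_usec ? left : slice_usec;
}
```

A caller would compute the deadline once as now_usec() plus the overall limit, pass next_wait_usec(deadline, now_usec(), slice) as the timeout of each wait, and leave the loop as soon as it returns 0.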

** Affects: glibc (Ubuntu)
     Importance: Undecided
         Status: New

** Affects: nmap (Ubuntu)
     Importance: Undecided
         Status: Incomplete

** Also affects: glibc (Ubuntu)
   Importance: Undecided
       Status: New

** Changed in: nmap (Ubuntu)
       Status: New => Incomplete

https://bugs.launchpad.net/bugs/1861389

Title:
  nmap hang due to BADF ioctl inside select call is returning good rc
