On Fri, 18 Sep 2020, Barry Smith wrote:

>
> > On Sep 18, 2020, at 10:14 AM, Satish Balay <ba...@mcs.anl.gov> wrote:
> >
> > It's probably better to just run a test with gethostbyname()?
>
> I had hoped to avoid building C code and running it. The Apple manual page
> for gethostbyname() states: The getaddrinfo(3) and getnameinfo(3) functions
> are preferred over the gethostbyname(), gethostbyname2(), and
> gethostbyaddr() functions.
>
> I do not know what MPICH and OpenMPI use.
>
> On the Mac
>
> >
> > The closest thing I can think of is:
> >
> > I don't know if 'traceroute' or 'host' commands are universally available.
> >
> > >>>>>>
> > balay@sb /home/balay
> > $ host `hostname`
> > sb has address 192.168.0.144
>
> $ host `hostname`
> Host Barrys-MacBook-Pro-3.local not found: 3(NXDOMAIN)
>
> Also on the Apple `hostname` is associated with multiple addresses and it
> seems different utilities may use different addresses produced. Some
> addresses may work, others may not.

If it's bound to multiple addresses, nslookup should list all of them.

If host doesn't work, how is traceroute able to resolve it?

What do you get for:

nslookup `hostname`
traceroute `hostname`
dig `hostname`

Satish

>
> I will make one more MR adding traceroute first and if any of the tests
> succeed continue. If that fails for users then we will likely need to drop
> the test.
>
> I don't like just using a mpiexec -n 2 test because that can fail for so
> many reasons it is difficult to provide diagnostics to the users.
>
> Barry
>
> >
> > balay@sb /home/balay
> > $ echo $?
> > 0
> > balay@sb /home/balay
> > $ host foobar
> > Host foobar not found: 3(NXDOMAIN)
> > balay@sb /home/balay
> > $ echo $?
> > 1
> > balay@sb /home/balay
> > $
> > <<<<<<
> >
> > However - I fear if there are *any* false positives - or false negatives -
> > this test will generate more e-mail than the actual issue [of misbehaving
> > MPI]
> >
> > Satish
> >
> > On Fri, 18 Sep 2020, Barry Smith wrote:
> >
> >>
> >> try
> >>
> >> /usr/sbin/traceroute `hostname`
> >>
> >>
> >>> On Sep 18, 2020, at 10:07 AM, Mark Adams <mfad...@lbl.gov> wrote:
> >>>
> >>> Let me know if you want anything else.
> >>> Thanks,
> >>> Mark
> >>>
> >>> On Fri, Sep 18, 2020 at 11:05 AM Mark Adams <mfad...@lbl.gov> wrote:
> >>>
> >>>
> >>> On Fri, Sep 18, 2020 at 11:04 AM Satish Balay <ba...@mcs.anl.gov> wrote:
> >>> On Fri, 18 Sep 2020, Satish Balay via petsc-users wrote:
> >>>
> >>>>>>> 07:41 master *= ~/Codes/petsc$ ping -c 2 MarksMac-302.local
> >>>>>>> PING marksmac-302.local (127.0.0.1): 56 data bytes
> >>>>
> >>>> So it is resolving MarksMac-302.local as 127.0.0.1 - but ping is not
> >>>> responding?
> >>>>
> >>>> I know some machines don't respond to external ping [and firewalls can
> >>>> block it] but don't really know if they always respond to internal ping
> >>>> or not.
> >>>>
> >>>> If some machines don't respond to internal ping - then we can't use
> >>>> ping test in configure [it will create false negatives - as in this case]
> >>>
> >>> BTW: To confirm, please try:
> >>>
> >>> ping 127.0.0.1
> >>>
> >>>
> >>> 11:02 master *= ~/Codes/petsc$ sudo vi /etc/hosts
> >>> 11:02 master *= ~/Codes/petsc$ ping 127.0.0.1
> >>> PING 127.0.0.1 (127.0.0.1): 56 data bytes
> >>> Request timeout for icmp_seq 0
> >>> Request timeout for icmp_seq 1
> >>> Request timeout for icmp_seq 2
> >>> Request timeout for icmp_seq 3
> >>> Request timeout for icmp_seq 4
> >>> Request timeout for icmp_seq 5
> >>> Request timeout for icmp_seq 6
> >>> Request timeout for icmp_seq 7
> >>> Request timeout for icmp_seq 8
> >>> Request timeout for icmp_seq 9
> >>> Request timeout for icmp_seq 10
> >>> Request timeout for icmp_seq 11
> >>> Request timeout for icmp_seq 12
> >>> Request timeout for icmp_seq 13
> >>> Request timeout for icmp_seq 14
> >>> Request timeout for icmp_seq 15
> >>> Request timeout for icmp_seq 16
> >>> Request timeout for icmp_seq 17
> >>> Request timeout for icmp_seq 18
> >>> Request timeout for icmp_seq 19
> >>> Request timeout for icmp_seq 20
> >>> Request timeout for icmp_seq 21
> >>>
> >>> still going ......
> >>>
> >>>
> >>> Satish
> >>>
> >>>>
> >>>>
> >>>> Mark, can you remove the line that you added to /etc/hosts - i.e:
> >>>>
> >>>> 127.0.0.1 MarksMac-302.local
> >>>>
> >>>> And now rerun MPI tests. Do they work or fail?
> >>>>
> >>>> [this is to check if this test is a false positive on your machine]
> >>>>
> >>>> Satish
> >>>>
> >>>>
> >>>> On Fri, 18 Sep 2020, Mark Adams wrote:
> >>>>
> >>>>> On Fri, Sep 18, 2020 at 7:51 AM Matthew Knepley <knep...@gmail.com> wrote:
> >>>>>
> >>>>>> On Fri, Sep 18, 2020 at 7:46 AM Mark Adams <mfad...@lbl.gov> wrote:
> >>>>>>
> >>>>>>> Oh you did not change my hostname:
> >>>>>>>
> >>>>>>> 07:37 master *= ~/Codes/petsc$ hostname
> >>>>>>> MarksMac-302.local
> >>>>>>> 07:41 master *= ~/Codes/petsc$ ping -c 2 MarksMac-302.local
> >>>>>>> PING marksmac-302.local (127.0.0.1): 56 data bytes
> >>>>>>> Request timeout for icmp_seq 0
> >>>>>>>
> >>>>>>> --- marksmac-302.local ping statistics ---
> >>>>>>> 2 packets transmitted, 0 packets received, 100.0% packet loss
> >>>>>>> 07:42 2 master *= ~/Codes/petsc$
> >>>>>>>
> >>>>>>
> >>>>>> This does not make sense to me. You have
> >>>>>>
> >>>>>> 127.0.0.1 MarksMac-302.local
> >>>>>>
> >>>>>> in /etc/hosts,
> >>>>>>
> >>>>>
> >>>>> 09:07 ~/.ssh$ cat /etc/hosts
> >>>>> ##
> >>>>> # Host Database
> >>>>> #
> >>>>> # localhost is used to configure the loopback interface
> >>>>> # when the system is booting. Do not change this entry.
> >>>>> ##
> >>>>> 127.0.0.1 localhost
> >>>>> 255.255.255.255 broadcasthost
> >>>>> 127.0.0.1 MarksMac-5.local
> >>>>> 127.0.0.1 243.124.240.10.in-addr.arpa.private.cam.ac.uk
> >>>>> 127.0.0.1 MarksMac-302.local
> >>>>> 09:07 ~/.ssh$
> >>>>>
> >>>>>> but you cannot resolve that name?
> >>>>>>
> >>>>>> Matt
> >>>>>>
> >>>>>>
> >>>>>>> BTW, I used to get messages about some network issue and 'changing host
> >>>>>>> name to MarksMac-[x+1].local'. That is, the original hostname
> >>>>>>> was MarksMac.local, then I got a message about changing
> >>>>>>> to MarksMac-1.local, etc.
> >>>>>>> I have not seen these messages for months but
> >>>>>>> apparently this process has continued unabated.
> >>>>>>>
> >>>>>>> On Thu, Sep 17, 2020 at 11:10 PM Satish Balay via petsc-users <petsc-users@mcs.anl.gov> wrote:
> >>>>>>>
> >>>>>>>> On Thu, 17 Sep 2020, Matthew Knepley wrote:
> >>>>>>>>
> >>>>>>>>> On Thu, Sep 17, 2020 at 8:33 PM Barry Smith <bsm...@petsc.dev> wrote:
> >>>>>>>>>
> >>>>>>>>>>> On Sep 17, 2020, at 4:59 PM, Satish Balay via petsc-users <petsc-users@mcs.anl.gov> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Here is a fix:
> >>>>>>>>>>>
> >>>>>>>>>>> echo 127.0.0.1 `hostname` | sudo tee -a /etc/hosts
> >>>>>>>>>>
> >>>>>>>>>> Satish,
> >>>>>>>>>>
> >>>>>>>>>> I don't think you want to be doing this on a Mac (on anything?) On a
> >>>>>>>>>> Mac based on the network configuration etc as it boots up and as networks
> >>>>>>>>>> are accessible or not (wi-fi) it determines what hostname should be, one
> >>>>>>>>>> should never be hardwiring it to some value.
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Satish is just naming the loopback interface. I did this on all my former
> >>>>>>>>> Macs.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Yes - this doesn't change the hostname. It's just adding an entry for
> >>>>>>>> gethostbyname - for current hostname.
> >>>>>>>>
> >>>>>>>> >>>
> >>>>>>>> 127.0.0.1 MarksMac-302.local
> >>>>>>>> <<<
> >>>>>>>>
> >>>>>>>> Sure - it's best to not do this when one has a proper IP name [like
> >>>>>>>> foo.mcs.anl.gov] - but it's useful when one has a hostname like
> >>>>>>>> "MarksMac-302.local" - that is not DNS resolvable
> >>>>>>>>
> >>>>>>>> Even if the machine is moved to a different network with a different
> >>>>>>>> name - the current entry won't cause problems [but will need another entry
> >>>>>>>> for the new host name - if this new name is also not DNS resolvable]
> >>>>>>>>
> >>>>>>>> It's likely this file is a generated file on macos - so might get reset
> >>>>>>>> on reboot - or some network change? [if this is the case - the change won't
> >>>>>>>> be permanent]
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Satish
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> What most experimenters take for granted before they begin their
> >>>>>> experiments is infinitely more interesting than any results to which their
> >>>>>> experiments lead.
> >>>>>> -- Norbert Wiener
> >>>>>>
> >>>>>> https://www.cse.buffalo.edu/~knepley/
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >
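
For reference, the resolver check discussed at the top of this thread (a gethostbyname()-style test that avoids compiling C) could be sketched in Python roughly as below. This is only an illustration, not what PETSc configure actually does: the helper name resolve_all is made up for the example, and it uses getaddrinfo, which the Apple man page quoted above prefers over gethostbyname(). Unlike host, nslookup and dig, which query DNS servers directly, getaddrinfo goes through the system resolver, so it should also see /etc/hosts entries, and it returns every address the name is bound to.

```python
import socket


def resolve_all(name):
    """Return the sorted, de-duplicated addresses that `name` resolves to.

    An empty list means the system resolver found nothing for the name.
    (`resolve_all` is a hypothetical helper for this sketch.)
    """
    try:
        # Restricting to SOCK_STREAM avoids one duplicate entry per socket type.
        infos = socket.getaddrinfo(name, None, type=socket.SOCK_STREAM)
    except socket.gaierror:
        # The resolver could not map the name to any address.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the textual address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    name = socket.gethostname()
    addresses = resolve_all(name)
    if addresses:
        print(f"{name} resolves to: {', '.join(addresses)}")
    else:
        print(f"{name} does not resolve; MPI startup may hang or fail")
```

A name that resolves here can still map to an address that does not respond (as with the ping timeouts above), so an empty result is a reasonably strong signal of trouble, while a non-empty one does not guarantee that MPI will work.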