I am having a very strange problem that I am finding very difficult to diagnose. I recently migrated my PPP server (5 dial-in lines) from an old Pentium 100 to a not-quite-so-old dual PII.
I certainly won't say the two setups are "identical", but they are similar; I pretty much just copied /etc/ppp/options from one machine to the other. I am using mgetty to answer the phone, and then a_ppp to start ppp, etc. This all works fine: the lines are answered, users are authenticated, and ppp is started.

The problem occurs after ppp is running. Everything seems to work just fine for the first 10-1000k or so, but then data stops flowing. Ping times are about 170ms with no packet loss, and then they go to 170ms with 50% packet loss. If this were a phone line problem, I would expect the ping times to go up as the modems retransmit data (error correction), but to have no packet loss.

A tcpdump on the interface shows, for example (10.0.0.110 is connected by ppp, 10.0.0.97 is a machine that is on the same subnet as the ppp server):

  11:54:19.654643 10.0.0.110 > 10.0.0.97: icmp: echo request
  11:54:20.314760 10.0.0.97 > 10.0.0.110: icmp: echo reply
  11:54:21.004675 10.0.0.110 > 10.0.0.97: icmp: echo request
  11:54:21.004852 10.0.0.97 > 10.0.0.110: icmp: echo reply
  11:54:22.004701 10.0.0.110 > 10.0.0.97: icmp: echo request
  11:54:22.004923 10.0.0.97 > 10.0.0.110: icmp: echo reply
  11:54:23.004728 10.0.0.110 > 10.0.0.97: icmp: echo request
  11:54:23.004949 10.0.0.97 > 10.0.0.110: icmp: echo reply

This looks fine, but only two of the four echo replies made it across the ppp connection. A ping going the other direction looks bad too:

  11:56:22.990528 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)
  11:56:23.187445 10.0.0.110 > 10.0.0.97: icmp: echo reply (DF)
  11:56:23.986612 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)
  11:56:24.167481 10.0.0.110 > 10.0.0.97: icmp: echo reply (DF)
  11:56:24.986000 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)
  11:56:25.985988 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)

As can be seen, that is 50% packet loss. I get the same pattern if I ping between the ppp client and server, instead of a machine on the ppp server's subnet.

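For anyone who wants to check the loss figure against a longer capture, here is a rough Python sketch that pairs echo requests with replies in tcpdump output of the format shown above. The names summarize, parse and CAPTURE are just ones I made up for illustration, and it assumes one ping at a time, with each reply following its request directly; it is not meant to handle overlapping pings.

```python
import re

# Matches the tcpdump line format seen in the captures:
# "HH:MM:SS.ffffff SRC > DST: icmp: echo request|reply"
LINE_RE = re.compile(
    r"^(\d+):(\d+):(\d+\.\d+)\s+(\S+) > (\S+): icmp: echo (request|reply)"
)

def parse(line):
    """Return (seconds_since_midnight, src, dst, kind) or None."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None
    h, mins, secs, src, dst, kind = m.groups()
    return int(h) * 3600 + int(mins) * 60 + float(secs), src, dst, kind

def summarize(lines):
    """Count requests and matching replies; return (sent, answered, loss%, rtts)."""
    pending = None  # (timestamp, src, dst) of the last unanswered request
    sent = answered = 0
    rtts = []
    for line in lines:
        rec = parse(line)
        if rec is None:
            continue
        ts, src, dst, kind = rec
        if kind == "request":
            sent += 1
            pending = (ts, src, dst)
        elif pending and (src, dst) == (pending[2], pending[1]):
            # Reply travelling in the opposite direction of the pending request.
            answered += 1
            rtts.append(ts - pending[0])
            pending = None
    loss = 100.0 * (sent - answered) / sent if sent else 0.0
    return sent, answered, loss, rtts

# The second capture from the message, used as sample input.
CAPTURE = """\
11:56:22.990528 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)
11:56:23.187445 10.0.0.110 > 10.0.0.97: icmp: echo reply (DF)
11:56:23.986612 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)
11:56:24.167481 10.0.0.110 > 10.0.0.97: icmp: echo reply (DF)
11:56:24.986000 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)
11:56:25.985988 10.0.0.97 > 10.0.0.110: icmp: echo request (DF)
"""

if __name__ == "__main__":
    sent, answered, loss, rtts = summarize(CAPTURE.splitlines())
    print(f"{sent} requests, {answered} replies, {loss:.0f}% loss")
```

Run against the second capture, it counts four requests and two replies. Note that it only sees what tcpdump saw on that interface, so replies that were captured but later lost on the ppp link (as in the first capture) still count as answered.
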
It is as if the ppp server gets bored with sending packets out the ppp interface after a time. It doesn't seem to be a routing issue (at least initially), because everything works perfectly for the first few hundred kb. I can't figure out what might be changing after the connection is up that could cause this problem. The connections don't time out from lcp-echo-failures, so it seems that the ppp layer is intact. It really seems like something in the kernel decides to stop moving packets.

I don't think it is the ethernet driver on the ppp server (tulip), as packets that originate from (or are destined for) the ppp server show the same behavior when crossing the ppp link. I have disabled all firewalling rules and iptables, just to make sure that isn't screwing me up. This occurs with kernel 2.4.13-ac6 and 2.4.14, as well as ppp 2.4.0f and 2.4.1.

Here are the options in effect:

  Nov 6 11:46:35 linux pppd[5157]: pppd options in effect:
  Nov 6 11:46:35 linux pppd[5157]: debug^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: kdebug 1^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: ktune^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: dump^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: nomultilink^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: +pap^I^I# (from command line)
  Nov 6 11:46:35 linux pppd[5157]: -chap^I^I# (from command line)
  Nov 6 11:46:35 linux pppd[5157]: login^I^I# (from command line)
  Nov 6 11:46:35 linux pppd[5157]: lock^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: crtscts^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: modem^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: asyncmap 0^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: lcp-echo-failure 4^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: lcp-echo-interval 30^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: hide-password^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: ipcp-accept-remote^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: ms-dns xxx # [don't know how to print value]^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: ms-wins xxx # [don't know how to print value]^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: proxyarp^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: netmask 255.255.255.0^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: 10.0.0.69:10.0.0.110^I^I# (from /etc/ppp/options.ttyR0)
  Nov 6 11:46:35 linux pppd[5157]: bsdcomp 15^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: deflate 15^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: noipx^I^I# (from /etc/ppp/options)
  Nov 6 11:46:35 linux pppd[5157]: pppd 2.4.1 started by a_ppp, uid 0

Any advice would be greatly appreciated. I apologize if you see this message multiple times, but I am sending it to several different lists, as I am totally at a loss on how to proceed. Hopefully the solution is some brain fart on my part, such as

  echo 1 > /proc/sys/net/ipv4/ppp_should_work

--
Jeff Lessem.