On 04/10/2015 06:32 AM, Bob Copeland wrote:
<snip>
In the meantime, I'm hoping someone here can make heads or tails of
the hexdumps, and perhaps gain some insight. I confess I don't
really know what I'm looking at here, as I'm not clear how to
dissect these frames (if they even are frames.) Thanks in advance
for any insight you can provide. The rest of the dumps are included
below.
They all look like unicast mesh data frames, not mesh management
frames.  You can use something like:

     text2pcap -l 105 errors.txt errors.pcapng

...to make a pcap file to look at in Wireshark, which I did and
attached to this email.
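
As a rough illustration of how data frames can be told apart from management frames by hand, here is a minimal sketch in Python. It assumes each dump starts directly at the 802.11 header with no radiotap header in front (which is what link type 105 implies). The function name and the sample bytes are hypothetical, not taken from the attached dumps.

```python
def classify_80211(frame: bytes):
    """Return (type name, subtype) decoded from the 802.11 Frame Control field."""
    if len(frame) < 2:
        raise ValueError("need at least the 2-byte Frame Control field")
    fc0 = frame[0]                 # first Frame Control byte
    ftype = (fc0 >> 2) & 0x3       # bits 2-3: frame type
    subtype = (fc0 >> 4) & 0xF     # bits 4-7: frame subtype
    names = {0: "management", 1: "control", 2: "data", 3: "extension"}
    return names[ftype], subtype


# Hypothetical leading bytes: 0x88 is a QoS data frame, 0xd0 is an action frame.
print(classify_80211(bytes.fromhex("8802")))
print(classify_80211(bytes.fromhex("d000")))
```

Mesh path selection frames such as path errors travel in action (management) frames, so a dump whose first byte decodes to type "data" is not one of those.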

Thanks, that's a good trick to know.


I was expecting something like path error frames that mesh is
generating; data frames make it a bit murkier.  If something in one of
the main data paths were broken I'd expect you to see it everywhere.

Are you doing multihop with the smaller network so that we know mesh
forwarding is generally working?

I have manually set mpaths so that a hop is forced, and forwarding seems to work fine if explicit routes are set. If I don't explicitly set routes on every node though, I get odd segments of packet loss. Even with only 5 nodes in a small space, so they can all get a good signal to each other, I still end up with odd packet loss problems.

For example, I manually set the mpath on A to A->B->C, and on C to C->B->A. If I did NOT also set "A->A" and "C->C" on node B, then a 5-minute ping from A to C ended up with around 25% packet loss. The loss comes in chunks, too: 40+ packets in a row return in under 5ms, then 10-20 packets drop entirely. Then it fixes itself and does another large chunk at low latency. This is in an unencrypted mesh. Once I manually set A->A and C->C on node B, the packet loss went away.

In our larger mesh (33 nodes), where it's much more difficult to set reliable/sane mpaths manually, I'm seeing the same odd packet loss, even unencrypted. I haven't managed to get a reproducible test case yet, but it "feels" like the mpaths in the denser part of the mesh are changing often enough that they sometimes end up in loops, and that's the cause of the packet loss. That's just a vaguely intuitive guess at this point though. Once I can get a reproducible test case, I'll let everyone know.
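
A minimal sketch of how the loop hypothesis could be checked offline, assuming the next hops from each node have been collected by hand (for example from the output of `iw dev <devname> mpath dump` on every node) into a dictionary. The table layout, the node names, and the function name are all hypothetical.

```python
def trace_path(next_hop, src, dst):
    """Follow per-node next hops from src toward dst.

    next_hop maps node -> {destination: next hop}.
    Returns (hops, status), where status is "delivered", "loop" or "no route".
    """
    hops = [src]
    seen = {src}
    node = src
    while node != dst:
        nxt = next_hop.get(node, {}).get(dst)
        if nxt is None:
            return hops, "no route"   # this node has no mpath for dst
        hops.append(nxt)
        if nxt in seen:
            return hops, "loop"       # revisited a node: forwarding loop
        seen.add(nxt)
        node = nxt
    return hops, "delivered"


# Hypothetical snapshot matching the forced A->B->C path.
good = {"A": {"C": "B"}, "B": {"C": "C"}}
# Hypothetical snapshot where B points back at A for destination C.
bad = {"A": {"C": "B"}, "B": {"C": "A"}}

print(trace_path(good, "A", "C"))
print(trace_path(bad, "A", "C"))
```

Running this against snapshots taken a few seconds apart during a loss burst would show whether a loop is actually present at that moment, rather than relying on intuition.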



You could try this workaround:

diff --git a/drivers/net/wireless/rt2x00/rt2x00dev.c b/drivers/net/wireless/rt2x00/rt2x00dev.c
index 5639ed8..260085e 100644
--- a/drivers/net/wireless/rt2x00/rt2x00dev.c
+++ b/drivers/net/wireless/rt2x00/rt2x00dev.c
@@ -1271,9 +1271,9 @@ static unsigned int rt2x00dev_extra_tx_headroom(struct rt2x00_dev *rt2x00dev)
                return 0;
 
        if (rt2x00_is_usb(rt2x00dev))
-               return rt2x00dev->tx[0].winfo_size + rt2x00dev->tx[0].desc_size;
+               return rt2x00dev->tx[0].winfo_size + rt2x00dev->tx[0].desc_size + 4;
 
-       return rt2x00dev->tx[0].winfo_size;
+       return rt2x00dev->tx[0].winfo_size + 4;
 }
 
 /*

Thank you for the suggestion. Unfortunately, it doesn't appear to work. I applied that patch, and while it compiled fine and gives no errors in dmesg, it appears to break the mesh. Unencrypted, nodes get stuck trying to open peer links, time out, and fail, and all nodes/stations end up in "LISTEN" mode only.


Thanks again for all the help and suggestions!

--James Otting