On 04/20/2017 05:04 AM, Tim Cutts wrote:
I've seen, in the past, problems with fragmented packets being misinterpreted,
resulting in stale NFS symptoms. In that case it was an Intel STL motherboard
(we're talking 20 years ago here), which shared a NIC for management as well as
the main interf
Thanks for the tip. I hadn't even thought of looking at SMART, although
any errors should show up in the logwatch e-mails, which I do check
every day, and haven't seen any on these systems. I also heard recently
that the smartmontools that come with most Linux distros are horribly
old, and the
On 04/19/2017 05:52 PM, Bernd Schubert wrote:
On 04/19/2017 07:58 PM, Prentice Bisbal wrote:
Here's the sequence of events:
1. First job(s) run fine on the node and complete without error.
2. Eventually a job fails with a 'permission denied' error when it tries
to access /l/hostname.
So you
On 04/19/2017 03:21 PM, Jörg Saßmannshausen wrote:
Hi Prentice,
three questions (not necessarily to you and it can be dealt with in a different
thread too):
- why automount and not a static mount?
Well, I've been told that, in general, automounting reduces the load(s)
on the servers, since th
+1 for looking at the MTUs. I just finished debugging what was manifesting as
transient NFS problems of various types but turned out to be MTU mismatches.
charlie
> On Apr 20, 2017, at 09:51, Gavin W. Burris wrote:
>
> Remembering that I once had two switches that were not allowing jumbo fram
Remembering that I once had two switches that were not allowing jumbo frames
over a crossover link. Similar if not the same symptoms.
Cheers.
On Thu 04/20/17 09:17AM EDT, Gavin W. Burris wrote:
> Hi, Prentice.
>
> Have you checked that the MTU matches on all NICs and is honored by the router?
>
> Chee
Hi, Prentice.
Have you checked that the MTU matches on all NICs and is honored by the router?
Cheers.
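For anyone wanting to script the MTU check suggested above, here is a minimal sketch in Python. It assumes a Linux node where each interface exposes its MTU under /sys/class/net; the function names and the "node:interface" keys in the usage note are hypothetical, made up for illustration. It only compares host-side settings, so it cannot tell you whether a switch or router in between actually honors jumbo frames.

```python
from collections import Counter
from pathlib import Path


def read_local_mtus(sysfs_root="/sys/class/net"):
    """Return {interface: mtu} for every interface found under sysfs (Linux only)."""
    mtus = {}
    for iface in sorted(Path(sysfs_root).iterdir()):
        mtu_file = iface / "mtu"
        # Skip entries such as bonding_masters that have no mtu attribute.
        if mtu_file.is_file():
            mtus[iface.name] = int(mtu_file.read_text().strip())
    return mtus


def find_mtu_mismatches(mtus):
    """Return the entries whose MTU differs from the most common value."""
    if not mtus:
        return {}
    expected, _count = Counter(mtus.values()).most_common(1)[0]
    return {name: mtu for name, mtu in mtus.items() if mtu != expected}


if __name__ == "__main__":
    local = read_local_mtus()
    print("local MTUs:", local)
    print("differs from majority:", find_mtu_mismatches(local))
```

Collecting the per-node dictionaries into one mapping (for example keyed as "node01:eth0") and passing it to find_mtu_mismatches() flags the odd ones out. To test the path itself, a don't-fragment ping between two nodes (on Linux iputils, something like `ping -M do -s 8972 otherhost` for a 9000-byte MTU) is the usual follow-up.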
On Wed 04/19/17 02:34PM EDT, Prentice Bisbal wrote:
>
> On 04/19/2017 02:17 PM, Ellis H. Wilson III wrote:
> >On 04/19/2017 02:11 PM, Prentice Bisbal wrote:
> >>Thanks for the suggestion(s). Just this m
The value for tcp_slot_table_entries seemed very low to me on our system.
However, reading up on it, the value is autotuned:
https://researcher.watson.ibm.com/researcher/view_person_subpage.php?id=4427
sunrpc.tcp_max_slot_table_entries = 65536
sunrpc.tcp_slot_table_entries = 2
Prentice, it wouldn't
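The two sunrpc values quoted above are in the usual `sysctl` "key = value" form. If it helps to compare them across nodes, a rough sketch for parsing that output into numbers follows. The function name is hypothetical, and how the text gets collected from each node (ssh, pdsh, whatever you use) is left out.

```python
def parse_sysctl_lines(text):
    """Parse 'key = value' lines, keeping only entries with integer values."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        value = value.strip()
        # Ignore blank lines and anything that is not a plain integer setting.
        if sep and value.isdigit():
            values[key.strip()] = int(value)
    return values


if __name__ == "__main__":
    quoted = (
        "sunrpc.tcp_max_slot_table_entries = 65536\n"
        "sunrpc.tcp_slot_table_entries = 2\n"
    )
    for key, value in parse_sysctl_lines(quoted).items():
        print(f"{key}: {value}")
```

Running the parser over each node's output and comparing the resulting dictionaries shows quickly whether the failing nodes are tuned differently from the healthy ones.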
Tim
That reminds me of the issue I found with shared IPMI interfaces - the reserved
IPMI port clashing with the sunrpc.min_resvport (or more exactly the range of
Sun RPC ports overlapping with IPMI)
That was a long time ago, and the min_resvport has been increased in modern
kernels as far as I
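A quick way to check for the overlap described above is to test whether the IPMI port falls inside the Sun RPC reserved-port range. The sketch below assumes the standard RMCP port 623 for IPMI and a Linux node with the sunrpc module loaded, so that min_resvport and max_resvport appear under /proc/sys/sunrpc. The function names are hypothetical, and the range values in the usage note are examples only, not values taken from this thread.

```python
from pathlib import Path

IPMI_RMCP_PORT = 623  # standard IPMI-over-LAN (RMCP) port


def port_in_reserved_range(port, min_resvport, max_resvport):
    """True if `port` lies inside the inclusive reserved-port range."""
    return min_resvport <= port <= max_resvport


def read_resvport_range(proc_root="/proc/sys/sunrpc"):
    """Read (min_resvport, max_resvport) from procfs; needs sunrpc loaded."""
    root = Path(proc_root)
    low = int((root / "min_resvport").read_text())
    high = int((root / "max_resvport").read_text())
    return low, high


if __name__ == "__main__":
    low, high = read_resvport_range()
    clash = port_in_reserved_range(IPMI_RMCP_PORT, low, high)
    print(f"reserved range {low}-{high}, IPMI port clash possible: {clash}")
```

For example, port_in_reserved_range(623, 600, 1023) is true, while a range starting above 623 cannot collide with a shared IPMI interface.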
On Wed, Apr 19, 2017 at 8:34 PM, Prentice Bisbal wrote:
> My setup isn't nearly that complicated. Every node in this cluster has a
> /local directory that is shared out to the other nodes in the cluster. The
> other nodes automount this remote directory as /l/hostname, where
> "hostname" is the
I've seen, in the past, problems with fragmented packets being misinterpreted,
resulting in stale NFS symptoms. In that case it was an Intel STL motherboard
(we're talking 20 years ago here), which shared a NIC for management as well as
the main interface. The fragmented packets got inappropria
11 matches