On Friday 14 December 2007 14:51, I wrote:
> On Friday 14 December 2007 07:39, Peter Zijlstra wrote:
> Note that false sharing of slab pages is still possible between two
> unrelated writeout processes, both of which obey rules for their own
> writeout path, but the pinned combination does not. Th
On Friday 14 December 2007 07:39, Peter Zijlstra wrote:
> Restrict objects from reserve slabs (ALLOC_NO_WATERMARKS) to
> allocation contexts that are entitled to it. This is done to ensure
> reserve pages don't leak out and get consumed.
Tighter definitions of "leak out" and "get consumed" would b
On Friday 14 December 2007 07:39, Peter Zijlstra wrote:
> Provide a method to get the upper bound on the pages needed to
> allocate a given number of objects from a given kmem_cache.
>
> This lays the foundation for a generic reserve framework as presented
> in a later patch in this series. This fr
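A rough illustration of the kind of bound the patch description refers to: a minimal sketch, assuming the bound is simply the number of slabs needed times the pages per slab. The names cache_model, objs_per_slab and order are hypothetical and not taken from the patch, and a real kmem_cache bound may need extra slack for per-CPU or per-node partial slabs.

```c
#include <assert.h>

/* Hypothetical stand-in for the few cache properties the bound needs. */
struct cache_model {
    unsigned objs_per_slab; /* objects that fit in one slab */
    unsigned order;         /* each slab spans 2^order pages */
};

/* Upper bound on pages needed to allocate `objects` objects, assuming
 * no existing partially filled slab can be reused. */
static unsigned long pages_upper_bound(const struct cache_model *c,
                                       unsigned long objects)
{
    assert(c->objs_per_slab > 0);
    /* Round up: a slab holding even one object costs all of its pages. */
    unsigned long slabs = (objects + c->objs_per_slab - 1) / c->objs_per_slab;
    return slabs << c->order;
}
```

With 30 objects per order-1 slab, for example, 31 objects need two slabs and so at most four pages under these assumptions.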
Hi Peter,
sysctl_intvec_fragment, proc_dointvec_fragment, sysctl_intvec_fragment
seem to suffer from cut-n-pastitis.
Regards,
Daniel
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/
Hi Peter,
A major feature of this patch set is the network receive deadlock
avoidance, but there is quite a bit of stuff bundled with it, the NFS
user accounting for a big part of the patch by itself.
Is it possible to provide a before and after demonstration case for just
the network receive
On Friday 31 August 2007 14:41, Alasdair G Kergon wrote:
> On Thu, Aug 30, 2007 at 04:20:35PM -0700, Daniel Phillips wrote:
> > Resubmitting a bio or submitting a dependent bio from
> > inside a block driver does not need to be throttled because all
> > resources required to
On Wednesday 29 August 2007 01:53, Evgeniy Polyakov wrote:
> Then, if of course you will want, which I doubt, you can reread
> previous mails and find that it was pointed to that race and
> possibilities to solve it way too long ago.
What still bothers me about your response is that, while you kno
On Tuesday 28 August 2007 10:54, Evgeniy Polyakov wrote:
> On Tue, Aug 28, 2007 at 10:27:59AM -0700, Daniel Phillips ([EMAIL PROTECTED])
> wrote:
> > > We do not care about one cpu being able to increase its counter
> > > higher than the limit, such inaccuracy (maximum b
On Tuesday 28 August 2007 02:35, Evgeniy Polyakov wrote:
> On Mon, Aug 27, 2007 at 02:57:37PM -0700, Daniel Phillips
([EMAIL PROTECTED]) wrote:
> > Say Evgeniy, something I was curious about but forgot to ask you
> > earlier...
> >
> > On Wednesday 08 August 2007 03
Say Evgeniy, something I was curious about but forgot to ask you
earlier...
On Wednesday 08 August 2007 03:17, Evgeniy Polyakov wrote:
> ...All operations are not atomic, since we do not care about precise
> number of bios, but a fact, that we are close or close enough to the
> limit.
> ... in bi
On Tuesday 14 August 2007 05:46, Evgeniy Polyakov wrote:
> > The throttling of the virtual device must begin in
> > generic_make_request and last to ->endio. You release the throttle
> > of the virtual device at the point you remap the bio to an
> > underlying device, which you have convinced your
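The rule quoted above (the throttle is taken in generic_make_request and held until ->endio) can be modelled in a few lines. This is a single-threaded sketch under stated assumptions: throttle_model, throttle_submit and throttle_endio are hypothetical names, and a real submitter would sleep rather than get a refusal.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-device throttle: a count of bios in flight and a cap. */
struct throttle_model {
    int in_flight;
    int limit;
};

/* Models entry to generic_make_request: take a slot or refuse. */
static bool throttle_submit(struct throttle_model *t)
{
    if (t->in_flight >= t->limit)
        return false;   /* a real submitter would block here */
    t->in_flight++;
    return true;
}

/* Models ->endio: only completion gives the slot back, so remapping
 * the bio to an underlying device does not release the throttle. */
static void throttle_endio(struct throttle_model *t)
{
    assert(t->in_flight > 0);
    t->in_flight--;
}
```

In this model there is deliberately no operation that releases a slot at remap time; that is the behaviour the message argues for.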
On Tuesday 14 August 2007 04:50, Evgeniy Polyakov wrote:
> On Tue, Aug 14, 2007 at 04:35:43AM -0700, Daniel Phillips
([EMAIL PROTECTED]) wrote:
> > On Tuesday 14 August 2007 04:30, Evgeniy Polyakov wrote:
> > > > And it will not solve the deadlock problem in general. (Maybe
On Tuesday 14 August 2007 04:30, Evgeniy Polyakov wrote:
> > And it will not solve the deadlock problem in general. (Maybe it
> > works for your virtual device, but I wonder...) If the virtual
> > device allocates memory during generic_make_request then the memory
> > needs to be throttled.
>
> D
On Tuesday 14 August 2007 01:46, Evgeniy Polyakov wrote:
> On Mon, Aug 13, 2007 at 06:04:06AM -0700, Daniel Phillips
([EMAIL PROTECTED]) wrote:
> > Perhaps you never worried about the resources that the device
> > mapper mapping function allocates to handle each bio and so did n
On Monday 13 August 2007 02:12, Jens Axboe wrote:
> > It is a system wide problem. Every block device needs throttling,
> > otherwise queues expand without limit. Currently, block devices
> > that use the standard request library get a slipshod form of
> > throttling for free in the form of limit
On Monday 13 August 2007 05:18, Evgeniy Polyakov wrote:
> > Say you have a device mapper device with some physical device
> > sitting underneath, the classic use case for this throttle code.
> > Say 8,000 threads each submit an IO in parallel. The device mapper
> > mapping function will be called
On Monday 13 August 2007 05:04, Evgeniy Polyakov wrote:
> On Mon, Aug 13, 2007 at 04:04:26AM -0700, Daniel Phillips
([EMAIL PROTECTED]) wrote:
> > On Monday 13 August 2007 01:14, Evgeniy Polyakov wrote:
> > > > Oops, and there is also:
> > > >
> > > >
On Monday 13 August 2007 04:03, Evgeniy Polyakov wrote:
> On Mon, Aug 13, 2007 at 03:12:33AM -0700, Daniel Phillips
([EMAIL PROTECTED]) wrote:
> > > This is not a very good solution, since it requires all users of
> > > the bios to know how to free it.
> >
> > No
On Monday 13 August 2007 01:23, Evgeniy Polyakov wrote:
> On Sun, Aug 12, 2007 at 10:36:23PM -0700, Daniel Phillips
([EMAIL PROTECTED]) wrote:
> > (previous incomplete message sent accidentally)
> >
> > On Wednesday 08 August 2007 02:54, Evgeniy Polyakov wrote:
> > >
On Monday 13 August 2007 01:14, Evgeniy Polyakov wrote:
> > Oops, and there is also:
> >
> > 3) The bio throttle, which is supposed to prevent deadlock, can
> > itself deadlock. Let me see if I can remember how it goes.
> >
> > * generic_make_request puts a bio in flight
> > * the bio gets pas
On Monday 13 August 2007 03:22, Jens Axboe wrote:
> I never compared the bio to struct page, I'd obviously agree that
> shrinking struct page was a worthy goal and that it'd be ok to uglify
> some code to do that. The same isn't true for struct bio.
I thought I just said that.
Regards,
Daniel
On Monday 13 August 2007 03:06, Jens Axboe wrote:
> On Mon, Aug 13 2007, Daniel Phillips wrote:
> > Of course not. Nothing I said stops endio from being called in the
> > usual way as well. For this to work, endio just needs to know that
> > one call means "end"
On Monday 13 August 2007 02:18, Evgeniy Polyakov wrote:
> On Mon, Aug 13, 2007 at 02:08:57AM -0700, Daniel Phillips
([EMAIL PROTECTED]) wrote:
> > > But that idea fails as well, since reference counts and IO
> > > completion are two completely separate entities. So unl
On Monday 13 August 2007 02:13, Jens Axboe wrote:
> On Mon, Aug 13 2007, Daniel Phillips wrote:
> > On Monday 13 August 2007 00:45, Jens Axboe wrote:
> > > On Mon, Aug 13 2007, Jens Axboe wrote:
> > > > > You did not comment on the one about putting the bio
>
On Monday 13 August 2007 00:45, Jens Axboe wrote:
> On Mon, Aug 13 2007, Jens Axboe wrote:
> > > You did not comment on the one about putting the bio destructor
> > > in the ->endio handler, which looks dead simple. The majority of
> > > cases just use the default endio handler and the default
> >
On Monday 13 August 2007 00:28, Jens Axboe wrote:
> On Sun, Aug 12 2007, Daniel Phillips wrote:
> > Right, that is done by bi_vcnt. I meant bi_max_vecs, which you can
> > derive efficiently from BIO_POOL_IDX() provided the bio was
> > allocated in the standard way.
>
>
On Sunday 12 August 2007 22:36, I wrote:
> Note! There are two more issues I forgot to mention earlier.
Oops, and there is also:
3) The bio throttle, which is supposed to prevent deadlock, can itself
deadlock. Let me see if I can remember how it goes.
* generic_make_request puts a bio in fl
(previous incomplete message sent accidentally)
On Wednesday 08 August 2007 02:54, Evgeniy Polyakov wrote:
> On Tue, Aug 07, 2007 at 10:55:38PM +0200, Jens Axboe wrote:
>
> So, what did we decide? To bloat bio a bit (add a queue pointer) or
> to use physical device limits? The latter requires to r
On Wednesday 08 August 2007 02:54, Evgeniy Polyakov wrote:
> On Tue, Aug 07, 2007 at 10:55:38PM +0200, Jens Axboe
([EMAIL PROTECTED]) wrote:
>
> So, what did we decide? To bloat bio a bit (add a queue pointer) or
> to use physical device limits? The latter requires to replace all
> occurrences of bi
On Tuesday 07 August 2007 13:55, Jens Axboe wrote:
> I don't like structure bloat, but I do like nice design. Overloading
> is a necessary evil sometimes, though. Even today, there isn't enough
> room to hold bi_rw and bi_flags in the same variable on 32-bit archs,
> so that concern can be scratche
Hi Evgeniy,
Sorry for not getting back to you right away, I was on the road with
limited email access. Incidentally, the reason my mails to you keep
bouncing is, your MTA is picky about my mailer's IP reversing to a real
hostname. I will take care of that pretty soon, but for now my direct
m
On Tuesday 07 August 2007 05:05, Jens Axboe wrote:
> On Sun, Aug 05 2007, Daniel Phillips wrote:
> > A simple way to solve the stable accounting field issue is to add a
> > new pointer to struct bio that is owned by the top level submitter
> > (normally generic_make_request b
On Sunday 05 August 2007 08:01, Evgeniy Polyakov wrote:
> On Sun, Aug 05, 2007 at 01:06:58AM -0700, Daniel Phillips wrote:
> > > DST original code worked as device mapper plugin too, but its two
> > > additional allocations (io and clone) per block request ended up
> >
On Sunday 05 August 2007 08:08, Evgeniy Polyakov wrote:
> If we are sleeping in memory pool, then we already do not have memory
> to complete previous requests, so we are in trouble.
Not at all. Any requests in flight are guaranteed to get the resources
they need to complete. This is guaranteed
On Saturday 04 August 2007 09:44, Evgeniy Polyakov wrote:
> > On Tuesday 31 July 2007 10:13, Evgeniy Polyakov wrote:
> > > * storage can be formed on top of remote nodes and be
> > > exported simultaneously (iSCSI is peer-to-peer only, NBD requires
> > > device mapper and is synchronous)
> >
>
On Saturday 04 August 2007 09:37, Evgeniy Polyakov wrote:
> On Fri, Aug 03, 2007 at 06:19:16PM -0700, I wrote:
> > To be sure, I am not very proud of this throttling mechanism for
> > various reasons, but the thing is, _any_ throttling mechanism no
> > matter how sucky solves the deadlock problem.
On Friday 03 August 2007 03:26, Evgeniy Polyakov wrote:
> On Thu, Aug 02, 2007 at 02:08:24PM -0700, I wrote:
> > I see bits that worry me, e.g.:
> >
> > + req = mempool_alloc(st->w->req_pool, GFP_NOIO);
> >
> > which seems to be callable in response to a local request, just the
> > case w
Hi Mike,
On Thursday 02 August 2007 21:09, Mike Snitzer wrote:
> But NBD's synchronous nature is actually an asset when coupled with
> MD raid1 as it provides guarantees that the data has _really_ been
> mirrored remotely.
And bio completion doesn't?
Regards,
Daniel
Hi Evgeniy,
Nit alert:
On Tuesday 31 July 2007 10:13, Evgeniy Polyakov wrote:
> * storage can be formed on top of remote nodes and be exported
> simultaneously (iSCSI is peer-to-peer only, NBD requires device
> mapper and is synchronous)
In fact, NBD has nothing to do with device
On Friday 03 August 2007 07:53, Peter Zijlstra wrote:
> On Fri, 2007-08-03 at 17:49 +0400, Evgeniy Polyakov wrote:
> > On Fri, Aug 03, 2007 at 02:27:52PM +0200, Peter Zijlstra wrote:
> > ...my main position is to
> > allocate per socket reserve from socket's queue, and copy data
> > there from main
On Friday 03 August 2007 06:49, Evgeniy Polyakov wrote:
> ...rx has global reserve (always allocated on
> startup or sometime way before reclaim/oom) where data is originally
> received (including skb, shared info and whatever is needed, page is
> just an example), then it is copied into per-socket
On Tuesday 31 July 2007 10:13, Evgeniy Polyakov wrote:
> Hi.
>
> I'm pleased to announce first release of the distributed storage
> subsystem, which allows to form a storage on top of remote and local
> nodes, which in turn can be exported to another storage as a node to
> form tree-like storages.
Andrew Morton wrote:
handwaving
- The mmap(MAP_SHARED)-the-whole-world scenario should be fixed by
mm-tracking-shared-dirty-pages.patch. Please test it and if you are
still able to demonstrate deadlocks, describe how, and why they
are occurring.
OK, but please see "atomic 0 order alloc
Andrew Morton wrote:
...in my earlier emails I asked a number of questions regarding
whether existing facilities, queued patches or further slight kernel
changes could provide a sufficient solution to these problems. The answer
may well be "no". But diligence requires that we be able to prove t
Andrew Morton wrote:
Daniel Phillips wrote:
Andrew Morton wrote:
...it's runtime configurable.
So we default to "less than the best" because we are too lazy to fix the
network starvation issue properly? Maybe we don't really need a mempool for
struct bio either, isn
Daniel Phillips wrote:
Andrew Morton wrote:
Processes which are dirtying those pages throttle at
/proc/sys/vm/dirty_ratio% of memory dirty. So it is not possible to "fill"
memory with dirty pages. If the amount of physical memory which is dirty
exceeds 40%: bug.
So we make 400 MB
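The arithmetic behind the 400 MB figure can be spelled out. This is a simplified sketch: it assumes a machine with about 1000 MB of memory (the message is cut off before saying so) and applies the ratio to total memory, whereas the kernel applies it to a narrower notion of dirtyable memory. The function name dirty_limit_mb is hypothetical.

```c
#include <assert.h>

/* Amount of memory, in MB, that may be dirty before writers throttle,
 * for a given dirty_ratio percentage. Simplified: uses total memory. */
static unsigned long dirty_limit_mb(unsigned long total_mb,
                                    unsigned ratio_percent)
{
    assert(ratio_percent <= 100);
    return total_mb * ratio_percent / 100;
}
```

Under those assumptions, dirty_limit_mb(1000, 40) gives 400.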
Andrew Morton wrote:
Daniel Phillips <[EMAIL PROTECTED]> wrote:
What happened to the case where we just fill memory full of dirty file
pages backed by a remote disk?
Processes which are dirtying those pages throttle at
/proc/sys/vm/dirty_ratio% of memory dirty. So it is not possible to
Evgeniy Polyakov wrote:
On Thu, Aug 17, 2006 at 09:15:14PM +0200, Peter Zijlstra ([EMAIL PROTECTED])
wrote:
I got openssh as example of situation when system does not know in
advance, what sockets must be marked as critical.
OpenSSH works with network and unix sockets in parallel, so you need
Evgeniy Polyakov wrote:
Just for clarification - it will be completely impossible to login using
openssh or some other privilege separation protocol to the machine due
to the nature of unix sockets. So you will be unable to manage your
storage system just because it is in OOM - it is not what i
Evgeniy Polyakov wrote:
On Mon, Aug 14, 2006 at 08:45:43AM +0200, Peter Zijlstra ([EMAIL PROTECTED])
wrote:
Just pure openssh for control connection (admin should be able to
login).
These periods of degenerated functionality should be short and
infrequent albeit critical for machine recovery.
Evgeniy Polyakov wrote:
On Sun, Aug 13, 2006 at 01:16:15PM -0700, Daniel Phillips ([EMAIL PROTECTED])
wrote:
Indeed. The rest of the corner cases like netfilter, layered protocol and
so on need to be handled, however they do not need to be handled right now
in order to make remote storage on
Andrew Morton wrote:
What is a "socket wait queue" and how/why can it consume so much memory?
Two things:
1) sk_buffs in flight between device receive interrupt and layer 3
protocol/socket identification.
2) sk_buffs queued onto a particular socket waiting for some task to
come
Andrew Morton wrote:
Peter Zijlstra <[EMAIL PROTECTED]> wrote:
Testcase:
Mount an NBD device as sole swap device and mmap > physical RAM, then
loop through touching pages only once.
Fix: don't try to swap over the network. Yes, there may be some scenarios
where people have no local storage,
David Miller wrote:
I think there is more profitability from a solution that really does
something about "network memory", and doesn't try to say "these
devices are special" or "these sockets are special". Special cases
generally suck.
We already limit and control TCP socket memory globally in
David Miller wrote:
From: Daniel Phillips <[EMAIL PROTECTED]>
David Miller wrote:
The reason is that there is no refcounting performed on these devices
when they are attached to the skb, for performance reasons, and thus
the device can be downed, the module for it removed, etc. long
Evgeniy Polyakov wrote:
One must receive a packet to determine if that packet must be dropped
until tricky hardware with header split capabilities or MMIO copying is
used. Peter uses special pool to get data from when system is in OOM (at
least in his latest patchset), so allocations are separate
Peter Zijlstra wrote:
On Sat, 2006-08-12 at 20:16 +0200, Indan Zupancic wrote:
What was missing or wrong in the old approach? Can't you use the new
approach, but use alloc_pages() instead of SROG?
Sorry if I bug you so, but I'm also trying to increase my knowledge here. ;-)
I'm almost sorry I
Peter Zijlstra wrote:
Jeff Garzik in his infinite wisdom spake thusly:
Peter Zijlstra wrote:
Index: linux-2.6/include/linux/gfp.h
===
--- linux-2.6.orig/include/linux/gfp.h 2006-08-12 12:56:06.0 +0200
+++ linux-2.6/includ
Rik van Riel wrote:
Thomas Graf wrote:
skb->dev is not guaranteed to still point to the "allocating" device
once the skb is freed again so reserve/unreserve isn't symmetric.
You'd need skb->alloc_dev or something.
There's another consequence of this property of the network
stack.
Every networ
David Miller wrote:
From: Peter Zijlstra <[EMAIL PROTECTED]>
Hmm, what does sk_buff::input_dev do? That seems to store the initial
device?
You can run grep on the tree just as easily as I can which is what I
did to answer this question. It only takes a few seconds of your
time to grep the sou
Peter Zijlstra wrote:
On Wed, 2006-08-09 at 16:54 -0700, David Miller wrote:
People are doing I/O over IP exactly for its ubiquity and
flexibility. It seems a major limitation of the design if you cancel
out major components of this flexibility.
We're not, that was a bit of my own frustratio
David Miller wrote:
I think he's saying that he doesn't think your code is yet a
reasonable way to solve the problem, and therefore doesn't belong
upstream.
That is why it has not yet been submitted upstream. Respectfully, I
do not think that jgarzik has yet put in the work to know if this ant
Evgeniy Polyakov wrote:
On Tue, Aug 08, 2006 at 09:33:25PM +0200, Peter Zijlstra ([EMAIL PROTECTED])
wrote:
http://lwn.net/Articles/144273/
"Kernel Summit 2005: Convergence of network and storage paths"
We believe that an approach very much like today's patch set is
necessary for NBD, iSC
Jeff Garzik wrote:
Peter Zijlstra wrote:
Update the driver to make use of the netdev_alloc_skb() API and the
NETIF_F_MEMALLOC feature.
NETIF_F_MEMALLOC does not exist in the upstream tree... nor should it,
IMO.
Elaborate please. Do you think that all drivers should be updated to
fix the b
David Miller wrote:
From: Daniel Phillips <[EMAIL PROTECTED]>
>>Can you please characterize the conditions under which skb->dev changes
after the alloc? Are there writings on this subtlety?
The packet scheduler and classifier can redirect packets to different
devices, and ca
David Miller wrote:
From: Daniel Phillips <[EMAIL PROTECTED]>
David Miller wrote:
I think the new atomic operation that will seemingly occur on every
device SKB free is unacceptable.
Alternate suggestion?
Sorry, I have none. But you're unlikely to get your changes
considere
Hi Dave,
David Miller wrote:
I think the new atomic operation that will seemingly occur on every
device SKB free is unacceptable.
Alternate suggestion?
You also cannot modify netdev->flags in the lockless manner in which
you do, it must be done with the appropriate locking, such as holding
t
Thomas Graf wrote:
> skb->dev is not guaranteed to still point to the "allocating" device
once the skb is freed again so reserve/unreserve isn't symmetric.
You'd need skb->alloc_dev or something.
Can you please characterize the conditions under which skb->dev changes
after the alloc? Are ther
Stephen Hemminger wrote:
How much of this is just building special case support for large allocations
for jumbo frames? Wouldn't it make more sense to just fix those drivers to
do scatter and add the support hooks for that?
Short answer: none of it is. If it happens to handle jumbo frames nice
Indan Zupancic wrote:
Hello,
Saw the patch on lkml, and wondered about some things.
On Tue, August 8, 2006 21:33, Peter Zijlstra said:
+static inline void dev_unreserve_skb(struct net_device *dev)
+{
+ if (atomic_dec_return(&dev->rx_reserve_used) < 0)
+ atomic_inc(&dev->rx_r
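The decrement-then-undo pattern in the quoted hunk can be tried out in userspace. This is a sketch with C11 atomics, assuming the truncated second line re-increments the same counter; unreserve_model and reserve_model are hypothetical names, and reserve_model is an invented counterpart added only so the example is complete.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical counterpart: take one unit of reserve.
 * Returns 0 on success, -1 if the limit would be exceeded. */
static int reserve_model(atomic_int *used, int limit)
{
    assert(limit > 0);
    if (atomic_fetch_add(used, 1) + 1 > limit) {
        atomic_fetch_sub(used, 1);  /* over the limit: undo */
        return -1;
    }
    return 0;
}

/* Models the quoted dev_unreserve_skb(). atomic_fetch_sub returns the
 * old value, so old - 1 is what atomic_dec_return() would return. */
static void unreserve_model(atomic_int *used)
{
    if (atomic_fetch_sub(used, 1) - 1 < 0)
        atomic_fetch_add(used, 1);  /* went negative: undo */
}
```

Note that between the decrement and the undo another CPU can observe a negative count, so readers of the counter have to tolerate that transient.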
Hi Harald,
You wrote:
On Tue, Jun 13, 2006 at 02:12:41PM -0700, I wrote:
This has the makings of a nice stable internal kernel api. Why do we want
to provide this nice stable internal api to proprietary modules?
because there is IMHO legally nothing we can do about it anyway.
Speaking as
Chase Venters wrote:
can you name some non-GPL non-proprietary modules we should be concerned
about?
You probably meant "non-GPL-compatible non-proprietary". If so, then by
definition there are none.
Regards,
Daniel
Brian F. G. Bidulock wrote:
Stephen,
On Tue, 13 Jun 2006, Stephen Hemminger wrote:
@@ -2176,3 +2279,13 @@ EXPORT_SYMBOL(sock_wake_async);
EXPORT_SYMBOL(sockfd_lookup);
EXPORT_SYMBOL(kernel_sendmsg);
EXPORT_SYMBOL(kernel_recvmsg);
+EXPORT_SYMBOL(kernel_bind);
+EXPORT_SYMBOL(kernel_listen);
+EX
On Tuesday 08 November 2005 10:13, Greg KH wrote:
> On Thu, Nov 03, 2005 at 07:55:38AM +1100, Herbert Xu wrote:
> > The recent rewrite of skb_copy_datagram_iovec broke the reception of
> > zero-size datagrams. This patch fixes it.
> >
> > Signed-off-by: Herbert Xu <[EMAIL PROTECTED]>
> >
> > Pleas
Gah, this time the revised patch is included, not just the diffstat.
datagram.c | 82 +++--
1 files changed, 26 insertions(+), 56 deletions(-)
diff -up --recursive 2.6.12.3.clean/net/core/datagram.c
2.6.12.3/net/core/datagram.c
--- 2.6.1
On Thursday 25 August 2005 03:30, David S. Miller wrote:
> From: Daniel Phillips <[EMAIL PROTECTED]>
> > As far as I can see, it is illegal for any but the first skb to have
> > a non-null skb_shinfo(skb)->frag_list, is this correct?
>
> As currently used, yes.
T
The fragment list handling was wrong in the previous version, now correct I
think.
datagram.c | 82 +++--
1 files changed, 26 insertions(+), 56 deletions(-)
diff -up --recursive 2.6.12.3.clean/net/core/datagram.c
2.6.12.3/net/core/datagra
On Thursday 25 August 2005 02:44, David S. Miller wrote:
> Frag lists cannot be deeper than one level of nesting,
> and I think the recursive version is easier to understand,
> so I really don't see the value of your change.
Losing 34 lines of a 74 line function is the value.
The real problem wit
Hi,
I noticed that skb_copy_datagram_iovec calls itself recursively to copy a
fragment list. This isn't actually wrong or even inefficient, it is just
somehow disturbing. Oh, and it uses an extra stack frame, and is hard to
read. Once I got started straightening that out, I couldn't resist clea
On Thursday 18 August 2005 07:02, John Heffner wrote:
> 1) Bridging occurs at the link layer (ethernet); fragmenting occurs at the
> network layer (IP).
>
> 2) A lot of protocols set the Don't Fragment bit, so you can't always
> fragment anyway.
>
> What you might want to do is set it up as an IP r
Ahem:
+ }
adjust_memalloc_reserve(-netdev->memalloc_pages);
- }
Regards,
Daniel
Hi,
This version corrects a couple of bugs previously noted and ties up some loose
ends in the e1000 driver. Some versions of this driver support packet
splitting into multiple pages, with just the protocol header in the skb
itself. This is a very good thing because it avoids the high order page
Hi,
A couple of goofs. First, the sysctl interface to min_free_kbytes could stomp
on any in-kernel adjustments. Now there are two variables, summed in
setup_per_zone_pages_min: min_free_kbytes and var_free_kbytes. The
adjust_memalloc_reserve operates only the latter, so the user can freely
twid
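The two-variable arrangement described above can be sketched as a small userspace model. The variable names min_free_kbytes and var_free_kbytes come from the message; the function names, the kilobyte units for the adjustment, and the initial value are assumptions made for illustration.

```c
#include <assert.h>

static long min_free_kbytes = 1024; /* user-owned: written by the sysctl */
static long var_free_kbytes;        /* kernel-owned: adjusted by drivers */

/* What setup_per_zone_pages_min would see: the sum of both. */
static long effective_min_free_kbytes(void)
{
    return min_free_kbytes + var_free_kbytes;
}

/* Models a sysctl write: touches only the user-owned variable. */
static void sysctl_write_model(long kbytes)
{
    assert(kbytes >= 0);
    min_free_kbytes = kbytes;
}

/* Models adjust_memalloc_reserve: touches only the kernel-owned
 * variable. Returns -1 rather than let that part go negative. */
static int adjust_reserve_model(long delta_kbytes)
{
    if (var_free_kbytes + delta_kbytes < 0)
        return -1;
    var_free_kbytes += delta_kbytes;
    return 0;
}
```

Because each writer owns its own variable, a sysctl write after a driver adjustment changes the total without discarding the driver's contribution.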
Hi,
This version introduces the idea of having a network driver adjust the global
memalloc reserve when it brings an interface up or down. The interface is:
int adjust_memalloc_reserve(int bytes)
which is just a thin shell over the min_free_kbytes interface that already
exists.
The d
Hi,
This patch fills in some missing pieces:
* Support v4 udp: same as v4 tcp, when in reserve, drop packets on
noncritical sockets
* Support v4 icmp: when in reserve, drop icmp traffic
* Add reserve skb support to e1000 driver
* API for dropping packets before delivery (dev_d
On Sunday 07 August 2005 06:26, Patrick McHardy wrote:
> > Anyway, do we not want BUG_ON(!atomic_read(&skb->users)) at the beginning
> > of kfree_skb, since we rely on it?
>
> Why do you care if skb->users is 0 or 1 in __kfree_skb()?
Because I am a neatness freak and I like to check things that in
Hi,
The way I read this, __kfree_skb will sometimes be called with ->users = 1 and
sometimes with ->users = 0, is that right?
static inline void kfree_skb(struct sk_buff *skb)
{
if (likely(atomic_read(&skb->users) == 1))
smp_rmb();
else if (likely(!atomic_dec_an
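One way to check that reading is a userspace model with C11 atomics. This is a sketch, assuming the truncated line completes as a decrement-and-test that returns early when other holders remain; skb_model and the function names are hypothetical. The initial read is passed in as a parameter so that a race, in which two holders both read users == 2, can be replayed deterministically.

```c
#include <assert.h>
#include <stdatomic.h>

struct skb_model {
    atomic_int users;
    int freed;
    int users_seen_at_free; /* what ->users held when the free ran */
};

/* Stand-in for __kfree_skb(): records ->users instead of freeing. */
static void free_model(struct skb_model *skb)
{
    assert(!skb->freed);    /* a double free would be a bug */
    skb->users_seen_at_free = atomic_load(&skb->users);
    skb->freed = 1;
}

/* Body of kfree_skb(), acting on the value the caller already read. */
static void kfree_skb_after_read(struct skb_model *skb, int observed)
{
    if (observed == 1)
        atomic_thread_fence(memory_order_acquire); /* in place of smp_rmb() */
    else if (atomic_fetch_sub(&skb->users, 1) != 1)
        return;             /* other holders remain */
    free_model(skb);
}

static void kfree_skb_model(struct skb_model *skb)
{
    kfree_skb_after_read(skb, atomic_load(&skb->users));
}
```

In this model the fast path frees with users still at 1, while the last of two racing holders frees with users at 0, which matches the reading in the message.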
On Saturday 06 August 2005 18:40, David S. Miller wrote:
> From: Daniel Phillips <[EMAIL PROTECTED]>
> Date: Sat, 6 Aug 2005 04:52:07 +1000
>
> > So then there is no choice but to throttle the per-cpu ->input_pkt
> > queues.
>
> Make the driver support NAPI if
On Sunday 07 August 2005 02:07, Jeff Garzik wrote:
> > +static inline struct sk_buff *__dev_memalloc_skb(struct net_device *dev,
> > + unsigned length, int gfp_mask)
> > +{
> > + struct sk_buff *skb = __dev_alloc_skb(length, gfp_mask);
> > + if (skb)
> > + goto done;
> > + if (dev
On Saturday 06 August 2005 12:32, Steven Rostedt wrote:
> > > If you need to really get the data out, then the design should be
> > > changed. Have some return value showing the failure, check for
> > > oops_in_progress or whatever, and try again after turning interrupts
> > > back on, and getting
Hi,
This version does not do blatantly stupid things in hardware irq context, is
more efficient, and... wow the patch is smaller! (That never happens.)
I don't mark skbs as being allocated from reserve any more. That works, but
it is slightly bogus, because it doesn't matter which skb came from
On Sunday 07 August 2005 03:54, Daniel Phillips wrote:
> On Saturday 06 August 2005 18:40, David S. Miller wrote:
> > From: Daniel Phillips <[EMAIL PROTECTED]>
> > Date: Sat, 6 Aug 2005 04:52:07 +1000
> >
> > > So then there is no choice but to throttle
On Saturday 06 August 2005 11:22, Matt Mackall wrote:
> On Sat, Aug 06, 2005 at 01:51:22AM +0200, Andi Kleen wrote:
> > > But why are we in a hurry to dump the backlog on the floor? Why are we
> > > worrying about the performance of netpoll without the cable plugged in
> > > at all? We shouldn't be
On Saturday 06 August 2005 02:33, David S. Miller wrote:
> You can't call into the networking packet input path from
> hardware interrupt context, it simply is not allowed.
>
> And that's the context in which netif_rx() gets called.
Duh. I assumed we already were in softirq context here (but with
On Saturday 06 August 2005 03:49, Dave Jones wrote:
> On Fri, Aug 05, 2005 at 01:20:59PM -0400, John W. Linville wrote:
> > On Sat, Aug 06, 2005 at 02:41:30AM +1000, Daniel Phillips wrote:
> > > On Friday 05 August 2005 13:04, Mateusz Berezecki wrote:
> > > > I acc
On Friday 05 August 2005 13:04, Mateusz Berezecki wrote:
> I accidentally posted the patches as MIME attachments... it's 5:03 am here
> already. Sorry guys.
> I can resubmit if you want. I just don't want to do that now and not trash
> your mailboxes
Does anybody still care if patches are posted as atta
Hi,
OK, I am still a network klutz. The attached patch changes netif_rx to call
netif_receive_skb directly instead of going through softnet. It works with
my e1000 here, but eventually oopses under moderate load. I see that a few
drivers use netif_receive_skb directly, sometimes together wit
Whoops:
- return __dev_alloc_skb(length, gfp_mask | __GFP_MEMALLOC);
+ return __dev_alloc_skb(length, GFP_ATOMIC|__GFP_MEMALLOC);
Regards,
Daniel
Hi,
I spent the last day mulling things over and doing research. It seems to me
that the patch as first posted is correct and solves the deadlock, except
that some uses of __GFP_MEMALLOC in __dev_alloc_skb may escape into contexts
where the reserve is not guaranteed to be reclaimed. It may be