from:"Daniel Phillips"

Re: [PATCH 03/29] mm: slb: add knowledge of reserve pages

2007-12-15 Thread Daniel Phillips

On Friday 14 December 2007 14:51, I wrote: > On Friday 14 December 2007 07:39, Peter Zijlstra wrote: > Note that false sharing of slab pages is still possible between two > unrelated writeout processes, both of which obey rules for their own > writeout path, but the pinned combination does not. Th

Re: [PATCH 03/29] mm: slb: add knowledge of reserve pages

2007-12-14 Thread Daniel Phillips

On Friday 14 December 2007 07:39, Peter Zijlstra wrote: > Restrict objects from reserve slabs (ALLOC_NO_WATERMARKS) to > allocation contexts that are entitled to it. This is done to ensure > reserve pages don't leak out and get consumed. Tighter definitions of "leak out" and "get consumed" would b

Re: [PATCH 04/29] mm: kmem_estimate_pages()

2007-12-14 Thread Daniel Phillips

On Friday 14 December 2007 07:39, Peter Zijlstra wrote: > Provide a method to get the upper bound on the pages needed to > allocate a given number of objects from a given kmem_cache. > > This lays the foundation for a generic reserve framework as presented > in a later patch in this series. This fr

Re: [PATCH 16/29] netvm: INET reserves.

2007-12-14 Thread Daniel Phillips

Hi Peter, sysctl_intvec_fragment, proc_dointvec_fragment, sysctl_intvec_fragment seem to suffer from cut-n-pastitis. Regards, Daniel -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/

Re: [PATCH 00/29] Swap over NFS -v15

2007-12-14 Thread Daniel Phillips

Hi Peter, A major feature of this patch set is the network receive deadlock avoidance, but there is quite a bit of stuff bundled with it, the NFS user accounting for a big part of the patch by itself. Is it possible to provide a before and after demonstration case for just the network receive

Re: [1/1] Block device throttling [Re: Distributed storage.]

2007-09-01 Thread Daniel Phillips

On Friday 31 August 2007 14:41, Alasdair G Kergon wrote: > On Thu, Aug 30, 2007 at 04:20:35PM -0700, Daniel Phillips wrote: > > Resubmitting a bio or submitting a dependent bio from > > inside a block driver does not need to be throttled because all > > resources required to

Re: [1/1] Block device throttling [Re: Distributed storage.]

2007-08-30 Thread Daniel Phillips

On Wednesday 29 August 2007 01:53, Evgeniy Polyakov wrote: > Then, if of course you will want, which I doubt, you can reread > previous mails and find that it was pointed to that race and > possibilities to solve it way too long ago. What still bothers me about your response is that, while you kno

Re: [1/1] Block device throttling [Re: Distributed storage.]

2007-08-28 Thread Daniel Phillips

On Tuesday 28 August 2007 10:54, Evgeniy Polyakov wrote: > On Tue, Aug 28, 2007 at 10:27:59AM -0700, Daniel Phillips ([EMAIL PROTECTED]) > wrote: > > > We do not care about one cpu being able to increase its counter > > > higher than the limit, such inaccuracy (maximum b

Re: [1/1] Block device throttling [Re: Distributed storage.]

2007-08-28 Thread Daniel Phillips

On Tuesday 28 August 2007 02:35, Evgeniy Polyakov wrote: > On Mon, Aug 27, 2007 at 02:57:37PM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > > Say Evgeniy, something I was curious about but forgot to ask you > > earlier... > > > > On Wednesday 08 August 2007 03

Re: [1/1] Block device throttling [Re: Distributed storage.]

2007-08-27 Thread Daniel Phillips

Say Evgeniy, something I was curious about but forgot to ask you earlier... On Wednesday 08 August 2007 03:17, Evgeniy Polyakov wrote: > ...All operations are not atomic, since we do not care about precise > number of bios, but a fact, that we are close or close enough to the > limit. > ... in bi

Re: Block device throttling [Re: Distributed storage.]

2007-08-14 Thread Daniel Phillips

On Tuesday 14 August 2007 05:46, Evgeniy Polyakov wrote: > > The throttling of the virtual device must begin in > > generic_make_request and last to ->endio. You release the throttle > > of the virtual device at the point you remap the bio to an > > underlying device, which you have convinced your

Re: Block device throttling [Re: Distributed storage.]

2007-08-14 Thread Daniel Phillips

On Tuesday 14 August 2007 04:50, Evgeniy Polyakov wrote: > On Tue, Aug 14, 2007 at 04:35:43AM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > > On Tuesday 14 August 2007 04:30, Evgeniy Polyakov wrote: > > > > And it will not solve the deadlock problem in general. (Maybe

Re: Block device throttling [Re: Distributed storage.]

2007-08-14 Thread Daniel Phillips

On Tuesday 14 August 2007 04:30, Evgeniy Polyakov wrote: > > And it will not solve the deadlock problem in general. (Maybe it > > works for your virtual device, but I wonder...) If the virtual > > device allocates memory during generic_make_request then the memory > > needs to be throttled. > > D

Re: Block device throttling [Re: Distributed storage.]

2007-08-14 Thread Daniel Phillips

On Tuesday 14 August 2007 01:46, Evgeniy Polyakov wrote: > On Mon, Aug 13, 2007 at 06:04:06AM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > > Perhaps you never worried about the resources that the device > > mapper mapping function allocates to handle each bio and so did n

Re: Distributed storage.

2007-08-13 Thread Daniel Phillips

On Monday 13 August 2007 02:12, Jens Axboe wrote: > > It is a system wide problem. Every block device needs throttling, > > otherwise queues expand without limit. Currently, block devices > > that use the standard request library get a slipshod form of > > throttling for free in the form of limit

Re: Block device throttling [Re: Distributed storage.]

2007-08-13 Thread Daniel Phillips

On Monday 13 August 2007 05:18, Evgeniy Polyakov wrote: > > Say you have a device mapper device with some physical device > > sitting underneath, the classic use case for this throttle code. > > Say 8,000 threads each submit an IO in parallel. The device mapper > > mapping function will be called

Re: Block device throttling [Re: Distributed storage.]

2007-08-13 Thread Daniel Phillips

On Monday 13 August 2007 05:04, Evgeniy Polyakov wrote: > On Mon, Aug 13, 2007 at 04:04:26AM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > > On Monday 13 August 2007 01:14, Evgeniy Polyakov wrote: > > > > Oops, and there is also: > > > > > > > >

Re: Distributed storage.

2007-08-13 Thread Daniel Phillips

On Monday 13 August 2007 04:03, Evgeniy Polyakov wrote: > On Mon, Aug 13, 2007 at 03:12:33AM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > > > This is not a very good solution, since it requires all users of > > > the bios to know how to free it. > > > > No

Re: Block device throttling [Re: Distributed storage.]

2007-08-13 Thread Daniel Phillips

On Monday 13 August 2007 01:23, Evgeniy Polyakov wrote: > On Sun, Aug 12, 2007 at 10:36:23PM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > > (previous incomplete message sent accidentally) > > > > On Wednesday 08 August 2007 02:54, Evgeniy Polyakov wrote: > > >

Re: Block device throttling [Re: Distributed storage.]

2007-08-13 Thread Daniel Phillips

On Monday 13 August 2007 01:14, Evgeniy Polyakov wrote: > > Oops, and there is also: > > > > 3) The bio throttle, which is supposed to prevent deadlock, can > > itself deadlock. Let me see if I can remember how it goes. > > > > * generic_make_request puts a bio in flight > > * the bio gets pas

Re: Distributed storage.

2007-08-13 Thread Daniel Phillips

On Monday 13 August 2007 03:22, Jens Axboe wrote: > I never compared the bio to struct page, I'd obviously agree that > shrinking struct page was a worthy goal and that it'd be ok to uglify > some code to do that. The same isn't true for struct bio. I thought I just said that. Regards, Daniel

Re: Distributed storage.

2007-08-13 Thread Daniel Phillips

On Monday 13 August 2007 03:06, Jens Axboe wrote: > On Mon, Aug 13 2007, Daniel Phillips wrote: > > Of course not. Nothing I said stops endio from being called in the > > usual way as well. For this to work, endio just needs to know that > > one call means "end"

Re: Distributed storage.

2007-08-13 Thread Daniel Phillips

On Monday 13 August 2007 02:18, Evgeniy Polyakov wrote: > On Mon, Aug 13, 2007 at 02:08:57AM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: > > > But that idea fails as well, since reference counts and IO > > > completion are two completely separate entities. So unl

Re: Distributed storage.

2007-08-13 Thread Daniel Phillips

On Monday 13 August 2007 02:13, Jens Axboe wrote: > On Mon, Aug 13 2007, Daniel Phillips wrote: > > On Monday 13 August 2007 00:45, Jens Axboe wrote: > > > On Mon, Aug 13 2007, Jens Axboe wrote: > > > > > You did not comment on the one about putting the bio >

Re: Distributed storage.

2007-08-13 Thread Daniel Phillips

On Monday 13 August 2007 00:45, Jens Axboe wrote: > On Mon, Aug 13 2007, Jens Axboe wrote: > > > You did not comment on the one about putting the bio destructor > > > in the ->endio handler, which looks dead simple. The majority of > > > cases just use the default endio handler and the default > >

Re: Distributed storage.

2007-08-13 Thread Daniel Phillips

On Monday 13 August 2007 00:28, Jens Axboe wrote: > On Sun, Aug 12 2007, Daniel Phillips wrote: > > Right, that is done by bi_vcnt. I meant bi_max_vecs, which you can > > derive efficiently from BIO_POOL_IDX() provided the bio was > > allocated in the standard way. > >

Re: Block device throttling [Re: Distributed storage.]

2007-08-13 Thread Daniel Phillips

On Sunday 12 August 2007 22:36, I wrote: > Note! There are two more issues I forgot to mention earlier. Oops, and there is also: 3) The bio throttle, which is supposed to prevent deadlock, can itself deadlock. Let me see if I can remember how it goes. * generic_make_request puts a bio in fl

Re: Block device throttling [Re: Distributed storage.]

2007-08-12 Thread Daniel Phillips

(previous incomplete message sent accidentally) On Wednesday 08 August 2007 02:54, Evgeniy Polyakov wrote: > On Tue, Aug 07, 2007 at 10:55:38PM +0200, Jens Axboe wrote: > > So, what did we decide? To bloat bio a bit (add a queue pointer) or > to use physical device limits? The latter requires to r

Re: Block device throttling [Re: Distributed storage.]

2007-08-12 Thread Daniel Phillips

On Wednesday 08 August 2007 02:54, Evgeniy Polyakov wrote: > On Tue, Aug 07, 2007 at 10:55:38PM +0200, Jens Axboe ([EMAIL PROTECTED]) wrote: > > So, what did we decide? To bloat bio a bit (add a queue pointer) or > to use physical device limits? The latter requires to replace all > occurrence of bi

Re: Distributed storage.

2007-08-12 Thread Daniel Phillips

On Tuesday 07 August 2007 13:55, Jens Axboe wrote: > I don't like structure bloat, but I do like nice design. Overloading > is a necessary evil sometimes, though. Even today, there isn't enough > room to hold bi_rw and bi_flags in the same variable on 32-bit archs, > so that concern can be scratche

Re: [1/1] Block device throttling [Re: Distributed storage.]

2007-08-12 Thread Daniel Phillips

Hi Evgeniy, Sorry for not getting back to you right away, I was on the road with limited email access. Incidentally, the reason my mails to you keep bouncing is, your MTA is picky about my mailer's IP reversing to a real hostname. I will take care of that pretty soon, but for now my direct m

Re: Distributed storage.

2007-08-07 Thread Daniel Phillips

On Tuesday 07 August 2007 05:05, Jens Axboe wrote: > On Sun, Aug 05 2007, Daniel Phillips wrote: > > A simple way to solve the stable accounting field issue is to add a > > new pointer to struct bio that is owned by the top level submitter > > (normally generic_make_request b

Re: Distributed storage.

2007-08-05 Thread Daniel Phillips

On Sunday 05 August 2007 08:01, Evgeniy Polyakov wrote: > On Sun, Aug 05, 2007 at 01:06:58AM -0700, Daniel Phillips wrote: > > > DST original code worked as device mapper plugin too, but its two > > > additional allocations (io and clone) per block request ended up > >

Re: Distributed storage.

2007-08-05 Thread Daniel Phillips

On Sunday 05 August 2007 08:08, Evgeniy Polyakov wrote: > If we are sleeping in memory pool, then we already do not have memory > to complete previous requests, so we are in trouble. Not at all. Any requests in flight are guaranteed to get the resources they need to complete. This is guaranteed

Re: Distributed storage.

2007-08-05 Thread Daniel Phillips

On Saturday 04 August 2007 09:44, Evgeniy Polyakov wrote: > > On Tuesday 31 July 2007 10:13, Evgeniy Polyakov wrote: > > > * storage can be formed on top of remote nodes and be > > > exported simultaneously (iSCSI is peer-to-peer only, NBD requires > > > device mapper and is synchronous) > > >

Re: Distributed storage.

2007-08-05 Thread Daniel Phillips

On Saturday 04 August 2007 09:37, Evgeniy Polyakov wrote: > On Fri, Aug 03, 2007 at 06:19:16PM -0700, I wrote: > > To be sure, I am not very proud of this throttling mechanism for > > various reasons, but the thing is, _any_ throttling mechanism no > > matter how sucky solves the deadlock problem.

Re: Distributed storage.

2007-08-03 Thread Daniel Phillips

On Friday 03 August 2007 03:26, Evgeniy Polyakov wrote: > On Thu, Aug 02, 2007 at 02:08:24PM -0700, I wrote: > > I see bits that worry me, e.g.: > > > > + req = mempool_alloc(st->w->req_pool, GFP_NOIO); > > > > which seems to be callable in response to a local request, just the > > case w

Re: Distributed storage.

2007-08-03 Thread Daniel Phillips

Hi Mike, On Thursday 02 August 2007 21:09, Mike Snitzer wrote: > But NBD's synchronous nature is actually an asset when coupled with > MD raid1 as it provides guarantees that the data has _really_ been > mirrored remotely. And bio completion doesn't? Regards, Daniel

Re: Distributed storage.

2007-08-03 Thread Daniel Phillips

Hi Evgeniy, Nit alert: On Tuesday 31 July 2007 10:13, Evgeniy Polyakov wrote: > * storage can be formed on top of remote nodes and be exported > simultaneously (iSCSI is peer-to-peer only, NBD requires device > mapper and is synchronous) In fact, NBD has nothing to do with device

Re: Distributed storage.

2007-08-03 Thread Daniel Phillips

On Friday 03 August 2007 07:53, Peter Zijlstra wrote: > On Fri, 2007-08-03 at 17:49 +0400, Evgeniy Polyakov wrote: > > On Fri, Aug 03, 2007 at 02:27:52PM +0200, Peter Zijlstra wrote: > > ...my main position is to > > allocate per socket reserve from socket's queue, and copy data > > there from main

Re: Distributed storage.

2007-08-03 Thread Daniel Phillips

On Friday 03 August 2007 06:49, Evgeniy Polyakov wrote: > ...rx has global reserve (always allocated on > startup or sometime way before reclaim/oom) where data is originally > received (including skb, shared info and whatever is needed, page is > just an example), then it is copied into per-socket

Re: Distributed storage.

2007-08-02 Thread Daniel Phillips

On Tuesday 31 July 2007 10:13, Evgeniy Polyakov wrote: > Hi. > > I'm pleased to announce first release of the distributed storage > subsystem, which allows to form a storage on top of remote and local > nodes, which in turn can be exported to another storage as a node to > form tree-like storages.

Re: Network receive stall avoidance (was [PATCH 2/9] deadlock prevention core)

2006-08-18 Thread Daniel Phillips

Andrew Morton wrote: handwaving - The mmap(MAP_SHARED)-the-whole-world scenario should be fixed by mm-tracking-shared-dirty-pages.patch. Please test it and if you are still able to demonstrate deadlocks, describe how, and why they are occurring. OK, but please see "atomic 0 order alloc

Re: [RFC][PATCH 2/9] deadlock prevention core

2006-08-18 Thread Daniel Phillips

Andrew Morton wrote: ...in my earlier emails I asked a number of questions regarding whether existing facilities, queued patches or further slight kernel changes could provide a sufficient solution to these problems. The answer may well be "no". But diligence requires that we be able to prove t

Re: [RFC][PATCH 2/9] deadlock prevention core

2006-08-18 Thread Daniel Phillips

Andrew Morton wrote: Daniel Phillips wrote: Andrew Morton wrote: ...it's runtime configurable. So we default to "less than the best" because we are too lazy to fix the network starvation issue properly? Maybe we don't really need a mempool for struct bio either, isn'

Re: [RFC][PATCH 2/9] deadlock prevention core

2006-08-17 Thread Daniel Phillips

Daniel Phillips wrote: Andrew Morton wrote: Processes which are dirtying those pages throttle at /proc/sys/vm/dirty_ratio% of memory dirty. So it is not possible to "fill" memory with dirty pages. If the amount of physical memory which is dirty exceeds 40%: bug. So we make 400 MB

Re: [RFC][PATCH 2/9] deadlock prevention core

2006-08-17 Thread Daniel Phillips

Andrew Morton wrote: Daniel Phillips <[EMAIL PROTECTED]> wrote: What happened to the case where we just fill memory full of dirty file pages backed by a remote disk? Processes which are dirtying those pages throttle at /proc/sys/vm/dirty_ratio% of memory dirty. So it is not possible to

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-17 Thread Daniel Phillips

Evgeniy Polyakov wrote: On Thu, Aug 17, 2006 at 09:15:14PM +0200, Peter Zijlstra ([EMAIL PROTECTED]) wrote: I got openssh as example of situation when system does not know in advance, what sockets must be marked as critical. OpenSSH works with network and unix sockets in parallel, so you need

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-17 Thread Daniel Phillips

Evgeniy Polyakov wrote: Just for clarification - it will be completely impossible to login using openssh or some other privilege separation protocol to the machine due to the nature of unix sockets. So you will be unable to manage your storage system just because it is in OOM - it is not what i

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-16 Thread Daniel Phillips

Evgeniy Polyakov wrote: On Mon, Aug 14, 2006 at 08:45:43AM +0200, Peter Zijlstra ([EMAIL PROTECTED]) wrote: Just pure openssh for control connection (admin should be able to login). These periods of degenerated functionality should be short and infrequent albeit critical for machine recovery.

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-16 Thread Daniel Phillips

Evgeniy Polyakov wrote: On Sun, Aug 13, 2006 at 01:16:15PM -0700, Daniel Phillips ([EMAIL PROTECTED]) wrote: Indeed. The rest of the corner cases like netfilter, layered protocol and so on need to be handled, however they do not need to be handled right now in order to make remote storage on

Re: [RFC][PATCH 2/9] deadlock prevention core

2006-08-16 Thread Daniel Phillips

Andrew Morton wrote: What is a "socket wait queue" and how/why can it consume so much memory? Two things: 1) sk_buffs in flight between device receive interrupt and layer 3 protocol/socket identification. 2) sk_buffs queued onto a particular socket waiting for some task to come

Re: [RFC][PATCH 2/9] deadlock prevention core

2006-08-16 Thread Daniel Phillips

Andrew Morton wrote: Peter Zijlstra <[EMAIL PROTECTED]> wrote: Testcase: Mount an NBD device as sole swap device and mmap > physical RAM, then loop through touching pages only once. Fix: don't try to swap over the network. Yes, there may be some scenarios where people have no local storage,

Re: [RFC][PATCH 2/9] deadlock prevention core

2006-08-13 Thread Daniel Phillips

David Miller wrote: I think there is more profitability from a solution that really does something about "network memory", and doesn't try to say "these devices are special" or "these sockets are special". Special cases generally suck. We already limit and control TCP socket memory globally in

Re: [RFC][PATCH 2/9] deadlock prevention core

2006-08-13 Thread Daniel Phillips

David Miller wrote: From: Daniel Phillips <[EMAIL PROTECTED]> David Miller wrote: The reason is that there is no refcounting performed on these devices when they are attached to the skb, for performance reasons, and thus the device can be downed, the module for it removed, etc. long

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-13 Thread Daniel Phillips

Evgeniy Polyakov wrote: One must receive a packet to determine if that packet must be dropped until tricky hardware with header split capabilities or MMIO copying is used. Peter uses special pool to get data from when system is in OOM (at least in his latest patchset), so allocations are separate

Re: [RFC][PATCH 0/4] VM deadlock prevention -v4

2006-08-13 Thread Daniel Phillips

Peter Zijlstra wrote: On Sat, 2006-08-12 at 20:16 +0200, Indan Zupancic wrote: What was missing or wrong in the old approach? Can't you use the new approach, but use alloc_pages() instead of SROG? Sorry if I bug you so, but I'm also trying to increase my knowledge here. ;-) I'm almost sorry I

Re: rename *MEMALLOC flags

2006-08-13 Thread Daniel Phillips

Peter Zijlstra wrote: Jeff Garzik in his infinite wisdom spake thusly: Peter Zijlstra wrote: Index: linux-2.6/include/linux/gfp.h === --- linux-2.6.orig/include/linux/gfp.h 2006-08-12 12:56:06.0 +0200 +++ linux-2.6/includ

Re: [RFC][PATCH 2/9] deadlock prevention core

2006-08-13 Thread Daniel Phillips

Rik van Riel wrote: Thomas Graf wrote: skb->dev is not guaranteed to still point to the "allocating" device once the skb is freed again so reserve/unreserve isn't symmetric. You'd need skb->alloc_dev or something. There's another consequence of this property of the network stack. Every networ

Re: [RFC][PATCH 2/9] deadlock prevention core

2006-08-13 Thread Daniel Phillips

David Miller wrote: From: Peter Zijlstra <[EMAIL PROTECTED]> Hmm, what does sk_buff::input_dev do? That seems to store the initial device? You can run grep on the tree just as easily as I can which is what I did to answer this question. It only takes a few seconds of your time to grep the sou

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-13 Thread Daniel Phillips

Peter Zijlstra wrote: On Wed, 2006-08-09 at 16:54 -0700, David Miller wrote: People are doing I/O over IP exactly for its ubiquity and flexibility. It seems a major limitation of the design if you cancel out major components of this flexibility. We're not, that was a bit of my own frustratio

Re: [RFC][PATCH 8/9] 3c59x driver conversion

2006-08-13 Thread Daniel Phillips

David Miller wrote: I think he's saying that he doesn't think your code is yet a reasonable way to solve the problem, and therefore doesn't belong upstream. That is why it has not yet been submitted upstream. Respectfully, I do not think that jgarzik has yet put in the work to know if this ant

Re: [RFC][PATCH 0/9] Network receive deadlock prevention for NBD

2006-08-08 Thread Daniel Phillips

Evgeniy Polyakov wrote: On Tue, Aug 08, 2006 at 09:33:25PM +0200, Peter Zijlstra ([EMAIL PROTECTED]) wrote: http://lwn.net/Articles/144273/ "Kernel Summit 2005: Convergence of network and storage paths" We believe that an approach very much like today's patch set is necessary for NBD, iSC

Re: [RFC][PATCH 8/9] 3c59x driver conversion

2006-08-08 Thread Daniel Phillips

Jeff Garzik wrote: Peter Zijlstra wrote: Update the driver to make use of the netdev_alloc_skb() API and the NETIF_F_MEMALLOC feature. NETIF_F_MEMALLOC does not exist in the upstream tree... nor should it, IMO. Elaborate please. Do you think that all drivers should be updated to fix the b

Re: [RFC][PATCH 2/9] deadlock prevention core

2006-08-08 Thread Daniel Phillips

David Miller wrote: From: Daniel Phillips <[EMAIL PROTECTED]> >>Can you please characterize the conditions under which skb->dev changes after the alloc? Are there writings on this subtlety? The packet scheduler and classifier can redirect packets to different devices, and ca

Re: [RFC][PATCH 2/9] deadlock prevention core

2006-08-08 Thread Daniel Phillips

David Miller wrote: From: Daniel Phillips <[EMAIL PROTECTED]> David Miller wrote: I think the new atomic operation that will seemingly occur on every device SKB free is unacceptable. Alternate suggestion? Sorry, I have none. But you're unlikely to get your changes considere

Re: [RFC][PATCH 2/9] deadlock prevention core

2006-08-08 Thread Daniel Phillips

Hi Dave, David Miller wrote: I think the new atomic operation that will seemingly occur on every device SKB free is unacceptable. Alternate suggestion? You also cannot modify netdev->flags in the lockless manner in which you do, it must be done with the appropriate locking, such as holding t

Re: [RFC][PATCH 2/9] deadlock prevention core

2006-08-08 Thread Daniel Phillips

Thomas Graf wrote: > skb->dev is not guaranteed to still point to the "allocating" device once the skb is freed again so reserve/unreserve isn't symmetric. You'd need skb->alloc_dev or something. Can you please characterize the conditions under which skb->dev changes after the alloc? Are ther

Re: [RFC][PATCH 2/9] deadlock prevention core

2006-08-08 Thread Daniel Phillips

Stephen Hemminger wrote: How much of this is just building special case support for large allocations for jumbo frames? Wouldn't it make more sense to just fix those drivers to do scatter and add the support hooks for that? Short answer: none of it is. If it happens to handle jumbo frames nice

Re: [RFC][PATCH 2/9] deadlock prevention core

2006-08-08 Thread Daniel Phillips

Indan Zupancic wrote: Hello, Saw the patch on lkml, and wondered about some things. On Tue, August 8, 2006 21:33, Peter Zijlstra said: +static inline void dev_unreserve_skb(struct net_device *dev) +{ + if (atomic_dec_return(&dev->rx_reserve_used) < 0) + atomic_inc(&dev->rx_r
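The quoted fragment of dev_unreserve_skb() decrements rx_reserve_used and restores it if the result went negative. A minimal user-space sketch of that decrement-then-restore pattern follows, using C11 atomics in place of the kernel's atomic_t. The struct dev_model type and the dev_unreserve() name are hypothetical stand-ins, not code from the patch, and the sketch says nothing about what the truncated message goes on to ask.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for struct net_device: only the reserve counter. */
struct dev_model {
    atomic_int rx_reserve_used;
};

/*
 * Decrement the in-use count, undoing the decrement if it went below zero.
 * atomic_fetch_sub() returns the old value, so "old - 1" plays the role of
 * the kernel's atomic_dec_return().  Between the two operations another
 * thread can briefly observe a negative count.
 */
static void dev_unreserve(struct dev_model *dev)
{
    if (atomic_fetch_sub(&dev->rx_reserve_used, 1) - 1 < 0)
        atomic_fetch_add(&dev->rx_reserve_used, 1);
}
```

Called on a counter that is already zero, the function leaves it at zero once both operations have completed.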

Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-14 Thread Daniel Phillips

Hi Harald, You wrote: On Tue, Jun 13, 2006 at 02:12:41PM -0700, I wrote: This has the makings of a nice stable internal kernel api. Why do we want to provide this nice stable internal api to proprietary modules? because there is IMHO legally nothing we can do about it anyway. Speaking as

Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-13 Thread Daniel Phillips

Chase Venters wrote: can you name some non-GPL non-proprietary modules we should be concerned about? You probably meant "non-GPL-compatible non-proprietary". If so, then by definition there are none. Regards, Daniel

Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-13 Thread Daniel Phillips

Brian F. G. Bidulock wrote: Stephen, On Tue, 13 Jun 2006, Stephen Hemminger wrote: @@ -2176,3 +2279,13 @@ EXPORT_SYMBOL(sock_wake_async); EXPORT_SYMBOL(sockfd_lookup); EXPORT_SYMBOL(kernel_sendmsg); EXPORT_SYMBOL(kernel_recvmsg); +EXPORT_SYMBOL(kernel_bind); +EXPORT_SYMBOL(kernel_listen); +EX

Re: [stable] [NET] Fix zero-size datagram reception

2005-11-08 Thread Daniel Phillips

On Tuesday 08 November 2005 10:13, Greg KH wrote: > On Thu, Nov 03, 2005 at 07:55:38AM +1100, Herbert Xu wrote: > > The recent rewrite of skb_copy_datagram_iovec broke the reception of > > zero-size datagrams. This patch fixes it. > > > > Signed-off-by: Herbert Xu <[EMAIL PROTECTED]> > > > > Pleas

[TESTME][PATCH] Make skb_copy_datagram_iovec nonrecursive (really revised)

2005-08-25 Thread Daniel Phillips

Gah, this time the revised patch is included, not just the diffstat. datagram.c | 82 +++-- 1 files changed, 26 insertions(+), 56 deletions(-) diff -up --recursive 2.6.12.3.clean/net/core/datagram.c 2.6.12.3/net/core/datagram.c --- 2.6.1

Re: [TESTME][PATCH] Make skb_copy_datagram_iovec nonrecursive

2005-08-25 Thread Daniel Phillips

On Thursday 25 August 2005 03:30, David S. Miller wrote: > From: Daniel Phillips <[EMAIL PROTECTED]> > > As far as I can see, it is illegal for any but the first skb to have > > a non-null skb_shinfo(skb)->frag_list, is this correct? > > As currently used, yes. T

[TESTME][PATCH] Make skb_copy_datagram_iovec nonrecursive (revised)

2005-08-25 Thread Daniel Phillips

The fragment list handling was wrong in the previous version, now correct I think. datagram.c | 82 +++-- 1 files changed, 26 insertions(+), 56 deletions(-) diff -up --recursive 2.6.12.3.clean/net/core/datagram.c 2.6.12.3/net/core/datagra

Re: [TESTME][PATCH] Make skb_copy_datagram_iovec nonrecursive

2005-08-25 Thread Daniel Phillips

On Thursday 25 August 2005 02:44, David S. Miller wrote: > Frag lists cannot be deeper than one level of nesting, > and I think the recursive version is easier to understand, > so I really don't see the value of your change. Losing 34 lines of a 74 line function is the value. The real problem wit

[TESTME][PATCH] Make skb_copy_datagram_iovec nonrecursive

2005-08-24 Thread Daniel Phillips

Hi, I noticed that skb_copy_datagram_iovec calls itself recursively to copy a fragment list. This isn't actually wrong or even inefficient, it is just somehow disturbing. Oh, and it uses an extra stack frame, and is hard to read. Once I got started straightening that out, I couldn't resist clea

Re: Bridge MTU

2005-08-17 Thread Daniel Phillips

On Thursday 18 August 2005 07:02, John Heffner wrote: > 1) Bridging occurs at the link layer (ethernet); fragmenting occurs at the > network layer (IP). > > 2) A lot of protocols set the Don't Fragment bit, so you can't always > fragment anyway. > > What you might want to do is set it up as an IP r

Re: [RFC] Net vm deadlock fix, version 6

2005-08-11 Thread Daniel Phillips

Ahem: + } adjust_memalloc_reserve(-netdev->memalloc_pages); - } Regards, Daniel

[RFC] Net vm deadlock fix, version 6

2005-08-11 Thread Daniel Phillips

Hi, This version corrects a couple of bugs previously noted and ties up some loose ends in the e1000 driver. Some versions of this driver support packet splitting into multiple pages, with just the protocol header in the skb itself. This is a very good thing because it avoids the high order page

Re: [RFC] Net vm deadlock fix, version 5

2005-08-08 Thread Daniel Phillips

Hi, A couple of goofs. First, the sysctl interface to min_free_kbytes could stomp on any in-kernel adjustments. Now there are two variables, summed in setup_per_zone_pages_min: min_free_kbytes and var_free_kbytes. The adjust_memalloc_reserve operates only the latter, so the user can freely twid
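The two-variable arrangement described above can be modelled in user space roughly as follows. This is only a sketch: the names min_free_kbytes, var_free_kbytes, setup_per_zone_pages_min and adjust_memalloc_reserve come from the message, but the function bodies, the units, the pages_min_kbytes variable, the sysctl_set_min_free_kbytes() helper and the refusal to go negative are all assumptions made for illustration.

```c
/* Hypothetical model; bodies and units are assumed, not taken from the patch. */
static int min_free_kbytes = 1024; /* owned by the user via sysctl */
static int var_free_kbytes;        /* owned by in-kernel callers */
static int pages_min_kbytes;       /* effective reserve (assumed name) */

/* Recompute the effective reserve as the sum of both contributions. */
static void setup_per_zone_pages_min(void)
{
    pages_min_kbytes = min_free_kbytes + var_free_kbytes;
}

/* In-kernel adjustment: touches only var_free_kbytes. */
static int adjust_memalloc_reserve(int kbytes)
{
    if (var_free_kbytes + kbytes < 0)
        return -1; /* assumed policy: never release more than was added */
    var_free_kbytes += kbytes;
    setup_per_zone_pages_min();
    return 0;
}

/* Assumed stand-in for the sysctl write path: touches only min_free_kbytes. */
static void sysctl_set_min_free_kbytes(int kbytes)
{
    min_free_kbytes = kbytes;
    setup_per_zone_pages_min();
}
```

Because each writer owns its own variable, a sysctl write after a driver has raised the reserve no longer discards the driver's contribution; the sum is simply recomputed.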

[RFC] Net vm deadlock fix, version 5

2005-08-08 Thread Daniel Phillips

Hi, This version introduces the idea of having a network driver adjust the global memalloc reserve when it brings an interface up or down. The interface is: int adjust_memalloc_reserve(int bytes) which is just a thin shell over the min_free_kbytes interface that already exists. The d

[RFC] Net vm deadlock fix, version 4

2005-08-06 Thread Daniel Phillips

Hi, This patch fills in some missing pieces: * Support v4 udp: same as v4 tcp, when in reserve, drop packets on noncritical sockets * Support v4 icmp: when in reserve, drop icmp traffic * Add reserve skb support to e1000 driver * API for dropping packets before delivery (dev_d

Re: kfree_skb questions

2005-08-06 Thread Daniel Phillips

On Sunday 07 August 2005 06:26, Patrick McHardy wrote: > > Anyway, do we not want BUG_ON(!atomic_read(&skb->users)) at the beginning > > of kfree_skb, since we rely on it? > > Why do you care if skb->users is 0 or 1 in __kfree_skb()? Because I am a neatness freak and I like to check things that in

kfree_skb questions

2005-08-06 Thread Daniel Phillips

Hi, The way I read this, __kfree_skb will sometimes be called with ->users = 1 and sometimes with ->users = 0, is that right? static inline void kfree_skb(struct sk_buff *skb) { if (likely(atomic_read(&skb->users) == 1)) smp_rmb(); else if (likely(!atomic_dec_an
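The question can be illustrated with a small user-space model of the same reference-drop shape, written with C11 atomics. Everything here is a hypothetical stand-in (struct buf, buf_free_raw(), buf_put()); it mirrors only the visible part of the quoted kfree_skb() and assumes the truncated remainder returns early when other holders remain. The acquire fence is a rough substitute for smp_rmb(), not an exact equivalent.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sk_buff: just the reference count. */
struct buf {
    atomic_int users;
    int users_seen_at_free; /* count observed on entry to buf_free_raw() */
    bool freed;
};

/* Stand-in for __kfree_skb(): records the count it was called with. */
static void buf_free_raw(struct buf *b)
{
    b->users_seen_at_free = atomic_load(&b->users);
    b->freed = true;
}

/* Models the shape of kfree_skb() as quoted in the message. */
static void buf_put(struct buf *b)
{
    if (atomic_load(&b->users) == 1) {
        /* Fast path: sole owner, no decrement, count stays at 1. */
        atomic_thread_fence(memory_order_acquire);
    } else if (atomic_fetch_sub(&b->users, 1) != 1) {
        return; /* other holders remain (assumed continuation) */
    }
    /* Reached with users == 1 (fast path) or users == 0 (decrement path). */
    buf_free_raw(b);
}
```

In a single thread the free always happens on the fast path with the count still at 1. The count is 0 on entry only when two holders race, both read 2, and the second decrement brings it to zero.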

Re: test

2005-08-06 Thread Daniel Phillips

On Saturday 06 August 2005 18:40, David S. Miller wrote: > From: Daniel Phillips <[EMAIL PROTECTED]> > Date: Sat, 6 Aug 2005 04:52:07 +1000 > > > So then there is no choice but to throttle the per-cpu ->input_pkt > > queues. > > Make the driver support NAPI if

Re: [RFC] Net vm deadlock fix (take two)

2005-08-06 Thread Daniel Phillips

On Sunday 07 August 2005 02:07, Jeff Garzik wrote: > > +static inline struct sk_buff *__dev_memalloc_skb(struct net_device *dev, > > + unsigned length, int gfp_mask) > > +{ > > + struct sk_buff *skb = __dev_alloc_skb(length, gfp_mask); > > + if (skb) > > + goto done; > > + if (dev

Re: [PATCH] netpoll can lock up on low memory.

2005-08-06 Thread Daniel Phillips

On Saturday 06 August 2005 12:32, Steven Rostedt wrote: > > > If you need to really get the data out, then the design should be > > > changed. Have some return value showing the failure, check for > > > oops_in_progress or whatever, and try again after turning interrupts > > > back on, and getting

[RFC] Net vm deadlock fix (take two)

2005-08-06 Thread Daniel Phillips

Hi, This version does not do blatantly stupid things in hardware irq context, is more efficient, and... wow the patch is smaller! (That never happens.) I don't mark skbs as being allocated from reserve any more. That works, but it is slightly bogus, because it doesn't matter which skb came from

Re: test

2005-08-06 Thread Daniel Phillips

On Sunday 07 August 2005 03:54, Daniel Phillips wrote: > On Saturday 06 August 2005 18:40, David S. Miller wrote: > > From: Daniel Phillips <[EMAIL PROTECTED]> > > Date: Sat, 6 Aug 2005 04:52:07 +1000 > > > > > So then there is no choice but to throttle

Re: lockups with netconsole on e1000 on media insertion

2005-08-05 Thread Daniel Phillips

On Saturday 06 August 2005 11:22, Matt Mackall wrote: > On Sat, Aug 06, 2005 at 01:51:22AM +0200, Andi Kleen wrote: > > > But why are we in a hurry to dump the backlog on the floor? Why are we > > > worrying about the performance of netpoll without the cable plugged in > > > at all? We shouldn't be

test

2005-08-05 Thread Daniel Phillips

On Saturday 06 August 2005 02:33, David S. Miller wrote: > You can't call into the networking packet input path from > hardware interrupt context, it simply is not allowed. > > And that's the context in which netif_rx() gets called. Duh. I assumed we already were in softirq context here (but with

Re: Bypass softnet

2005-08-05 Thread Daniel Phillips

On Saturday 06 August 2005 02:33, David S. Miller wrote: > You can't call into the networking packet input path from > hardware interrupt context, it simply is not allowed. > > And that's the context in which netif_rx() gets called. Duh. I assumed we already were in softirq context here (but with

Re: argh... ;/

2005-08-05 Thread Daniel Phillips

On Saturday 06 August 2005 03:49, Dave Jones wrote: > On Fri, Aug 05, 2005 at 01:20:59PM -0400, John W. Linville wrote: > > On Sat, Aug 06, 2005 at 02:41:30AM +1000, Daniel Phillips wrote: > > > On Friday 05 August 2005 13:04, Mateusz Berezecki wrote: > > > > I acc

Re: argh... ;/

2005-08-05 Thread Daniel Phillips

On Friday 05 August 2005 13:04, Mateusz Berezecki wrote: > I accidentally posted the patches as MIME attachments... it's 5:03 am here > already. Sorry guys. > I can resubmit if you want. I just don't want to do that now and not trash > your mailboxes Does anybody still care if patches are posted as atta

Bypass softnet

2005-08-05 Thread Daniel Phillips

Hi, OK, I am still a network klutz. The attached patch changes netif_rx to call netif_receive_skb directly instead of going through softnet. It works with my e1000 here, but eventually oopses under moderate load. I see that a few drivers use netif_receive_skb directly, sometimes together wit

Re: [RFC] Net vm deadlock fix (preliminary)

2005-08-04 Thread Daniel Phillips

Whoops: - return __dev_alloc_skb(length, gfp_mask | __GFP_MEMALLOC); + return __dev_alloc_skb(length, GFP_ATOMIC|__GFP_MEMALLOC); Regards, Daniel - To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to [EMAIL PROTECTED] More majordomo info at

Re: [RFC] Net vm deadlock fix (preliminary)

2005-08-04 Thread Daniel Phillips

Hi, I spent the last day mulling things over and doing research. It seems to me that the patch as first posted is correct and solves the deadlock, except that some uses of __GFP_MEMALLOC in __dev_alloc_skb may escape into contexts where the reserve is not guaranteed to be reclaimed. It may be


