Re: [PATCH v2] docs: proc.rst: meminfo: briefly describe gaps in memory accounting

2021-04-20 Thread Michal Hocko
On Tue 20-04-21 15:57:08, Michal Hocko wrote: [...] > Usual memory consumption is usually something like LRU pages + Slab > memory + kernel stack + vmalloc used + pcp. > > > But I know that KernelStack is allocated through vmalloc these days, > > and I don't know wh

Re: [PATCH v2] docs: proc.rst: meminfo: briefly describe gaps in memory accounting

2021-04-20 Thread Michal Hocko
Similarly, is Mlocked a subset of Unevictable? > > There is some attempt at explaining how these numbers fit together, but > it's outdated, and doesn't include Mlocked, Unevictable or KernelStack Agreed there is a lot of tribal knowledge or even misconceptions flying around and it will take much more work to put everything into shape. This is only one tiny step forward. -- Michal Hocko SUSE Labs

Re: [PATCH v2] docs: proc.rst: meminfo: briefly describe gaps in memory accounting

2021-04-20 Thread Michal Hocko
usage. > > > > Signed-off-by: Mike Rapoport > > Ooops, forgot to add Michal's Ack, sorry. Let's make it more explicit Acked-by: Michal Hocko Thanks! -- Michal Hocko SUSE Labs

Re: [mm, net-next v2] mm: net: memcg accounting for TCP rx zerocopy

2021-03-25 Thread Michal Hocko
On Thu 25-03-21 12:47:04, Johannes Weiner wrote: > On Thu, Mar 25, 2021 at 10:02:28AM +0100, Michal Hocko wrote: > > On Wed 24-03-21 15:49:15, Arjun Roy wrote: > > > On Wed, Mar 24, 2021 at 2:24 PM Johannes Weiner > > > wrote: > > > > > > > >

Re: [mm, net-next v2] mm: net: memcg accounting for TCP rx zerocopy

2021-03-25 Thread Michal Hocko
On Wed 24-03-21 15:49:15, Arjun Roy wrote: > On Wed, Mar 24, 2021 at 2:24 PM Johannes Weiner wrote: > > > > On Wed, Mar 24, 2021 at 10:12:46AM +0100, Michal Hocko wrote: > > > On Tue 23-03-21 11:47:54, Arjun Roy wrote: > > > > On Tue, Mar 23,

Re: [mm, net-next v2] mm: net: memcg accounting for TCP rx zerocopy

2021-03-24 Thread Michal Hocko
re any reason why other subsystems outside of networking couldn't claim their own callback? -- Michal Hocko SUSE Labs

Re: [mm, net-next v2] mm: net: memcg accounting for TCP rx zerocopy

2021-03-24 Thread Michal Hocko
On Tue 23-03-21 11:47:54, Arjun Roy wrote: > On Tue, Mar 23, 2021 at 7:34 AM Michal Hocko wrote: > > > > On Wed 17-03-21 18:12:55, Johannes Weiner wrote: > > [...] > > > Here is an idea of how it could work: > > > > > > struct pag

Re: [mm, net-next v2] mm: net: memcg accounting for TCP rx zerocopy

2021-03-23 Thread Michal Hocko
always valid when a page is freed this would be really a nice and useful abstraction because you wouldn't have to care about the specific type of page. But maybe I am just overlooking the real complexity there. -- Michal Hocko SUSE Labs

Re: [PATCH] mm/memcontrol: Add the drop_cache interface for cgroup v2

2020-09-22 Thread Michal Hocko
On Tue 22-09-20 16:06:31, Yafang Shao wrote: > On Tue, Sep 22, 2020 at 3:27 PM Michal Hocko wrote: [...] > > What is the latency triggered by the memory reclaim? It should be mostly > > a clean page cache right as drop_caches only drops clean pages. Or is > > this more ab

Re: [PATCH] mm/memcontrol: Add the drop_cache interface for cgroup v2

2020-09-22 Thread Michal Hocko
On Tue 22-09-20 12:20:52, Yafang Shao wrote: > On Mon, Sep 21, 2020 at 7:36 PM Michal Hocko wrote: > > > > On Mon 21-09-20 19:23:01, Yafang Shao wrote: > > > On Mon, Sep 21, 2020 at 7:05 PM Michal Hocko wrote: > > > > > > > > On Mon 21-09-20 18:55

Re: [PATCH] mm/memcontrol: Add the drop_cache interface for cgroup v2

2020-09-21 Thread Michal Hocko
On Mon 21-09-20 19:23:01, Yafang Shao wrote: > On Mon, Sep 21, 2020 at 7:05 PM Michal Hocko wrote: > > > > On Mon 21-09-20 18:55:40, Yafang Shao wrote: > > > On Mon, Sep 21, 2020 at 4:12 PM Michal Hocko wrote: > > > > > > > > On Mon 21

Re: [PATCH] mm/memcontrol: Add the drop_cache interface for cgroup v2

2020-09-21 Thread Michal Hocko
On Mon 21-09-20 18:55:40, Yafang Shao wrote: > On Mon, Sep 21, 2020 at 4:12 PM Michal Hocko wrote: > > > > On Mon 21-09-20 16:02:55, zangchun...@bytedance.com wrote: > > > From: Chunxin Zang > > > > > > In the cgroup v1, we have 'force_empty' i

Re: [PATCH] mm/memcontrol: Add the drop_cache interface for cgroup v2

2020-09-21 Thread Michal Hocko
ory_max_write, > }, > { > + .name = "drop_cache", > + .flags = CFTYPE_NOT_ON_ROOT, > + .write = mem_cgroup_force_empty_write, > + }, > + { > .name = "events", > .flags = CFTYPE_NOT_ON_ROOT, > .file_offset = offsetof(struct mem_cgroup, events_file), > -- > 2.11.0 -- Michal Hocko SUSE Labs

Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-17 Thread Michal Hocko
On Wed 17-06-20 05:23:21, Matthew Wilcox wrote: > On Wed, Jun 17, 2020 at 01:31:57PM +0200, Michal Hocko wrote: > > On Wed 17-06-20 04:08:20, Matthew Wilcox wrote: > > > If you call vfree() under > > > a spinlock, you're in trouble. in_atomic() only knows

Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-17 Thread Michal Hocko
On Wed 17-06-20 04:08:20, Matthew Wilcox wrote: > On Wed, Jun 17, 2020 at 09:12:12AM +0200, Michal Hocko wrote: > > On Tue 16-06-20 17:37:11, Matthew Wilcox wrote: > > > Not just performance critical, but correctness critical. Since kvfree() > > > may allocate from the

Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-17 Thread Michal Hocko
locate from the vmalloc allocator, I really think that kvfree() > should assert that it's !in_atomic(). Otherwise we can get into trouble > if we end up calling vfree() and have to take the mutex. FWIW __vfree already checks for atomic context and puts the work into a deferred context. So this should be safe. It should be used as a last resort, though. -- Michal Hocko SUSE Labs

Re: [PATCH v4 1/3] mm/slab: Use memzero_explicit() in kzfree()

2020-06-15 Thread Michal Hocko
"slab: introduce kzfree()") > Cc: sta...@vger.kernel.org > Signed-off-by: Waiman Long Acked-by: Michal Hocko Although I am not really sure this is stable material. Is there any known instance where the memset was optimized out from kzfree? > --- > mm/slab_common.c | 2

Re: [PATCH v3 bpf] bpf: Try harder when allocating memory for large maps

2019-03-18 Thread Michal Hocko
not be created due to vmalloc unable to allocate 75497472B, > when the system's memory consumption (in MB) was the following: > > Total: 3942 Used: 837 (21.24%) Free: 138 Buffers: 239 Cached: 2727 > > Later analysis [1] by Michal Hocko showed that the vmalloc was not tryin

Re: [PATCH] bpf: Try harder when allocating memory for large maps

2019-03-11 Thread Michal Hocko
not be created due to vmalloc unable to allocate 75497472B, > when the system's memory consumption (in MB) was the following: > > Total: 3942 Used: 837 (21.24%) Free: 138 Buffers: 239 Cached: 2727 > > Later analysis [1] by Michal Hocko showed that the vmalloc was not tryin

Re: general protection fault in watchdog

2018-12-14 Thread Michal Hocko
On Fri 14-12-18 15:31:44, Dmitry Vyukov wrote: > On Fri, Dec 14, 2018 at 2:54 PM Michal Hocko wrote: > > > > On Fri 14-12-18 14:42:33, Dmitry Vyukov wrote: > > > On Fri, Dec 14, 2018 at 2:28 PM Michal Hocko wrote: > > > > > > > > On Fri 14-12-18

Re: [dm-devel] [PATCH v5] fault-injection: introduce kvmalloc fallback options

2018-04-27 Thread Michal Hocko
r, so there is a risk of harm IIUC and this is not much different than other fault injecting paths. -- Michal Hocko SUSE Labs

Re: [dm-devel] [PATCH v5] fault-injection: introduce kvmalloc fallback options

2018-04-26 Thread Michal Hocko
't provided any argument that would explain why the kernel package cannot add a boot option. Maybe there are some but I do not see them right now. -- Michal Hocko SUSE Labs

Re: [PATCH v3] kvmalloc: always use vmalloc if CONFIG_DEBUG_SG

2018-04-24 Thread Michal Hocko
On Tue 24-04-18 13:28:49, Mikulas Patocka wrote: > > > On Tue, 24 Apr 2018, Michal Hocko wrote: > > > On Tue 24-04-18 13:00:11, Mikulas Patocka wrote: > > > > > > > > > On Tue, 24 Apr 2018, Michal Hocko wrote: > > > >

Re: [PATCH v3] kvmalloc: always use vmalloc if CONFIG_DEBUG_SG

2018-04-24 Thread Michal Hocko
On Tue 24-04-18 13:00:11, Mikulas Patocka wrote: > > > On Tue, 24 Apr 2018, Michal Hocko wrote: > > > On Tue 24-04-18 11:50:30, Mikulas Patocka wrote: > > > > > > > > > On Tue, 24 Apr 2018, Michal Hocko wrote: > > > >

Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM

2018-04-24 Thread Michal Hocko
On Tue 24-04-18 10:12:42, Michal Hocko wrote: > On Tue 24-04-18 11:30:40, Mikulas Patocka wrote: > > > > > > On Tue, 24 Apr 2018, Michal Hocko wrote: > > > > > On Mon 23-04-18 20:25:15, Mikulas Patocka wrote: > > > > > > > Fixi

Re: [PATCH v3] kvmalloc: always use vmalloc if CONFIG_DEBUG_SG

2018-04-24 Thread Michal Hocko
On Tue 24-04-18 11:50:30, Mikulas Patocka wrote: > > > On Tue, 24 Apr 2018, Michal Hocko wrote: > > > On Mon 23-04-18 20:06:16, Mikulas Patocka wrote: > > [...] > > > @@ -404,6 +405,12 @@ void *kvmalloc_node(size_t size, gfp_t f > > >*

Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM

2018-04-24 Thread Michal Hocko
On Tue 24-04-18 11:30:40, Mikulas Patocka wrote: > > > On Tue, 24 Apr 2018, Michal Hocko wrote: > > > On Mon 23-04-18 20:25:15, Mikulas Patocka wrote: > > > > > Fixing __vmalloc code > > > is easy and it doesn't require cooperation with mai

Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM

2018-04-24 Thread Michal Hocko
On Mon 23-04-18 20:25:15, Mikulas Patocka wrote: > > > On Mon, 23 Apr 2018, Michal Hocko wrote: > > > On Mon 23-04-18 10:06:08, Mikulas Patocka wrote: > > > > > > > He didn't want to fix vmalloc(GFP_NOIO) > > > > > > > > I do

Re: [PATCH v3] kvmalloc: always use vmalloc if CONFIG_DEBUG_SG

2018-04-24 Thread Michal Hocko
> +#ifdef CONFIG_DEBUG_SG > +do_vmalloc: > +#endif > return __vmalloc_node_flags_caller(size, node, flags, > __builtin_return_address(0)); > } -- Michal Hocko SUSE Labs

Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM

2018-04-23 Thread Michal Hocko
04:54:53PM -0400, Mikulas Patocka wrote: > > > > > On Fri, 20 Apr 2018, Michal Hocko wrote: > > > > > > No way. This is just wrong! First of all, you will explode most > > > > > > likely > > > > > > on many allocations of small sizes. S

Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM

2018-04-23 Thread Michal Hocko
> > The testing people won't set it up. They install the "kernel-debug" > package and run the tests in it. > > If you introduce a hidden option that no one knows about, no one will use > it. then make sure people know about it. Fuzzers already do test fault injections. -- Michal Hocko SUSE Labs

Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM

2018-04-22 Thread Michal Hocko
On Sat 21-04-18 07:47:57, Matthew Wilcox wrote: > On Fri, Apr 20, 2018 at 05:21:26PM -0400, Mikulas Patocka wrote: > > On Fri, 20 Apr 2018, Matthew Wilcox wrote: > > > On Fri, Apr 20, 2018 at 04:54:53PM -0400, Mikulas Patocka wrote: > > > > On Fri, 20 Apr 2018, Michal

Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM

2018-04-20 Thread Michal Hocko
On Fri 20-04-18 06:41:36, Matthew Wilcox wrote: > On Fri, Apr 20, 2018 at 03:08:52PM +0200, Michal Hocko wrote: > > > In order to detect these bugs reliably I submit this patch that changes > > > kvmalloc to always use vmalloc if CONFIG_DEBUG_VM is turned on. > > >

Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM

2018-04-20 Thread Michal Hocko
CONFIG_DEBUG_VM tends to be enabled quite often. > Signed-off-by: Mikulas Patocka Nacked-by: Michal Hocko > --- > mm/util.c |2 ++ > 1 file changed, 2 insertions(+) > > Index: linux-2.6/mm/util.c > === > --- l

Re: [PATCH net] bpf: cpumap: use GFP_KERNEL instead of GFP_ATOMIC in __cpu_map_entry_alloc()

2018-02-14 Thread Michal Hocko
On Wed 14-02-18 18:34:51, Jesper Dangaard Brouer wrote: > On Wed, 14 Feb 2018 16:06:40 +0100 > Michal Hocko wrote: > > > On Wed 14-02-18 22:17:34, Jason Wang wrote: > > > There're several implications after commit 0bf7800f1799 ("ptr_ring: > > > try vmal

Re: [PATCH net] bpf: cpumap: use GFP_KERNEL instead of GFP_ATOMIC in __cpu_map_entry_alloc()

2018-02-14 Thread Michal Hocko
-by: syzbot+1a240cdb1f4cc8881...@syzkaller.appspotmail.com > Fixes: 0bf7800f1799 ("ptr_ring: try vmalloc() when kmalloc() fails") > Cc: Michal Hocko > Cc: Daniel Borkmann > Cc: Matthew Wilcox > Cc: Jesper Dangaard Brouer > Cc: a...@linux-foundation.org > Cc: dhow

Re: WARNING in kvmalloc_node

2018-02-14 Thread Michal Hocko
On Wed 14-02-18 19:47:30, Jason Wang wrote: > > > On 2018年02月14日 17:28, Daniel Borkmann wrote: > > [ +Jason, +Jesper ] > > > > On 02/14/2018 09:43 AM, Michal Hocko wrote: > > > On Tue 13-02-18 18:55:33, Matthew Wilcox wrote: > > > > On Tue

Re: WARNING in kvmalloc_node

2018-02-14 Thread Michal Hocko
the MM people ;-) Yes. kvmalloc (the vmalloc part) doesn't support GFP_ATOMIC semantics. -- Michal Hocko SUSE Labs

Re: [netfilter-core] kernel panic: Out of memory and no killable processes... (2)

2018-01-31 Thread Michal Hocko
On Tue 30-01-18 11:27:45, Andrew Morton wrote: > On Tue, 30 Jan 2018 15:01:04 +0100 Michal Hocko wrote: > > > > Well, this is not about syzkaller, it merely pointed out a potential > > > DoS... And that has to be addressed somehow. > > > > So how about th

Re: [patch 1/1] net/netfilter/x_tables.c: make allocation less aggressive

2018-01-31 Thread Michal Hocko
has been used for both kmalloc and vmalloc paths. So it is more a quick band aid than a longterm solution. [1] http://lkml.kernel.org/r/20180129165722.gf5...@breakpoint.cc -- Michal Hocko SUSE Labs

Re: [netfilter-core] kernel panic: Out of memory and no killable processes... (2)

2018-01-30 Thread Michal Hocko
On Tue 30-01-18 15:01:11, Florian Westphal wrote: > > From d48e950f1b04f234b57b9e34c363bdcfec10aeee Mon Sep 17 00:00:00 2001 > > From: Michal Hocko > > Date: Tue, 30 Jan 2018 14:51:07 +0100 > > Subject: [PATCH] net/netfilter/x_tables.c: make allocation less aggressiv

Re: [netfilter-core] kernel panic: Out of memory and no killable processes... (2)

2018-01-30 Thread Michal Hocko
On Tue 30-01-18 10:57:39, Michal Hocko wrote: > On Tue 30-01-18 10:02:34, Dmitry Vyukov wrote: > > On Tue, Jan 30, 2018 at 9:28 AM, Kirill A. Shutemov > > wrote: > > > On Tue, Jan 30, 2018 at 09:11:27AM +0100, Florian Westphal wrote: > > >> Michal Hocko wrot

Re: [netfilter-core] kernel panic: Out of memory and no killable processes... (2)

2018-01-30 Thread Michal Hocko
On Tue 30-01-18 10:02:34, Dmitry Vyukov wrote: > On Tue, Jan 30, 2018 at 9:28 AM, Kirill A. Shutemov > wrote: > > On Tue, Jan 30, 2018 at 09:11:27AM +0100, Florian Westphal wrote: > >> Michal Hocko wrote: > >> > On Mon 29-01-18 23:35:22, Florian Westphal wrote: >

Re: [netfilter-core] kernel panic: Out of memory and no killable processes... (2)

2018-01-30 Thread Michal Hocko
On Tue 30-01-18 09:11:27, Florian Westphal wrote: > Michal Hocko wrote: > > On Mon 29-01-18 23:35:22, Florian Westphal wrote: > > > Kirill A. Shutemov wrote: > > [...] > > > > I hate what I'm saying, but I guess we need some tunable here. > > >

Re: [netfilter-core] kernel panic: Out of memory and no killable processes... (2)

2018-01-29 Thread Michal Hocko
't resolve it other than kill all tasks in the affected memcg/container. Whether this is sufficient or not, I dunno. It sounds quite suboptimal to me. But it is true this would be less tricky than adding a global knob... -- Michal Hocko SUSE Labs

Re: [netfilter-core] kernel panic: Out of memory and no killable processes... (2)

2018-01-29 Thread Michal Hocko
> Just suppressing OOM kill is a bad idea. We still leave a way to allocate > > arbitrary large buffer in kernel. > > Isn't that what we do everywhere in network stack? > > I think we should try to allocate whatever amount of memory is needed > for the given xtables ruleset, given that is what admin requested us to do. If this is a root only thing then __GFP_NORETRY sounds like the most straightforward way to go. -- Michal Hocko SUSE Labs

Re: scheduling while atomic from vmci_transport_recv_stream_cb in 3.16 kernels

2017-09-17 Thread Michal Hocko
On Fri 15-09-17 18:12:15, Ben Hutchings wrote: > On Thu, 2017-09-14 at 10:59 +0200, Michal Hocko wrote: > > On Wed 13-09-17 18:58:13, Jorgen S. Hansen wrote: > > [...] > > > The patch series look good to me. > > > > Thanks for double checking. Ben, could you

Re: scheduling while atomic from vmci_transport_recv_stream_cb in 3.16 kernels

2017-09-14 Thread Michal Hocko
On Wed 13-09-17 18:58:13, Jorgen S. Hansen wrote: [...] > The patch series look good to me. Thanks for double checking. Ben, could you merge this to 3.16 stable branch, please? -- Michal Hocko SUSE Labs

[PATCH stable-3.16 2/3] VSOCK: Fix lockdep issue.

2017-09-13 Thread Michal Hocko
-by: Thomas Hellstrom Signed-off-by: Jorgen Hansen Signed-off-by: David S. Miller Signed-off-by: Michal Hocko --- net/vmw_vsock/vmci_transport.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c index aed136d27b01

[PATCH stable-3.16 1/3] VSOCK: sock_put wasn't safe to call in interrupt context

2017-09-13 Thread Michal Hocko
igned-off-by: David S. Miller Signed-off-by: Michal Hocko --- net/vmw_vsock/vmci_transport.c | 173 - net/vmw_vsock/vmci_transport.h | 4 +- 2 files changed, 86 insertions(+), 91 deletions(-) diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_

[PATCH stable-3.16 3/3] VSOCK: Detach QP check should filter out non matching QPs.

2017-09-13 Thread Michal Hocko
would cause an active stream socket to register a detach. Reviewed-by: George Zhang Signed-off-by: Jorgen Hansen Signed-off-by: David S. Miller Signed-off-by: Michal Hocko --- net/vmw_vsock/vmci_transport.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/net/vmw_vsock

Re: scheduling while atomic from vmci_transport_recv_stream_cb in 3.16 kernels

2017-09-13 Thread Michal Hocko
On Wed 13-09-17 15:07:26, Jorgen S. Hansen wrote: > > > On Sep 12, 2017, at 11:08 AM, Michal Hocko wrote: > > > > Hi, > > we are seeing the following splat with Debian 3.16 stable kernel > > > > BUG: scheduling while atomic: MATLAB/26771/0x0100 >

scheduling while atomic from vmci_transport_recv_stream_cb in 3.16 kernels

2017-09-12 Thread Michal Hocko
F_VSOCK); diff --git a/net/vmw_vsock/vmci_transport.h b/net/vmw_vsock/vmci_transport.h index ce6c9623d5f0..2ad46f39649f 100644 --- a/net/vmw_vsock/vmci_transport.h +++ b/net/vmw_vsock/vmci_transport.h @@ -119,10 +119,12 @@ struct vmci_transport { u64 queue_pair_size; u64 queue_pair_min_size; u64 queue_pair_max_size; - u32 attach_sub_id; u32 detach_sub_id; union vmci_transport_notify notify; struct vmci_transport_notify_ops *notify_ops; + struct list_head elem; + struct sock *sk; + spinlock_t lock; /* protects sk. */ }; int vmci_transport_register(void); -- Michal Hocko SUSE Labs

Re: [PATCH] vmalloc: respect the GFP_NOIO and GFP_NOFS flags

2017-07-04 Thread Michal Hocko
On Mon 03-07-17 18:57:14, Mikulas Patocka wrote: > > > On Mon, 3 Jul 2017, Michal Hocko wrote: > > > We can add a warning (or move it from kvmalloc) and hope that the > > respective maintainers will fix those places properly. The reason I > > didn't add th

Re: [PATCH] vmalloc: respect the GFP_NOIO and GFP_NOFS flags

2017-07-02 Thread Michal Hocko
On Fri 30-06-17 20:36:12, Mikulas Patocka wrote: > > > On Fri, 30 Jun 2017, Michal Hocko wrote: > > > On Fri 30-06-17 14:11:57, Mikulas Patocka wrote: > > > > > > > > > On Fri, 30 Jun 2017, Michal Hocko wrote: > > > > > > >

Re: [PATCH] vmalloc: respect the GFP_NOIO and GFP_NOFS flags

2017-06-30 Thread Michal Hocko
On Fri 30-06-17 14:11:57, Mikulas Patocka wrote: > > > On Fri, 30 Jun 2017, Michal Hocko wrote: > > > On Thu 29-06-17 22:25:09, Mikulas Patocka wrote: > > > The __vmalloc function has a parameter gfp_mask with the allocation flags, > > > however it do

Re: [PATCH] vmalloc: respect the GFP_NOIO and GFP_NOFS flags

2017-06-30 Thread Michal Hocko
allocate data pages and auxillary structures (e.g. > - * pagetables) with GFP_KERNEL, yet we may be under GFP_NOFS context > - * here. Hence we need to tell memory reclaim that we are in such a > - * context via PF_MEMALLOC_NOFS to prevent memory reclaim re-entering > - * the filesystem here and potentially deadlocking. > - */ > - if (flags & KM_NOFS) > - nofs_flag = memalloc_nofs_save(); > - > lflags = kmem_flags_convert(flags); > ptr = __vmalloc(size, lflags | __GFP_ZERO, PAGE_KERNEL); > > - if (flags & KM_NOFS) > - memalloc_nofs_restore(nofs_flag); > - > return ptr; > } > -- Michal Hocko SUSE Labs

Re: [PATCH] mm: convert three more cases to kvmalloc

2017-06-30 Thread Michal Hocko
On Thu 29-06-17 22:13:26, Mikulas Patocka wrote: > > > On Thu, 29 Jun 2017, Michal Hocko wrote: [...] > > > Index: linux-2.6/kernel/bpf/syscall.c > > > === > > > --- linux-2.6.orig/kernel/bpf/sysc

[PATCH] amd-xgbe: use PAGE_ALLOC_COSTLY_ORDER in xgbe_map_rx_buffer

2017-06-02 Thread Michal Hocko
From: Michal Hocko xgbe_map_rx_buffer is rather confused about what PAGE_ALLOC_COSTLY_ORDER means. It uses PAGE_ALLOC_COSTLY_ORDER-1 assuming that PAGE_ALLOC_COSTLY_ORDER is the first costly order which is not the case actually because orders larger than that are costly. And even that applies
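The boundary described in this changelog can be illustrated with a small userspace sketch. It is not the driver or page allocator code: PAGE_ALLOC_COSTLY_ORDER is defined as 3 here on the assumption that this matches mainline, and order_is_costly() and check_costly_boundary() are hypothetical helpers added only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to match the mainline kernel value; this is a userspace copy
 * for illustration, not the kernel header definition. */
#define PAGE_ALLOC_COSTLY_ORDER 3

/* Hypothetical helper: an order counts as costly only when it is
 * strictly larger than PAGE_ALLOC_COSTLY_ORDER. */
bool order_is_costly(unsigned int order)
{
	return order > PAGE_ALLOC_COSTLY_ORDER;
}

/* Self-check of the boundary the changelog talks about. */
void check_costly_boundary(void)
{
	/* PAGE_ALLOC_COSTLY_ORDER itself is not costly ... */
	assert(!order_is_costly(PAGE_ALLOC_COSTLY_ORDER));
	/* ... the first costly order is the next one up. */
	assert(order_is_costly(PAGE_ALLOC_COSTLY_ORDER + 1));
}
```

Under that reading, a caller that wants the largest non-costly order would start from PAGE_ALLOC_COSTLY_ORDER rather than PAGE_ALLOC_COSTLY_ORDER-1.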

Re: [PATCH 4/4] mtd: nand: nandsim: convert to memalloc_noreclaim_*()

2017-04-06 Thread Michal Hocko
On Thu 06-04-17 09:33:44, Adrian Hunter wrote: > On 05/04/17 14:39, Vlastimil Babka wrote: > > On 04/05/2017 01:36 PM, Richard Weinberger wrote: > >> Michal, > >> > >> Am 05.04.2017 um 13:31 schrieb Michal Hocko: > >>> On Wed 05-04-17 09:47:0

Re: [PATCH 4/4] mtd: nand: nandsim: convert to memalloc_noreclaim_*()

2017-04-05 Thread Michal Hocko
On Wed 05-04-17 13:39:16, Vlastimil Babka wrote: > On 04/05/2017 01:36 PM, Richard Weinberger wrote: > > Michal, > > > > Am 05.04.2017 um 13:31 schrieb Michal Hocko: > >> On Wed 05-04-17 09:47:00, Vlastimil Babka wrote: > >>> Nandsim has own functio

Re: [PATCH 4/4] mtd: nand: nandsim: convert to memalloc_noreclaim_*()

2017-04-05 Thread Michal Hocko
; > > err = get_pages(ns, file, count, pos); > if (err) > return err; > - memalloc = set_memalloc(); > + noreclaim_flag = memalloc_noreclaim_save(); > tx = kernel_write(file, buf, count, pos); > - clear_memalloc(memalloc); > + memalloc_noreclaim_restore(noreclaim_flag); > put_pages(ns); > return tx; > } > -- > 2.12.2 -- Michal Hocko SUSE Labs

Re: [PATCH 2/4] mm: introduce memalloc_noreclaim_{save,restore}

2017-04-05 Thread Michal Hocko
code, but the change makes it > more > robust. > > Suggested-by: Michal Hocko > Signed-off-by: Vlastimil Babka One could argue that tsk_restore_flags() could be extended to provide tsk_set_flags() and use it for all allocation related PF flags. I do not have a strong opinion on that

Re: [PATCH 3/4] treewide: convert PF_MEMALLOC manipulations to new helpers

2017-04-05 Thread Michal Hocko
nclude > @@ -372,14 +373,14 @@ EXPORT_SYMBOL_GPL(sk_clear_memalloc); > int __sk_backlog_rcv(struct sock *sk, struct sk_buff *skb) > { > int ret; > - unsigned long pflags = current->flags; > + unsigned int noreclaim_flag; > > /* these should have been dropped before queueing */ > BUG_ON(!sock_flag(sk, SOCK_MEMALLOC)); > > - current->flags |= PF_MEMALLOC; > + noreclaim_flag = memalloc_noreclaim_save(); > ret = sk->sk_backlog_rcv(sk, skb); > - tsk_restore_flags(current, pflags, PF_MEMALLOC); > + memalloc_noreclaim_restore(noreclaim_flag); > > return ret; > } > -- > 2.12.2 -- Michal Hocko SUSE Labs
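The save/restore pattern used in the quoted hunk can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: model_task_flags stands in for current->flags, MODEL_PF_MEMALLOC is an arbitrary bit value, and the model_* functions are hypothetical names, not the kernel helpers themselves.

```c
#include <assert.h>

/* Hypothetical stand-ins for current->flags and PF_MEMALLOC; the bit
 * value is arbitrary in this userspace model. */
#define MODEL_PF_MEMALLOC 0x0800u
unsigned int model_task_flags;

/* Set the flag and return its previous state (0 or MODEL_PF_MEMALLOC). */
unsigned int model_noreclaim_save(void)
{
	unsigned int old = model_task_flags & MODEL_PF_MEMALLOC;

	model_task_flags |= MODEL_PF_MEMALLOC;
	return old;
}

/* Put back only the saved bit; all other flag bits are left alone. */
void model_noreclaim_restore(unsigned int old)
{
	model_task_flags = (model_task_flags & ~MODEL_PF_MEMALLOC) | old;
}

/* Nested sections: the inner restore must not clear the outer flag. */
void check_nested_sections(void)
{
	unsigned int outer, inner;

	model_task_flags = 0;
	outer = model_noreclaim_save();
	inner = model_noreclaim_save();
	model_noreclaim_restore(inner);
	assert(model_task_flags & MODEL_PF_MEMALLOC);	/* still set */
	model_noreclaim_restore(outer);
	assert(model_task_flags == 0);			/* back to start */
}
```

The point of returning the old state is that a section entered with the flag already set leaves it set on exit, which an unconditional clear would not do.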

Re: [PATCH 1/4] mm: prevent potential recursive reclaim due to clearing PF_MEMALLOC

2017-04-05 Thread Michal Hocko
ere is no such known context, but let's > play it safe and make __alloc_pages_direct_compact() robust for cases where > PF_MEMALLOC is already set. > > Fixes: a8161d1ed609 ("mm, page_alloc: restructure direct compaction handling > in slowpath") > Reporte

Re: [PATCH 0/6 v3] kvmalloc

2017-01-30 Thread Michal Hocko
On Mon 30-01-17 17:15:08, Daniel Borkmann wrote: > On 01/30/2017 08:56 AM, Michal Hocko wrote: > > On Fri 27-01-17 21:12:26, Daniel Borkmann wrote: > > > On 01/27/2017 11:05 AM, Michal Hocko wrote: > > > > On Thu 26-01-17 21:34:04, Daniel Borkmann wrote: > >

[PATCH 6/9] net: use kvmalloc with __GFP_REPEAT rather than open coded variant

2017-01-30 Thread Michal Hocko
From: Michal Hocko fq_alloc_node, alloc_netdev_mqs and netif_alloc* open code kmalloc with vmalloc fallback. Use the kvmalloc variant instead. Keep the __GFP_REPEAT flag based on explanation from Eric: " At the time, tests on the hardware I had in my labs showed that vmalloc() could de

[PATCH 5/9] treewide: use kv[mz]alloc* rather than opencoded variants

2017-01-30 Thread Michal Hocko
From: Michal Hocko There are many code paths opencoding kvmalloc. Let's use the helper instead. The main difference to kvmalloc is that those users are usually not considering all the aspects of the memory allocator. E.g. allocation requests <= 32kB (with 4kB pages) are basically never

Re: [PATCH 0/6 v3] kvmalloc

2017-01-30 Thread Michal Hocko
On Fri 27-01-17 21:12:26, Daniel Borkmann wrote: > On 01/27/2017 11:05 AM, Michal Hocko wrote: > > On Thu 26-01-17 21:34:04, Daniel Borkmann wrote: [...] > > > So to answer your second email with the bpf and netfilter hunks, why > > > not replacing them with kvmalloc()

Re: [PATCH 0/6 v3] kvmalloc

2017-01-27 Thread Michal Hocko
On Thu 26-01-17 21:34:04, Daniel Borkmann wrote: > On 01/26/2017 02:40 PM, Michal Hocko wrote: [...] > > But realistically, how big is this problem really? Is it really worth > > it? You said this is an admin only interface and admin can kill the > > machine by OOM an

Re: [PATCH 0/6 v3] kvmalloc

2017-01-26 Thread Michal Hocko
On Thu 26-01-17 14:40:04, Michal Hocko wrote: > On Thu 26-01-17 14:10:06, Daniel Borkmann wrote: > > On 01/26/2017 12:58 PM, Michal Hocko wrote: > > > On Thu 26-01-17 12:33:55, Daniel Borkmann wrote: > > > > On 01/26/2017 11:08 AM, Michal Hocko wrote: > > >

Re: [PATCH 0/6 v3] kvmalloc

2017-01-26 Thread Michal Hocko
On Thu 26-01-17 14:10:06, Daniel Borkmann wrote: > On 01/26/2017 12:58 PM, Michal Hocko wrote: > > On Thu 26-01-17 12:33:55, Daniel Borkmann wrote: > > > On 01/26/2017 11:08 AM, Michal Hocko wrote: > > [...] > > > > If you disagree I can drop the bpf part o

Re: [PATCH 0/6 v3] kvmalloc

2017-01-26 Thread Michal Hocko
On Thu 26-01-17 04:14:37, Joe Perches wrote: > On Thu, 2017-01-26 at 11:32 +0100, Michal Hocko wrote: > > So I have folded the following to the patch 1. It is in line with > > kvmalloc and hopefully at least tell more than the current code. > [] > > diff --git a/mm/

Re: [PATCH 0/6 v3] kvmalloc

2017-01-26 Thread Michal Hocko
On Thu 26-01-17 12:33:55, Daniel Borkmann wrote: > On 01/26/2017 11:08 AM, Michal Hocko wrote: [...] > > If you disagree I can drop the bpf part of course... > > If we could consolidate these spots with kvmalloc() eventually, I'm > all for it. But even if __GFP_NORETRY is

Re: [PATCH 0/6 v3] kvmalloc

2017-01-26 Thread Michal Hocko
On Thu 26-01-17 12:04:13, Daniel Borkmann wrote: > On 01/26/2017 11:32 AM, Michal Hocko wrote: > > On Thu 26-01-17 11:08:02, Michal Hocko wrote: > > > On Thu 26-01-17 10:36:49, Daniel Borkmann wrote: > > > > On 01/26/2017 08:43 AM, Michal Hocko wrote: > > >

Re: [PATCH 0/6 v3] kvmalloc

2017-01-26 Thread Michal Hocko
On Thu 26-01-17 11:08:02, Michal Hocko wrote: > On Thu 26-01-17 10:36:49, Daniel Borkmann wrote: > > On 01/26/2017 08:43 AM, Michal Hocko wrote: > > > On Wed 25-01-17 21:16:42, Daniel Borkmann wrote: > [...] > > > > I assume that kvzalloc() is still the same from [

Re: [PATCH 0/6 v3] kvmalloc

2017-01-26 Thread Michal Hocko
On Thu 26-01-17 10:36:49, Daniel Borkmann wrote: > On 01/26/2017 08:43 AM, Michal Hocko wrote: > > On Wed 25-01-17 21:16:42, Daniel Borkmann wrote: [...] > > > I assume that kvzalloc() is still the same from [1], right? If so, then > > > it would unfortunately (parti

Re: [PATCH 0/6 v3] kvmalloc

2017-01-25 Thread Michal Hocko
On Wed 25-01-17 21:16:42, Daniel Borkmann wrote: > On 01/25/2017 07:14 PM, Alexei Starovoitov wrote: > > On Wed, Jan 25, 2017 at 5:21 AM, Michal Hocko wrote: > > > On Wed 25-01-17 14:10:06, Michal Hocko wrote: > > > > On Tue 24-01-17 11:17:21, Alexei Starovoitov

Re: [PATCH 5/6] treewide: use kv[mz]alloc* rather than opencoded variants

2017-01-25 Thread Michal Hocko
On Wed 25-01-17 12:15:59, Vlastimil Babka wrote: > On 01/24/2017 04:00 PM, Michal Hocko wrote: > > > > Well, I am not opposed to kvmalloc_array but I would argue that this > > > > conversion cannot introduce new overflow issues. The code would have > > > > t

Re: [PATCH 5/6] treewide: use kv[mz]alloc* rather than opencoded variants

2017-01-24 Thread Michal Hocko
On Fri 20-01-17 14:41:37, Vlastimil Babka wrote: > On 01/12/2017 06:37 PM, Michal Hocko wrote: > > On Thu 12-01-17 09:26:09, Kees Cook wrote: > >> On Thu, Jan 12, 2017 at 7:37 AM, Michal Hocko wrote: > > [...] > >>> diff --git a/arch/s390/kvm/kvm-s390.c b/

Re: Potential issues (security and otherwise) with the current cgroup-bpf API

2017-01-19 Thread Michal Hocko
On Wed 18-01-17 14:18:50, Tejun Heo wrote: > Hello, Michal. > > On Tue, Jan 17, 2017 at 02:58:30PM +0100, Michal Hocko wrote: > > This would require using hierarchical cgroup iterators to iterate over > > It does behave hierarchically. > > > tasks. As per Andy

Re: Potential issues (security and otherwise) with the current cgroup-bpf API

2017-01-17 Thread Michal Hocko
On Tue 17-01-17 14:32:04, Peter Zijlstra wrote: > On Tue, Jan 17, 2017 at 02:03:03PM +0100, Michal Hocko wrote: > > On Sun 15-01-17 20:19:01, Tejun Heo wrote: > > [...] > > > So, what's proposed is a proper part of bpf. In terms of > > > implementation, cg

Re: Potential issues (security and otherwise) with the current cgroup-bpf API

2017-01-17 Thread Michal Hocko
shouldn't we at least enforce that the cgroup has to be a leaf one and no further children groups can be created once there is BPF program attached? This should break the existing usecases AFAIU and it would allow future changes without major API surprises. -- Michal Hocko SUSE Labs

Re: [PATCH 5/6] treewide: use kv[mz]alloc* rather than opencoded variants

2017-01-15 Thread Michal Hocko
id *rtn; - - rtn = kzalloc(size, GFP_KERNEL | __GFP_NOWARN); - if (!rtn) - rtn = vzalloc(size); - return rtn; + return kvzalloc(GFP_KERNEL, size); } static inline u32 mlx5_base_mkey(const u32 key) -- Michal Hocko SUSE Labs

Re: [PATCH 5/6] treewide: use kv[mz]alloc* rather than opencoded variants

2017-01-12 Thread Michal Hocko
On Thu 12-01-17 09:26:09, Kees Cook wrote: > On Thu, Jan 12, 2017 at 7:37 AM, Michal Hocko wrote: [...] > > diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c > > index 4f74511015b8..e6bbb33d2956 100644 > > --- a/arch/s390/kvm/kvm-s390.c > > +++

Re: [PATCH 5/6] treewide: use kv[mz]alloc* rather than opencoded variants

2017-01-12 Thread Michal Hocko
Ilya has noticed that I've screwed up some k[zc]alloc conversions and didn't use the kvzalloc. This is an updated patch with some acks collected on the way --- From a7b89c6d0a3c685045e37740c8f97b065f37e0a4 Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Wed, 4 Jan 2017 13:30:32

Re: [PATCH 5/6] treewide: use kv[mz]alloc* rather than opencoded variants

2017-01-12 Thread Michal Hocko
On Thu 12-01-17 17:54:34, Ilya Dryomov wrote: > On Thu, Jan 12, 2017 at 4:37 PM, Michal Hocko wrote: > > From: Michal Hocko > > > > There are many code paths opencoding kvmalloc. Let's use the helper > > instead. The main difference to kvmalloc is that those users

[RFC PATCH 6/6] net: use kvmalloc with __GFP_REPEAT rather than open coded variant

2017-01-12 Thread Michal Hocko
From: Michal Hocko fq_alloc_node, alloc_netdev_mqs and netif_alloc* open code kmalloc with vmalloc fallback. Use the kvmalloc variant instead. Keep the __GFP_REPEAT flag based on explanation from Eric: " At the time, tests on the hardware I had in my labs showed that vmalloc() could de

[PATCH 5/6] treewide: use kv[mz]alloc* rather than opencoded variants

2017-01-12 Thread Michal Hocko
From: Michal Hocko There are many code paths opencoding kvmalloc. Let's use the helper instead. The main difference to kvmalloc is that those users are usually not considering all the aspects of the memory allocator. E.g. allocation requests < 64kB are basically never failing and in

Re: [Patch net] atm: remove an unnecessary loop

2017-01-12 Thread Michal Hocko
ruesize); > - atomic_add(skb->truesize, &sk->sk_wmem_alloc); > + skb = alloc_skb(size, GFP_KERNEL); > + if (skb) { > + pr_debug("%d += %d\n", sk_wmem_alloc_get(sk), skb->truesize); > + atomic_add(skb->truesize, &sk->sk_wmem_alloc); > + } > return skb; > } > > -- > 2.5.5 -- Michal Hocko SUSE Labs

Re: net/atm: warning in alloc_tx/__might_sleep

2017-01-11 Thread Michal Hocko
On Wed 11-01-17 20:45:25, Michal Hocko wrote: > On Wed 11-01-17 09:37:06, Chas Williams wrote: > > On Mon, 2017-01-09 at 18:20 +0100, Andrey Konovalov wrote: > > > Hi! > > > > > > I've got the following error report while running the

Re: net/atm: warning in alloc_tx/__might_sleep

2017-01-11 Thread Michal Hocko
_debug("%d += %d\n", sk_wmem_alloc_get(sk), skb->truesize); > atomic_add(skb->truesize, &sk->sk_wmem_alloc); Blee, this code is just horrendous. But the "fix" is obviously broken! schedule() is just a noop if you do not change the task state and what you are just asking for is a never failing non sleeping allocation - aka a busy loop in the kernel! -- Michal Hocko SUSE Labs

Re: [PATCH] treewide: fix semicolon.cocci warnings

2017-01-07 Thread Michal Hocko
nt size) > { > if (size < (SIZE_MAX / sizeof(unsigned int))) > - return kvmalloc(size * sizeof(unsigned int), GFP_KERNEL);; > + return kvmalloc(size * sizeof(unsigned int), GFP_KERNEL); > > return NULL; > -- Michal Hocko SUSE Labs
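The guard in the quoted hunk rejects element counts whose byte size could overflow, and it does so before the multiplication is performed. Below is a minimal userspace sketch of the same check. `malloc` stands in for `kvmalloc`, and the helper name `alloc_uint_array` is hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate an array of `count` unsigned ints.
 * malloc() stands in for kvmalloc(); the guard mirrors the
 * `size < SIZE_MAX / sizeof(unsigned int)` test in the quoted diff. */
static unsigned int *alloc_uint_array(size_t count)
{
	/* Only multiply once the product is known to fit in size_t. */
	if (count < SIZE_MAX / sizeof(unsigned int))
		return malloc(count * sizeof(unsigned int));

	return NULL; /* request is at or beyond the overflow limit */
}
```

Checking the count against the quotient rather than inspecting the product afterwards matters because an overflowed `size_t` product silently wraps to a small value.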

Re: Potential issues (security and otherwise) with the current cgroup-bpf API

2017-01-03 Thread Michal Hocko
he root.) > > So from what I understand the proposed cgroup is not in fact > hierarchical at all. > > @TJ, I thought you were enforcing all new cgroups to be properly > hierarchical, that would very much include this one. I would be interested in that as well. We have made that mistake in

[PATCH 1/2] mm, slab: make sure that KMALLOC_MAX_SIZE will fit into MAX_ORDER

2016-12-20 Thread Michal Hocko
From: Michal Hocko Andrey Konovalov has reported the following warning triggered by the syzkaller fuzzer. WARNING: CPU: 1 PID: 9935 at mm/page_alloc.c:3511 __alloc_pages_nodemask+0x159c/0x1e20 Kernel panic - not syncing: panic_on_warn set ... CPU: 1 PID: 9935 Comm: syz-executor0 Not tainted

[PATCH 0/2 v2] mm, slab: consolidate KMALLOC_MAX_SIZE

2016-12-20 Thread Michal Hocko
Hi, this is the second version of the patchset previously posted here [1]. Alexei has insisted on the patches reordering which I've done in this series. I've also updated the changelog of the second patch to mention why KMALLOC_SHIFT_MAX has been used. Andrey has revealed a discrepancy between KMA

[PATCH 2/2] bpf: do not use KMALLOC_SHIFT_MAX

2016-12-20 Thread Michal Hocko
From: Michal Hocko 01b3f52157ff ("bpf: fix allocation warnings in bpf maps and integer overflow") has added checks for the maximum allocatable size. It (ab)used KMALLOC_SHIFT_MAX for that purpose. While this is not incorrect it is not very clean because we already have KMALLOC_MAX_SIZ

Re: [PATCH 1/2] bpf: do not use KMALLOC_SHIFT_MAX

2016-12-17 Thread Michal Hocko
On Fri 16-12-16 16:28:21, Alexei Starovoitov wrote: > On Sat, Dec 17, 2016 at 12:39:17AM +0100, Michal Hocko wrote: > > On Fri 16-12-16 15:23:42, Alexei Starovoitov wrote: > > > On Fri, Dec 16, 2016 at 11:02:35PM +0100, Michal Hocko wrote: > > > > On Fri 16-12-16 10:0

Re: [PATCH 1/2] bpf: do not use KMALLOC_SHIFT_MAX

2016-12-16 Thread Michal Hocko
On Fri 16-12-16 15:23:42, Alexei Starovoitov wrote: > On Fri, Dec 16, 2016 at 11:02:35PM +0100, Michal Hocko wrote: > > On Fri 16-12-16 10:02:10, Alexei Starovoitov wrote: > > > On Thu, Dec 15, 2016 at 05:47:21PM +0100, Michal Hocko wrote: > > > > From: Michal Hocko

Re: [PATCH 1/2] bpf: do not use KMALLOC_SHIFT_MAX

2016-12-16 Thread Michal Hocko
On Fri 16-12-16 10:02:10, Alexei Starovoitov wrote: > On Thu, Dec 15, 2016 at 05:47:21PM +0100, Michal Hocko wrote: > > From: Michal Hocko > > > > 01b3f52157ff ("bpf: fix allocation warnings in bpf maps and integer > > overflow") has added checks for the
