On 10.09.25 14:37, Arunpravin Paneer Selvam wrote:
> Hi Christian,
>
> On 9/9/2025 9:55 PM, Christian König wrote:
>> On 09.09.25 16:05, Peter Zijlstra wrote:
>>> On Tue, Sep 09, 2025 at 02:04:30PM +0200, Christian König wrote:
>>>> Hi Arun,
>>>>
>>>> On 09.09.25 11:56, Arunpravin Paneer Selvam wrote:
>>>> [SNIP]
>>>>
>>>>> +/**
>>>>> + * rbtree_for_each_entry_safe - iterate in-order over rb_root safe against removal
>>>>> + *
>>>>> + * @pos: the 'type *' to use as a loop cursor
>>>>> + * @n: another 'type *' to use as temporary storage
>>>>> + * @root: 'rb_root *' of the rbtree
>>>>> + * @member: the name of the rb_node field within 'type'
>>>>> + */
>>>>> +#define rbtree_for_each_entry_safe(pos, n, root, member) \
>>>>> +    for ((pos) = rb_entry_safe(rb_first(root), typeof(*(pos)), member), \
>>>>> +         (n) = (pos) ? rb_entry_safe(rb_next(&(pos)->member), typeof(*(pos)), member) : NULL; \
>>>>> +         (pos); \
>>>>> +         (pos) = (n), \
>>>>> +         (n) = (pos) ? rb_entry_safe(rb_next(&(pos)->member), typeof(*(pos)), member) : NULL)
>>>> As far as I know exactly that operation does not work on an R/B tree.
>>>>
>>>> See the _safe() variants of the for_each_ macros are usually used to
>>>> iterate over a container while being able to remove entries.
>>>>
>>>> But because of the potential re-balance storing just the next entry is not
>>>> sufficient for an R/B tree to do that as far as I know.
>>>>
>>>> Please explain how exactly you want to use this macro.
> Thanks for the pointer, yes, this will not work on RB tree. We need a reverse
> safe variant for use in the force_merge() function similar to the
> list_for_each_entry_safe_reverse() macro in list.h. The reason is that in
> force_merge(), we remove the block from the free tree before invoking
> drm_buddy_free(), which merges and frees buddy blocks to form a larger block.
>>> So I don't much like these iterators; I've said so before.
>>> Either we should introduce a properly threaded rb-tree (where the NULL
>>> child pointers encode a linked list), or simply keep a list_head next to
>>> the rb_node and use that.
>> I agree, something is clearly fishy here.
>>
>>> The rb_{next,prev}() things are O(ln n), in the worst case they do a
>>> full traversal up the tree and a full traversal down the other branch.
>> Yeah from the logic that is exactly what is supposed to happen in the
>> __force_merge() function.
>>
>> The question is rather why does that function exist in the first place? The
>> operation doesn't look logical to me.
>>
>> For drm_buddy_reset_clear() and drm_buddy_fini() we should use
>> rbtree_postorder_for_each_entry_safe() instead.
>>
>> And during normal allocation __force_merge() should never be used.
> In normal allocation, the force_merge() function is used when no free blocks
> of the requested order are available. In such cases, smaller blocks must be
> merged on demand to satisfy the allocation. Mainly, this does not involve
> traversing the entire tree to merge all blocks, but only merging as needed.
> For example, if the requested order is 6, and the minimum order is 5,
> drm_buddy_alloc_blocks() will first attempt to allocate an order-6 block.
> If none are available, it will try to allocate two order-5 blocks. If those
> are also unavailable, it will invoke force_merge() to merge lower order
> blocks (4, 3, 2, 1, 0) in order to coalesce into a higher-order block of
> order 5.

Yeah and exactly that is what should never be necessary in the first place.

The idea of a buddy allocator is that blocks are merged when they are freed
and not on demand.

The only use case for the force_merge() I can see is when cleared blocks are
merged with non-cleared ones, but that is orthogonal to the discussion here.

>
> In drm_buddy_fini(), force_merge() is called to ensure all blocks are merged
> before tearing down the allocator. This guarantees that all mm->roots are
> freed and not held by the driver at shutdown. If any blocks remain
> allocated, drm_buddy_fini() will issue a warning.
>
> In drm_buddy_reset_clear(), which is invoked at device suspend/resume, it is
> an ideal place to call force_merge(). This ensures that all possible blocks
> are merged before resetting the clear state, thereby reducing fragmentation
> and improving allocation efficiency after resume.

That's where rbtree_postorder_for_each_entry_safe() should be used.

> I tried using this rbtree_postorder_for_each_entry_safe() macro in the
> force_merge() and it works, but we also need a reverse variant
> since in normal allocation we don't want to disturb the lower addresses.

I don't get what you mean here.

Regards,
Christian.

>
> Thanks,
> Arun.
>>
>>> That said; given 'next' will remain an existing node, only the 'pos'
>>> node gets removed, rb_next() will still work correctly, even in the face
>>> of rebalance.
>> Good to know!
>>
>> Regards,
>> Christian.
>
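
To illustrate the point that a buddy allocator merges blocks when they are
freed and not on demand, here is a minimal userspace sketch. It is not the
drm_buddy code: the free_map bitmap, MAX_ORDER and the buddy_*() helpers are
all made up for this example, and it ignores the cleared/non-cleared state
entirely. It only shows that if every free walks up the orders and absorbs
its buddy, no separate merge pass should be needed afterwards.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ORDER 4                  /* hypothetical: 16 minimum-size chunks */
#define NCHUNKS   (1u << MAX_ORDER)

/* free_map[order][i] is true when block i of that order is free. */
static bool free_map[MAX_ORDER + 1][NCHUNKS];

static void buddy_init(void)
{
	for (unsigned o = 0; o <= MAX_ORDER; o++)
		for (unsigned i = 0; i < NCHUNKS; i++)
			free_map[o][i] = false;
	free_map[MAX_ORDER][0] = true;   /* one free block covering everything */
}

/* Allocate a block of the given order; returns its chunk offset or -1. */
static int buddy_alloc(unsigned order)
{
	unsigned o, i;

	assert(order <= MAX_ORDER);
	/* Find the smallest free block that is large enough. */
	for (o = order; o <= MAX_ORDER; o++)
		for (i = 0; i < (NCHUNKS >> o); i++)
			if (free_map[o][i])
				goto found;
	return -1;
found:
	free_map[o][i] = false;
	/* Split down to the requested order, leaving each upper half free. */
	while (o > order) {
		o--;
		i <<= 1;
		free_map[o][i + 1] = true;
	}
	return (int)(i << order);
}

/* Free a block, merging with its buddy for as long as the buddy is free. */
static void buddy_free(unsigned offset, unsigned order)
{
	unsigned i = offset >> order;

	assert(order <= MAX_ORDER);
	while (order < MAX_ORDER && free_map[order][i ^ 1]) {
		free_map[order][i ^ 1] = false;  /* absorb the buddy */
		i >>= 1;                         /* index of the merged parent */
		order++;
	}
	free_map[order][i] = true;
}

/* Highest order that currently has a free block, or -1 if none. */
static int buddy_largest_free_order(void)
{
	for (int o = MAX_ORDER; o >= 0; o--)
		for (unsigned i = 0; i < (NCHUNKS >> o); i++)
			if (free_map[o][i])
				return o;
	return -1;
}
```

In this sketch, allocating two order-0 blocks splits the single top-level
block all the way down, and freeing both of them again restores the single
top-level block without any extra merge call.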