On 10.09.25 14:37, Arunpravin Paneer Selvam wrote:
> Hi Christian,
> 
> On 9/9/2025 9:55 PM, Christian König wrote:
>> On 09.09.25 16:05, Peter Zijlstra wrote:
>>> On Tue, Sep 09, 2025 at 02:04:30PM +0200, Christian König wrote:
>>>> Hi Arun,
>>>>
>>>> On 09.09.25 11:56, Arunpravin Paneer Selvam wrote:
>>>> [SNIP]
>>>>
>>>>> +/**
>>>>> + * rbtree_for_each_entry_safe - iterate in-order over rb_root safe 
>>>>> against removal
>>>>> + *
>>>>> + * @pos:    the 'type *' to use as a loop cursor
>>>>> + * @n:        another 'type *' to use as temporary storage
>>>>> + * @root:    'rb_root *' of the rbtree
>>>>> + * @member:    the name of the rb_node field within 'type'
>>>>> + */
>>>>> +#define rbtree_for_each_entry_safe(pos, n, root, member) \
>>>>> +    for ((pos) = rb_entry_safe(rb_first(root), typeof(*(pos)), member), \
>>>>> +         (n) = (pos) ? rb_entry_safe(rb_next(&(pos)->member), 
>>>>> typeof(*(pos)), member) : NULL; \
>>>>> +         (pos); \
>>>>> +         (pos) = (n), \
>>>>> +         (n) = (pos) ? rb_entry_safe(rb_next(&(pos)->member), 
>>>>> typeof(*(pos)), member) : NULL)
>>>> As far as I know exactly that operation does not work on an R/B tree.
>>>>
>>>> See the _safe() variants of the for_each_ macros are usually used to 
>>>> iterate over a container while being able to remove entries.
>>>>
>>>> But because of the potential re-balance storing just the next entry is not 
>>>> sufficient for an R/B tree to do that as far as I know.
>>>>
>>>> Please explain how exactly you want to use this macro.
> Thanks for the pointer, yes, this will not work on RB tree. We need a reverse 
> safe variant for use in the force_merge() function similar to the
> list_for_each_entry_safe_reverse() macro in list.h. The reason is that in 
> force_merge(), we remove the block from the free tree before invoking
> drm_buddy_free(), which merges and frees buddy blocks to form a larger block.
>>> So I don't much like these iterators; I've said so before. Either we
>>> should introduce a properly threaded rb-tree (where the NULL child
>>> pointers encode a linked list), or simply keep a list_head next to the
>>> rb_node and use that.
>> I agree, something is clearly fishy here.
>>
>>> The rb_{next,prev}() things are O(ln n), in the worst case they do a
>>> full traversal up the tree and a full traversal down the other branch.
>> Yeah from the logic that is exactly what is supposed to happen in the 
>> __force_merge() function.
>>
>> The question is rather why does that function exists in the first place? The 
>> operation doesn't look logical to me.
>>
>> For drm_buddy_reset_clear() and drm_buddy_fini() we should use 
>> rbtree_postorder_for_each_entry_safe() instead.
>>
>> And during normal allocation __force_merge() should never be used.
> In normal allocation, the force_merge() function is used when no free blocks 
> of the requested order are available. In such cases,
> smaller blocks must be merged on demand to satisfy the allocation. Mainly, 
> this does not involve traversing the entire tree to
> merge all blocks, but only merging as needed. For example, if the requested 
> order is 6, and the minimum order is 5, drm_buddy_alloc_blocks()
> will first attempt to allocate an order-6 block. If none are available, it 
> will try to allocate two order-5 blocks. If those are also unavailable, it 
> will
> invoke force_merge() to merge lower order blocks (4, 3, 2, 1, 0) in order to 
> coalesce into a higher-order block of order 5.

Yeah and exactly that is what should never be necessary in the first place.

The idea of a buddy allocator is that blocks are merged when they are freed and 
not on demand.

The only use case for the force_merge() I can see is when cleared blocks are 
merged with non-cleared ones, but that is orthogonal to the discussion here.

> 
> In drm_buddy_fini(), force_merge() is called to ensure all blocks are merged 
> before tearing down the allocator. This guarantees that all
> mm->roots are freed and not held by the driver at shutdown. If any blocks 
> remain allocated, drm_buddy_fini() will issue a warning.
> 
> In drm_buddy_reset_clear(), which is invoked at device suspend/resume, it is 
> an ideal place to call force_merge(). This ensures that all
> possible blocks are merged before resetting the clear state, thereby reducing 
> fragmentation and improving allocation efficiency after resume.

That's where rbtree_postorder_for_each_entry_safe() should be used.

> I tried using this rbtree_postorder_for_each_entry_safe() macro in the 
> force_merge() and it works, but we also a need a reverse variant
> since in normal allocation we dont want to disturb the lower addresses.

I don't get what you mean here.

Regards,
Christian.

> 
> Thanks,
> Arun.
>>
>>> That said; given 'next' will remain an existing node, only the 'pos'
>>> node gets removed, rb_next() will still work correctly, even in the face
>>> of rebalance.
>> Good to know!
>>
>> Regards,
>> Christian.
> 

Reply via email to