On 16/03/2017 21:02, Xu, Anthony wrote:
>>> memory_region_finalize.
>>> Let me know if you think otherwise.
>>
>> Yes, you can replace memory_region_del_subregion in
>> memory_region_finalize
>> with special code that does
>>
>> assert(!mr->enabled);
>> assert(subregion->container == mr);
>> subregion->container = NULL;
>> QTAILQ_REMOVE(&mr->subregions, subregion, subregions_link);
>> memory_region_unref(subregion);
>>
>> The last four lines are shared with memory_region_del_subregion, so please
>> factor them in a new function, for example
>> memory_region_del_subregion_internal.
>
> After adding synchronize_rcu, I saw an infinite recursive call,
> mem_commit-> memory_region_finalize-> mem_commit->
> memory_region_finalize-> ......
> it caused a segment fault, because 8M stack space is used up, and found when
> memory_region_finalize is called, memory_region_transaction_depth is 0 and
> memory_region_update_pending is true. That's not normal!
Ok, this is a bug. This would fix it:
> Please review below patch
>
> diff --git a/memory.c b/memory.c
> index 64b0a60..4c95aaf 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -906,12 +906,6 @@ void memory_region_transaction_begin(void)
> ++memory_region_transaction_depth;
> }
>
> -static void memory_region_clear_pending(void)
> -{
> - memory_region_update_pending = false;
> - ioeventfd_update_pending = false;
> -}
> -
> void memory_region_transaction_commit(void)
> {
> AddressSpace *as;
> @@ -927,14 +921,14 @@ void memory_region_transaction_commit(void)
> QTAILQ_FOREACH(as, &address_spaces, address_spaces_link) {
> address_space_update_topology(as);
> }
> -
> + memory_region_update_pending = false;
> MEMORY_LISTENER_CALL_GLOBAL(commit, Forward);
> } else if (ioeventfd_update_pending) {
> QTAILQ_FOREACH(as, &address_spaces, address_spaces_link) {
> address_space_update_ioeventfds(as);
> }
> + ioeventfd_update_pending = false;
> }
> - memory_region_clear_pending();
> }
> }
So please send it to the list with Signed-off-by line.
> The thing is, seems both address_space_translate and
> address_space_dispatch_free
> are called under the global lock. When synchronize_rcu is called, no other
> threads
> are in RCU critical section.
No, not necessarily. address_space_write for example is called outside
the global lock by KVM and it calls address_space_translate.
> Seems RCU is not that useful for address space.
I suggest that you study the code more closely... there is this in
kvm-all.c:
DPRINTF("handle_mmio\n");
/* Called outside BQL */
address_space_rw(&address_space_memory,
run->mmio.phys_addr, attrs,
run->mmio.data,
run->mmio.len,
run->mmio.is_write);
and adding a simple assert() would have quickly disproved your theory.
> After adding synchronize_rcu, we noticed a RCU dead loop. synchronize_rcu is
> called
> inside RCU critical section. It happened when guest OS programmed the PCI bar.
> The call trace is like,
> address_space_write-> pci_host_config_write_common ->
> memory_region_transaction_commit ->mem_commit-> synchronize_rcu
> pci_host_config_write_common is called inside RCU critical section.
>
> The address_space_write change fixed this issue.
It's not a fix if the code is not thread-safe anymore! But I think you
have the answer now as to why you cannot use synchronize_rcu.
Paolo