On 22 January 2024 21:33:17 CET, Kwok Cheung Yeung <k...@codesourcery.com> wrote: >Hi > >There was a bug in the declare-target-indirect-2.c libgomp testcase (testing >indirect calls in offloaded target regions, spread over multiple >teams/threads) that due to an errant fallthrough in a switch statement >resulted in only one indirect function ever getting called: > >switch (i % 3) > { > case 0: fn_ptr[i] = &foo; // Missing break > case 1: fn_ptr[i] = &bar; // Missing break > case 2: fn_ptr[i] = &baz; > } > >However, when the missing break statements are added, the testcase fails with >an invalid memory access. Upon investigation, this is due to the use of a >splay-tree as the lookup structure for indirect addresses, as the splay-tree >moves frequently accessed elements closer to the root node and so needs >locking when used from multiple threads. However, this would end up partially >serialising all the threads and kill performance. I have switched the lookup >structure from a splay tree to a hashtab instead to avoid locking during >lookup. > >I have also tidied up the initialisation of the lookup table by calling it >only from the first thread of the first team, instead of redundantly calling >it from every thread and only having the first one reached do the >initialisation. This removes the need for locking during initialisation. > >Tested with offloading to NVPTX and GCN with a x86_64 host. Okay for master?
Can you please akso update the comments to talk about hashtab instead of splay? TIA > >Thanks > >Kwok