On Tue, Jan 12, 2016 at 02:46:52PM +0100, Martin Jambor wrote:
> diff --git a/libgomp/task.c b/libgomp/task.c
> index ab5df51..828c1fb 100644
> --- a/libgomp/task.c
> +++ b/libgomp/task.c
> @@ -584,8 +592,34 @@ GOMP_PLUGIN_target_task_completion (void *data)
>        gomp_mutex_unlock (&team->task_lock);
>      }
>    ttask->state = GOMP_TARGET_TASK_FINISHED;
> -  free (ttask->firstprivate_copies);
> -  gomp_target_task_completion (team, task);
> +
> +  if (ttask->devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)

First of all, I'm surprised you've changed GOMP_PLUGIN_target_task_completion
rather than gomp_target_task_completion.  The difference between those two is
that the latter is run not just from the async event, but also when
GOMP_PLUGIN_target_task_completion happens to run before the
gomp_mutex_lock (&team->task_lock); is acquired in the various spots that
precede child_task->kind = GOMP_TASK_ASYNC_RUNNING;.  The point is that if
the async completion happens too early for the thread spawning it to notice,
we want to complete it only when the spawning thread is ready for that.

But looking at GOMP_PLUGIN_target_task_completion, I see we have a bug in
there:

  gomp_mutex_lock (&team->task_lock);
  if (ttask->state == GOMP_TARGET_TASK_READY_TO_RUN)
    {
      ttask->state = GOMP_TARGET_TASK_FINISHED;
      gomp_mutex_unlock (&team->task_lock);
    }
  ttask->state = GOMP_TARGET_TASK_FINISHED;
  gomp_target_task_completion (team, task);
  gomp_mutex_unlock (&team->task_lock);

I think there was meant to be a return; after the first unlock; otherwise it
unlocks the same lock twice and performs gomp_target_task_completion without
the lock held, which may cause great havoc.  I'll test the obvious change
here.
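For reference, a minimal sketch of that obvious change, assuming the rest of
GOMP_PLUGIN_target_task_completion stays as it currently is; the only
modification is the added return; (plus a comment) in the early-completion
branch:

  gomp_mutex_lock (&team->task_lock);
  if (ttask->state == GOMP_TARGET_TASK_READY_TO_RUN)
    {
      /* The spawning thread has not acquired task_lock yet; just mark
         the task finished and let that thread perform the completion
         once it is ready.  */
      ttask->state = GOMP_TARGET_TASK_FINISHED;
      gomp_mutex_unlock (&team->task_lock);
      return;
    }
  ttask->state = GOMP_TARGET_TASK_FINISHED;
  gomp_target_task_completion (team, task);
  gomp_mutex_unlock (&team->task_lock);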
> +    {
> +      free (ttask->firstprivate_copies);
> +      size_t new_tasks
> +        = gomp_task_run_post_handle_depend (task, team);
> +      gomp_task_run_post_remove_parent (task);
> +      gomp_clear_parent (&task->children_queue);
> +      gomp_task_run_post_remove_taskgroup (task);
> +      team->task_count--;
> +      int do_wake = 0;
> +      if (new_tasks)
> +        {
> +          do_wake = team->nthreads - team->task_running_count
> +                    - !task->in_tied_task;
> +          if (do_wake > new_tasks)
> +            do_wake = new_tasks;
> +        }
> +      /* Unlike other places, the following will be also run with the
> +         task_lock held, but there is nothing to do about it.  See the
> +         comment in gomp_target_task_completion.  */
> +      gomp_finish_task (task);
> +      free (task);
> +      gomp_team_barrier_set_task_pending (&team->barrier);

This one really looks weird.  Setting the pending flag should be done only
if we increase the number of the team's tasks, and
gomp_task_run_post_handle_depend should do that itself when it adds new
tasks (IMHO it does); but if new_tasks is 0, then there is no new task to
schedule and therefore it should not be set.

> +      gomp_team_barrier_wake (&team->barrier, do_wake ? do_wake : 1);
> +    }
> +  else
> +    gomp_target_task_completion (team, task);
>    gomp_mutex_unlock (&team->task_lock);
>  }
> 
> @@ -1275,7 +1309,12 @@ gomp_barrier_handle_tasks (gomp_barrier_state_t state)
>           thr->task = task;
>         }
>        else
> -        return;
> +        {
> +          if (team->task_count == 0
> +              && gomp_team_barrier_waiting_for_tasks (&team->barrier))
> +            gomp_team_barrier_done (&team->barrier, state);
> +          return;
> +        }
>        gomp_mutex_lock (&team->task_lock);
>        if (child_task)
>         {

And this hunk looks wrong too.  gomp_team_barrier_done should not be called
without the lock held, there is no waking, and I don't understand the
rationale for why you think the current gomp_barrier_handle_tasks is wrong.

Anyway, if you make the HSA branch work the less efficient way, by creating
a task that just frees the firstprivate copies, and post a WIP patch that
includes this after the merge into trunk, plus if there are clear
instructions on the wiki for how to build the HSA stuff, then my son has a
box with an AMD Kaveri, so I'll try to debug it there.

        Jakub