Cesar Philippidis
2014-10-03 16:22:52 UTC
There is a reduction bug exposed in the following parallel block.
  #pragma acc parallel copy(b[0:3][0:3]) copy(l)
  {
  #pragma acc loop collapse(2) reduction(+:l)
    for (int i = 0; i < 2; i++)
      for (int j = 0; j < 2; j++)
        if (b[i][j] != 16)
          l += 1;
  }
Because i and j are declared locally in the for statements, each gets its
own bind, and the collapsed loop is lowered into something like the
following:
  #pragma acc parallel ...
  {
    int i
    {
      int j
      {
        #pragma acc loop ...
This is a problem because initialize_reduction_data originally expected
a GIMPLE_BIND at the very beginning of a parallel block. I also made the
assumption that collapse would create a single bind.
Looking at this some more, I may need to revise initialize_reduction_data
to scan for multiple acc loops within a parallel block. E.g.,
  #pragma acc parallel
  {
  #pragma acc loop reduction (+:foo)
    {
    }
    ...
  #pragma acc loop reduction (-:bar)
    {
    }
  }
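To make the idea of scanning concrete, here is a minimal sketch of a walk
that descends through any number of nested binds and finds every acc loop
at any depth. It is a toy model only: struct toy_stmt, enum toy_kind and
toy_count_acc_loops are hypothetical names invented for illustration and
are not GCC's GIMPLE data structures or API.

```c
#include <stddef.h>

/* Hypothetical statement kinds for illustration; NOT GCC's GIMPLE codes.  */
enum toy_kind { TOY_BIND, TOY_ACC_LOOP, TOY_OTHER };

/* Hypothetical statement node: a sibling chain with an optional nested body.  */
struct toy_stmt
{
  enum toy_kind kind;
  struct toy_stmt *body;  /* First statement nested inside, or NULL.  */
  struct toy_stmt *next;  /* Next sibling statement, or NULL.  */
};

/* Count the acc loops reachable from SEQ.  Binds are entered recursively,
   so a loop wrapped in several binds is still found.  Loop bodies are not
   entered, so loops nested inside another loop are not counted.  */
int
toy_count_acc_loops (const struct toy_stmt *seq)
{
  int n = 0;

  for (; seq != NULL; seq = seq->next)
    {
      if (seq->kind == TOY_ACC_LOOP)
        n++;
      else if (seq->kind == TOY_BIND)
        n += toy_count_acc_loops (seq->body);
    }

  return n;
}
```

The point of the sketch is only that the walk makes no assumption about
where the first bind sits or how many binds wrap a loop.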
I'll address this issue in a follow-up patch.
This patch also includes a runtime test case. I won't apply the test case
to gomp-4_0-branch just yet, but I wanted to demonstrate one nonetheless.
Note that part of this patch also changes a comment: I found some typos in
the original comment, so I took the opportunity to fix them, I hope.
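For reference, the parallel block above can be wrapped into a
self-contained function along the following lines. This is a sketch for
illustration, not the test case from the patch; the function name
count_not_16 is made up here. When compiled without OpenACC support the
pragmas are ignored and the loop runs sequentially on the host, which
gives the value the reduction is expected to produce.

```c
/* Count the entries in the top-left 2x2 corner of B that differ from 16.
   The name count_not_16 is hypothetical; the pragmas and loop nest are
   the ones from the parallel block quoted in this message.  */
int
count_not_16 (int b[3][3])
{
  int l = 0;

#pragma acc parallel copy(b[0:3][0:3]) copy(l)
  {
#pragma acc loop collapse(2) reduction(+:l)
    for (int i = 0; i < 2; i++)
      for (int j = 0; j < 2; j++)
        if (b[i][j] != 16)
          l += 1;
  }

  return l;
}
```

A runtime check would call this with a known array and compare the result
against a count computed by a plain sequential loop.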
Is this OK for gomp-4_0-branch?
Thanks,
Cesar