Hello forum,
I am trying to understand the meaning of the warp divergence rate metric in Streamline for the G-715 GPU.
Using the test case below, I was expecting the divergence rate metric to be around 50%, assuming the warp size on the GPU is 16, the local size is set to 16, and only 8 threads execute either the if or the else block. But Streamline shows a warp divergence of 98%, a full warp rate of 100%, and a number of fragment warps of 2. Any insight into why the divergence is close to 100% would be appreciated. (I am launching the test with a global size of X=16, Y=1, Z=1.)
#version 320 es

layout(std430, binding = 0) buffer OutputBuffer { float data[]; };
layout(local_size_x = 16, local_size_y = 1, local_size_z = 1) in;

void main() {
    uint threadId = gl_LocalInvocationID.x;
    if (threadId > 8u) {
        for (int i = 0; i < 64; i++) {
            data[threadId] = data[threadId] + float(threadId) * 2.0;
        }
    } else {
        for (int i = 0; i < 64; i++) {
            data[threadId] = data[threadId] + float(threadId) * 5.0;
        }
    }
}
The warp size is 16 wide.
The if branch is taken by only part of the warp, so it is divergent.
The else branch is taken by the rest of the warp, so it is also divergent.
The if and the else each contain a lot of instructions because of the 64-iteration loop, so the divergent code dominates the small amount of initial 16-wide non-divergent code that tests the thread ID. 98% seems right to me.
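To put rough numbers on that (the instruction counts here are purely illustrative assumptions, not measured values): suppose the thread-ID test and branch compile to about 6 instruction issues per warp, and each loop iteration to about 4 issues. Both loop bodies are issued divergently, so:

    divergent issues ≈ 2 paths × 64 iterations × ~4 issues ≈ 512
    uniform issues   ≈ ~6 (thread-ID test and branch)
    divergence rate  ≈ 512 / (512 + 6) ≈ 99%

which is in the same ballpark as the 98% Streamline reports.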
*EDIT* Note that the divergence counter simply counts the number of instruction issues that have any level of divergence. It does not measure the amount of divergence within each instruction issue.
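As an illustration of that last point (a sketch I have not profiled on a G-715, so treat the expected numbers as an assumption): if you hoist the loop out of the branch so that only the multiplier selection can diverge, almost every instruction issue executes with the full warp converged, and I would expect the divergence rate to drop close to 0% for the same amount of per-lane work:

#version 320 es

layout(std430, binding = 0) buffer OutputBuffer { float data[]; };
layout(local_size_x = 16, local_size_y = 1, local_size_z = 1) in;

void main() {
    uint threadId = gl_LocalInvocationID.x;
    // Pick the per-branch multiplier in a tiny (possibly branchless) region...
    float scale = (threadId > 8u) ? 2.0 : 5.0;
    // ...then run the expensive 64-iteration loop with all lanes converged,
    // so the bulk of the instruction issues are non-divergent.
    for (int i = 0; i < 64; i++) {
        data[threadId] = data[threadId] + float(threadId) * scale;
    }
}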