Vietcong.Info - Discussion Forum: Advanced Mapping Vietcong.Info

Trying to minimize draw-calls is generally a CPU-side optimization.

Every GL/D3D function call has a cost. Draw functions have the highest cost, as they actually collect all the changed states, bound resources, validate everything, build native GPU commands, push those commands into a command-buffer, and possibly flush that buffer through to the GPU.

If you have too many draw-calls, you can end up in a situation where the CPU's milliseconds-per-frame value is actually higher than the GPU's value, which is ridiculous!

Mantle/Metal/GLNext/D3D12 exist to solve this problem, and reduce the CPU cost of draw-calls.

On the GPU side of things, the number of state-changes becomes an issue. The GPU always wants to work on large amounts of data at a time -- thousands of triangles, thousands of pixels, etc...

Ideally, the GPU will actually try to merge multiple successive draw-calls into a single "job"!

Certain state-changes cause the GPU to have to take a small break in-between draw calls to adjust to the new state. The details depend on the GPU -- on some it might be any state change, on others resource bindings might be free, etc... there are some general hints / rules of thumb about what tends to be expensive though...

If a draw-call contains a lot of data (e.g. thousands of pixels), then often these small pauses do not matter, because the GPU can perform the state adjustment in the background while it is still drawing the pixels from the previous draw-call.

However, it becomes a huge problem if your draw-calls do not contain much work. I had a project a few years ago where we had about 100 draw-calls that each drew only about 40 pixels. We had access to a vendor-specific profiling tool that showed us that each of those draw-calls was costing the same amount of time as one that would've drawn 400 pixels (10x more than they should!!), simply because we were changing states in between each draw. We developed the guideline (for that specific GPU) that every draw-call should cover at least 400 pixels in order to avoid the state-change penalty.
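
To put rough numbers on that anecdote, here's a toy cost model. It assumes (and this is just my simplification, not how any real GPU accounts for time) that a draw following a state change can never cost less than about 400 pixels' worth of work. The names `MIN_EFFECTIVE_PIXELS`, `effective_cost` and `frame_cost` are made up for illustration.

```python
# Toy model of the state-change penalty described above.
# MIN_EFFECTIVE_PIXELS is the ~400 pixel floor we measured on that one
# specific GPU; it is an assumption here, not a universal constant.
MIN_EFFECTIVE_PIXELS = 400


def effective_cost(pixels, state_changed):
    """Cost of one draw in 'pixel-equivalents'.

    Assumption: a draw that follows a state change is billed for at
    least MIN_EFFECTIVE_PIXELS, no matter how few pixels it covers.
    """
    if state_changed:
        return max(pixels, MIN_EFFECTIVE_PIXELS)
    return pixels


def frame_cost(draws):
    """draws: list of (pixels, state_changed) tuples."""
    return sum(effective_cost(p, changed) for p, changed in draws)


# 100 tiny draws of 40 pixels each, with a state change before every one.
tiny_draws = [(40, True)] * 100
actual_pixels = sum(p for p, _ in tiny_draws)  # pixels really drawn
paid_for = frame_cost(tiny_draws)              # pixel-equivalents billed

print(actual_pixels, paid_for)
```

Under this model the 100 tiny draws are billed as if they covered ten times the pixels they really do, which matches what the profiler showed us. A big draw (say 1000 pixels) pays no extra, because it's already above the floor.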

Newer GPUs can be preparing multiple draw-calls' states at the same time, so these penalties only appear when you submit, say, 8 tiny draw calls with different states, in a row.

Still, it's always best practice to try and sort/group your geometry to reduce state-changes to keep the GPU happy... and as a result, you'll probably end up with fewer D3D/GL function calls on the CPU side, and possibly even fewer draw-calls for the CPU as well!
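
A minimal sketch of the sort/group idea, with no real graphics API involved. `DrawItem`, `count_state_changes` and `sort_by_state` are hypothetical names, and treating "shader" and "texture" as the only two pieces of state is an assumption made to keep it short; a real renderer tracks a lot more.

```python
from collections import namedtuple

# Hypothetical draw record: two pieces of state plus the mesh to draw.
DrawItem = namedtuple("DrawItem", ["shader", "texture", "mesh"])


def count_state_changes(items):
    """Count how many times the shader or texture binding would change
    if the items were submitted in the given order."""
    changes = 0
    prev_shader = prev_texture = None
    for item in items:
        if item.shader != prev_shader:
            changes += 1
            prev_shader = item.shader
        if item.texture != prev_texture:
            changes += 1
            prev_texture = item.texture
    return changes


def sort_by_state(items):
    """Group items so draws sharing a shader, then a texture, are adjacent."""
    return sorted(items, key=lambda it: (it.shader, it.texture))


scene = [
    DrawItem("lit", "brick", "wall_a"),
    DrawItem("unlit", "sky", "skybox"),
    DrawItem("lit", "wood", "crate"),
    DrawItem("lit", "brick", "wall_b"),
    DrawItem("unlit", "sky", "clouds"),
    DrawItem("lit", "wood", "table"),
]

unsorted_changes = count_state_changes(scene)
sorted_changes = count_state_changes(sort_by_state(scene))
print(unsorted_changes, sorted_changes)
```

The sort key puts the shader first on the assumption that a shader switch is the pricier change; as mentioned above, which state is actually expensive depends on the GPU, so the key order is something you'd tune per platform.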

[edit]

One small detail that rarely matters in practice -- every command sent by the CPU (state change, draw, etc) must be processed by the GPU command processor (CP, sometimes called a front-end). This bit of hardware decodes the commands and controls the GPU. Usually there's so much work for a GPU to do (e.g. one command might result in thousands or millions of pixels being drawn) that the speed of command processing doesn't matter. Usually if you're generating so many commands that you're bottlenecked by the CP, then you're already going to be bottlenecked by your CPU costs anyway!! However, apparently on the next-gen APIs (e.g. Mantle), the CPU side cost of draw-calls has become so cheap that it's possible for you to become bottlenecked by the GPU's CP. In that situation you'd want to follow the traditional advice of minimizing draw-calls again :D

[edit #2]

The advice from about 5 years ago was that if you had 2500 draw-calls per frame, then you'd be able to run at 30Hz as long as all you did was render things.

i.e. 2500 draw-calls would take ~33ms of CPU time... which means you've got no time left over to run gameplay or physics or AI!

So back then, you'd usually aim for under 1000 draw-calls, so that you have time left over for the rest of the game, and can still hit 30Hz.
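
The arithmetic behind those old numbers, as a quick sketch. It assumes CPU cost scales linearly with draw count and that 2500 draws exactly fill a 30Hz frame; both are simplifications of the rule of thumb, not measurements.

```python
FRAME_BUDGET_MS = 1000.0 / 30.0  # ~33.3 ms per frame at 30Hz

# Assumption: 2500 draws eat the whole frame, and cost is linear in
# the number of draws.
COST_PER_DRAW_MS = FRAME_BUDGET_MS / 2500


def cpu_time_left_ms(draw_calls):
    """CPU milliseconds left for gameplay/physics/AI after submitting draws."""
    return FRAME_BUDGET_MS - draw_calls * COST_PER_DRAW_MS


per_draw_us = COST_PER_DRAW_MS * 1000  # per-draw cost in microseconds
left_at_1000 = cpu_time_left_ms(1000)  # budget left at 1000 draws
left_at_2500 = cpu_time_left_ms(2500)  # budget left at 2500 draws

print(round(per_draw_us, 1), round(left_at_1000, 1), round(left_at_2500, 1))
```

So the rule of thumb works out to roughly 13 microseconds of CPU time per draw, and staying at 1000 draws leaves about 20ms of the frame for everything else.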

At the moment, D3D11 is much faster than D3D9/GL2 were at that time, plus CPUs are faster, so you can do well above 1000 draws per frame now... but you still can't go crazy.

On D3D12/Mantle/GLNext and game consoles, it's possible to go as high as 10k or 100k draws per frame.

On mobile devices with GLES though, you're often told to try and stay under 100 draw-calls per frame!