In this thread I’m going to explain some methods to optimize your maps and show you ways to create "fake effects" that imitate features used in
modern game engines but not supported by the Pteroengine II. I won't explain how to use 3ds Max to get these results; these are only suggestions.
1. Fight the tile effect
2. Occluders
3. Alpha Channels
4. World lighting (daytime)
5. DrawCalls
6. Sectors
7. "Dynamic" clouds
8. The power of lightmaps
9. "Glowing" lights
10. LODs
This thread is under construction. Because I have to work, I will update it in the near future. My English is not perfect, so I will correct any mistakes I find.
Unfortunately, textures in games always show an ugly tile effect unless you do something about it. The problem is that you can see the details repeating.
You can hide this to some extent by blending the texture with another texture that has less tiling. You can see the result in the following pictures:
In this example the texture shows only a small part of a diamond plate. If we tiled it as strongly as recommended, the tile effect would look ugly.
So we give only the diamond plate a strong tiling and blend it with another texture that tiles less, e.g. a rusty metal texture.
The same applies here. The concrete texture (left) would look ugly if we tiled it a lot, so we give it e.g. a tile amount of 1 and then blend a detail texture over it.
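As a rough sketch of the idea (this is not Pteroengine or 3ds Max code; the function names, the multiply blend and the tiny 2x2 greyscale textures are all made up for illustration), two layers can be sampled at different tile amounts and then combined:

```python
# Hypothetical sketch: textures are tiny greyscale grids with values 0..255.

def sample(texture, u, v, tiling):
    """Nearest-neighbour lookup with wrap-around; 'tiling' repeats the texture."""
    height = len(texture)
    width = len(texture[0])
    x = int(u * tiling * width) % width
    y = int(v * tiling * height) % height
    return texture[y][x]


def blend_pixel(base, detail, u, v, base_tiling=1, detail_tiling=8):
    """Multiply blend: mid-grey (128) in the detail map leaves the base unchanged."""
    b = sample(base, u, v, base_tiling)
    d = sample(detail, u, v, detail_tiling)
    return min(255, b * d // 128)


# Made-up 2x2 "concrete" base (tile amount 1) and a 2x2 detail pattern (tile amount 8).
concrete = [[100, 120], [110, 90]]
detail = [[128, 96], [160, 128]]

print(blend_pixel(concrete, detail, 0.0, 0.0))     # 100: detail is mid-grey here
print(blend_pixel(concrete, detail, 0.0625, 0.0))  # 75: same base texel, darker detail texel
```

Because the two layers repeat at different rates, neighbouring pixels that share the same base texel still end up with different values, which is what breaks up the visible pattern.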
Sometimes it is a good idea to use more than one texture of the same kind to avoid visibly repeating objects.
The following example uses a low-detail texture, three dirt textures and a graffiti texture as an overlay.
There are three small bunkers and we want to add a bit of variation, so we overlay them with different dirt textures to avoid boring repetition.
In this section I explain methods that increase performance. There are three ways to increase performance by blending / hiding objects.
One of these is the use of occluders. An occluder can have any shape, but it is recommended to use simple planes or walls. An occluder hides every object that is behind it, provided the right properties are set on that object: its occlusion property must be set to either particled or boxed.
If you choose boxed, a box fitting the object's size is built around it. If the whole box is behind the occluder, the object is not drawn. This kind of occlusion performs worse than particled, because the engine has to calculate a box instead of a simple face.
If you set the property to particled, only a single face that always faces the camera is created. It performs better and is faster to calculate, but it is not very precise. The face always has the same size, so when you see the object e.g. from the front, where its biggest side is, it is occluded well, but when you see it from a thin side, the occlusion is delayed.
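To make the difference between boxed and particled easier to picture, here is a minimal top-down (2D) sketch. It is only an illustration under my own assumptions, not how the engine actually implements occlusion: the camera sits at the origin looking along +z, the occluder is a wall segment at a fixed depth, and all function names are hypothetical.

```python
def hidden_behind_wall(points, wall_x_min, wall_x_max, wall_z):
    """True if every (x, z) point is covered by the wall, seen from a camera at the origin."""
    for x, z in points:
        if z <= wall_z:
            return False  # point is in front of (or on) the wall
        # Follow the line of sight from the camera to the point up to the wall's depth.
        x_on_wall = x * wall_z / z
        if not (wall_x_min <= x_on_wall <= wall_x_max):
            return False
    return True


def box_points(center, half_width, half_depth):
    """'boxed': the four corners of a bounding box fitted to the object."""
    cx, cz = center
    return [(cx - half_width, cz - half_depth), (cx + half_width, cz - half_depth),
            (cx - half_width, cz + half_depth), (cx + half_width, cz + half_depth)]


def face_points(center, half_size):
    """'particled': one fixed-size face across the view axis, here reduced to a segment."""
    cx, cz = center
    return [(cx - half_size, cz), (cx + half_size, cz)]


# A long building seen from its thin side: 1 unit wide, 6 units deep.
building = (0.0, 10.0)
print(hidden_behind_wall(box_points(building, 0.5, 3.0), -1.0, 1.0, 5.0))  # True
print(hidden_behind_wall(face_points(building, 3.0), -1.0, 1.0, 5.0))      # False
```

In this made-up case the fitted box is already fully covered by the wall, while the fixed-size face (sized for the building's long side) still sticks out, so the object keeps being drawn. That is the "delayed" occlusion described above.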
These objects (apartments) should have occluded parts:
In Pteromat you can set different alpha-channel modes for your material. Below I explain how they work and what their advantages and disadvantages are.
#0 - transparent, zbufwrite, sort
-alpha channel is fully used ( complete 8-bit gradient )
-requires texture alpha-channel
advantage(s):
- this is the easiest and best looking way to create transparent objects
- alphasource of this texture is easy to control
disadvantage(s):
- you can only use this kind when no materials set to this alpha-channel overlap or are layered over one another, whether close up or far away.
If this happens, all parts of the texture that are grey ( not completely black or white ) in the alpha-channel of the texture aren't rendered and you can see
the map background ( the sky-object set to background or the editor world [black] ).
- this kind has a comparatively big impact on performance. Don't use too many of them.
-alpha channel uses only completely black or white colour ( 1-bit gradient [white/black] )
-requires texture alpha-channel
-if the alpha-channel has 8-bit colours, the grey colours are interpreted as black or white, depending on their brightness.
-used for almost everything: vegetation, metal grates...
advantage(s):
-seems to have no errors with overlapping and similar cases
-1-bit alpha-channel of the texture is easy to control
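A minimal sketch of how grey values in an 8-bit alpha-channel could collapse to black or white. The cut-off of 128 is my assumption; I don't know the exact brightness threshold the engine uses, and the function name is made up.

```python
def to_one_bit_alpha(alpha_values, threshold=128):
    """Map 8-bit alpha values to 0 (black) or 255 (white) by brightness."""
    # threshold=128 is an assumed cut-off, not a documented engine value.
    return [255 if a >= threshold else 0 for a in alpha_values]


print(to_one_bit_alpha([0, 60, 127, 128, 200, 255]))
# → [0, 0, 0, 255, 255, 255]
```

Soft gradients in the alpha-channel are lost this way, so it makes sense to paint the alpha-channel in pure black and white from the start.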
-alpha channel is fully used ( complete 8-bit gradient )
-requires texture alpha-channel
advantage(s):
- this is the easiest and best looking way to create transparent objects
- alphasource of this texture is easy to control
disadvantage(s):
-only a few of these can overlap without any error. If a lot of them overlap, the same 'background' error as with #0 - transparent .... happens
-this method is special. At the moment I don't know where it could be used.
-this kind is the same as #0 - transparent ....., but the alpha-channel is only used if a material set to #3 - transparent, zbufwrite, nosort, 1-bit alpha ( the same alpha-channel )
or #1 - transparent, zbufwrite, sort, 1-bit alpha is behind it. Then only the material with one of these alpha-channels set is visible.
-requires texture alpha-channel
#4 - translucent - add with background, no_zbufwrite,sort
-this shader works ( for mapping ) like the 'multitexture-channel', but with the function reversed: the brighter the image, the more you can see. Black is not rendered, white is fully visible.
-this alpha-channel adds a fake emissive effect. The texture seems to glow.
-this kind is used for e.g. fire, light, godrays...
advantage(s):
-no alpha-channel is required
-almost no impact on the performance
disadvantage(s):
-only a few of these can overlap without any error. If a lot of them overlap, the same 'background' error as with #0 - transparent .... happens.
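The "add with background" behaviour can be sketched as plain additive blending. This is a generic illustration, not the engine's shader code; the function name and the colour values are made up.

```python
def add_blend(background, overlay):
    """Additive blend of two RGB tuples, clamped to 255 per channel."""
    return tuple(min(255, b + o) for b, o in zip(background, overlay))


wall = (90, 80, 70)
print(add_blend(wall, (0, 0, 0)))        # (90, 80, 70): black adds nothing
print(add_blend(wall, (255, 255, 255)))  # (255, 255, 255): white is fully visible
print(add_blend(wall, (120, 60, 0)))     # (210, 140, 70): an orange flame brightens the wall
```

Since the overlay can only make the background brighter, no alpha-channel is needed, and the result looks like the texture is emitting light.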
This is a screenshot of the map 'Halong Port'. As you can see, its colors are not very "realistic". Everything is light to dark grey and it looks a little boring.
The terrain always receives a blue tint from the sky. On most maps the real ambient light, meaning the sky, is not taken into account.
So the result doesn't look good.
Trying to minimize draw-calls is generally a CPU-side optimization.
Every GL/D3D function call has a cost. Draw functions have the highest cost, as they actually collect all the changed states, bound resources, validate everything, build native GPU commands, push those commands into a command-buffer, and possibly flush that buffer through to the GPU.
If you have too many draw-calls, you can end up in a situation where the CPU's milliseconds-per-frame value is actually higher than the GPU's value, which is ridiculous!
Mantle/Metal/GLNext/D3D12 exist to solve this problem, and reduce the CPU cost of draw-calls.
On the GPU side of things, the number of state-changes becomes an issue. The GPU always wants to work on large amounts of data at a time -- thousands of triangles, thousands of pixels, etc...
Ideally, the GPU will actually try to merge multiple successive draw-calls into a single "job"!
Certain state-changes cause the GPU to have to take a small break in-between draw calls to adjust to the new state. The details depend on the GPU -- on some it might be any state change, on others resource bindings might be free, etc... there's some general hints / rules of thumb about what tends to be expensive though...
If a draw-call contains a lot of data (e.g. thousands of pixels), then often these small pauses do not matter, because the GPU can perform the state adjustment in the background while it is still drawing the pixels from the previous draw-call.
However, it becomes a huge problem if your draw-calls do not contain much work. I had a project a few years ago where we had about 100 draw-calls that each drew only about 40 pixels. We had access to a vendor-specific profiling tool that showed us that each of those draw-calls was costing the same amount of time as one that would've drawn 400 pixels (10x more than they should!!), simply because we were changing states in between each draw. We developed the guideline (for that specific GPU) that every draw-call should cover at least 400 pixels in order to avoid the state-change penalty.
On newer GPUs, they can be preparing multiple draw-call's states at the same time, so these penalties only appear when you submit, say, 8 tiny draw calls with different states, in a row.
Still, it's always best practice to try and sort/group your geometry to reduce state-changes to keep the GPU happy... and as a result, you'll probably end up with fewer D3D/GL function calls on the CPU side, and possibly even fewer draw-calls for the CPU as well!
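A minimal sketch of the sorting idea, assuming each draw is described by a shader and a texture name. The draw list and all names are hypothetical, and a real renderer has more state to consider (and sorted transparent geometry may need a specific draw order, which this ignores).

```python
def count_state_changes(draws):
    """Count how often the (shader, texture) pair differs from the previous draw."""
    changes = 0
    previous = None
    for draw in draws:
        state = (draw["shader"], draw["texture"])
        if state != previous:
            changes += 1
            previous = state
    return changes


# Made-up submission order, as it might come out of a scene graph.
draws = [
    {"mesh": "crate_a", "shader": "lit", "texture": "wood"},
    {"mesh": "fence", "shader": "alpha_test", "texture": "grate"},
    {"mesh": "crate_b", "shader": "lit", "texture": "wood"},
    {"mesh": "bush", "shader": "alpha_test", "texture": "leaves"},
    {"mesh": "crate_c", "shader": "lit", "texture": "wood"},
    {"mesh": "gate", "shader": "alpha_test", "texture": "grate"},
]

# Group draws that share state by sorting on the state key.
sorted_draws = sorted(draws, key=lambda d: (d["shader"], d["texture"]))

print(count_state_changes(draws), count_state_changes(sorted_draws))  # 6 3
```

Once draws with identical state sit next to each other, they also become candidates for merging into a single larger draw-call.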
[edit]
One small detail that doesn't happen much in practice -- every command sent by the CPU (state change, draw, etc) must be processed by the GPU command processor (sometimes called a front-end). This bit of hardware decodes the commands and controls the GPU. Usually there's so much work for a GPU to do (e.g. one command might result in thousands or millions of pixels being drawn) that the speed of command processing doesn't matter. Usually if you're generating so many commands that you're bottlenecked by the CP, then you're already going to be bottlenecked by your CPU costs anyway!! However, apparently on the next-gen APIs (e.g. Mantle), the CPU side cost of draw-calls has become so cheap that it's possible for you to become bottlenecked by the GPU's CP. In that situation you'd want to follow the traditional advice of minimizing draw-calls again.
[edit #2]
The advice from about 5 years ago was that if you had 2500 draw-calls per frame, then you'd be able to run at 30Hz as long as all you did was render things.
i.e. 2500 draw-calls would take ~33ms of CPU time... which means you've got no time left over to run gameplay or physics or AI!
So back then, you'd usually aim for under 1000 draw-calls, so that you have time left over for the rest of the game, and can still hit 30Hz.
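The arithmetic behind those numbers, as a small sketch. The per-draw cost is simply derived from the "2500 draws fill a 30Hz frame" rule of thumb above; it is not a measurement, and the names are made up.

```python
FRAME_MS_30HZ = 1000.0 / 30.0            # about 33.3 ms per frame at 30Hz
COST_PER_DRAW_MS = FRAME_MS_30HZ / 2500  # implied CPU cost if 2500 draws fill the frame


def cpu_ms_left(draw_calls, frame_ms=FRAME_MS_30HZ, cost_ms=COST_PER_DRAW_MS):
    """CPU time per frame left for gameplay, physics and AI after submitting draws."""
    return frame_ms - draw_calls * cost_ms


print(f"{COST_PER_DRAW_MS * 1000:.1f} us per draw-call")    # 13.3 us per draw-call
print(f"{cpu_ms_left(1000):.1f} ms left with 1000 draws")   # 20.0 ms left with 1000 draws
```

So under that old rule of thumb, staying under 1000 draw-calls leaves roughly 20ms of each frame for everything that isn't rendering.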
At the moment, D3D11 is much faster than D3D9/GL2 were at that time, plus CPUs are faster, so you can do well above 1000 draws per frame now... but can't go crazy.
On D3D12/Mantle/GLNext and game consoles, it's possible to go as high as 10k or 100k draws per frame.
On mobile devices with GLES though, you're often told to try and stay under 100 draw-calls per frame!