Yes, I know that The King powered Chessmaster. I never worked for a big game company, so I can't say. Regarding the third world: it happened in the company I work for too; some people were fired because a significant amount of work was offloaded to a subsidiary in India, since it's much cheaper. So I guess this is quite common.

diep wrote: Oh, you could ask Johan de Koning, didn't he work a few years for Ubisoft?
You realize they first have you sign 20 secrecy contracts before you can even deliver for $100 something? In some cases they just talk, talk, talk and still don't give YOU the contract job, but throw it to the cheapest third-world nation.
Ok Vincent, even if we neglect the fact that the GPU can't do everything, you still have to put the objects somewhere. So assume the vertex buffers are stored/cached in GPU RAM and reused; drawing an object then reduces to one or more OGL/DX calls (oversimplified). The assets created by artists simply have to be stored somewhere. The best optimization is of course not to draw objects you can't see, and that's not easy at all. Sure, the GPU can lend a hand because it can do occlusion queries, which, if I understand properly, means representing an object with a few triangles and letting the GPU probe the z-buffer. Of course you can't wait for the query to complete, so while it probes you do other stuff. It's tricky, but it can be done by exploiting spatial and temporal coherence between consecutive frames.

diep wrote: However if you analyze the benchmark results of the big games yourself, you will see that the MAJORITY of games doesn't scale with the number of cores times GHz.
So that's the simplest form of proof you can find. And very conclusive evidence.
They are totally dependent upon the bandwidth to the GPU's RAM and the bandwidth to the CPU's RAM.
You can easily verify this yourself.
Also, if you open a book on graphics algorithms you'll figure out that it doesn't need to be like that. In the end it's not about steering a trillion pixels; it's steering just a few displays with objects inside.
If you draw things in a simple manner, then you obviously need huge bandwidth. If you do it more cleverly, it'll work great on a GPU from 10 years ago.
Realize however that they focus upon supporting all features in a manner that works for you. Optimization is simply not the focus, unlike in game tree search.
The luxury the game industry has when producing games is that those GPUs effectively deliver several teraflops each.
However, if you use a few displays of 2560 x 1600 or something like that, you'll realize that it could effectively be done with just a bunch of gigaflops.
So the rest is inefficiency.
You can also help determine visibility on the CPU side; frustum culling is the simplest way. For indoor games, it's beneficial to use a very low-poly approximation of the solid structure of the world and portalize it. You can either precompute visibility (which has insane complexity and works well only if the number of portals is small), or you can simply render and clip through the visible portals.
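Frustum culling really is cheap to sketch. Here's a minimal Python illustration (the plane values and numbers are made up, not from any real engine): the frustum is six inward-facing planes, and a bounding sphere is culled if it lies completely behind any one of them.

```python
# Minimal frustum culling sketch: a frustum as inward-facing planes
# (nx, ny, nz, d); a bounding sphere survives unless it is completely
# behind some plane. A real engine extracts the planes from the
# view-projection matrix; here we just hand them in directly.
def sphere_in_frustum(planes, center, radius):
    cx, cy, cz = center
    for nx, ny, nz, d in planes:
        # Signed distance of the sphere center to the plane.
        if nx * cx + ny * cy + nz * cz + d < -radius:
            return False  # entirely outside this plane -> cull
    return True  # intersects or is inside the frustum
```

For the test you can use a cube-shaped "frustum" covering [-1, 1]^3; the math is identical for a real perspective frustum, only the plane equations differ.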
However, for outdoor games the best thing you can do to help visibility on the CPU side is to use occluders. This doesn't help that much though, because filtering through an occluder is costly, and if you have two occluders which overlap from the camera's point of view, it's costly to merge them properly. So the trick is to combine/balance approaches.
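As for the occlusion queries mentioned earlier, the "don't wait for the result" trick boils down to using last frame's answer to decide this frame's draw. A toy Python sketch (the class and names are mine; a real renderer would use e.g. glBeginQuery/glGetQueryObjectuiv and draw the bounding box, not a dict of counts):

```python
# Sketch of asynchronous occlusion queries with temporal coherence:
# issue a query for each object this frame, but decide whether to draw
# it from LAST frame's result, so we never stall waiting on the GPU.
class OcclusionCuller:
    def __init__(self):
        self.last_result = {}  # object id -> visible pixel count last frame

    def should_draw(self, obj_id):
        # No history yet means "assume visible" (default 1).
        return self.last_result.get(obj_id, 1) > 0

    def frame(self, query_results):
        # query_results: this frame's (delayed) pixel counts per object.
        drawn = [o for o in query_results if self.should_draw(o)]
        self.last_result.update(query_results)  # available next frame
        return drawn
```

The one-frame lag is exactly the coherence assumption: an object hidden last frame is probably still hidden this frame.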
Another problem when rendering is fillrate (overdraw). Just render a bunch of translucent fullscreen quads at high resolution and tell me how many you need to get down to 10 fps, no matter how fast the GPU is.
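You can put rough numbers on that. A back-of-the-envelope sketch (the 10 Gpixel/s fillrate is an assumed figure, not any specific GPU):

```python
# How many translucent fullscreen quads fit in one frame's fill budget?
# Blending forces a read-modify-write on every covered pixel, so the
# z-test rejects nothing and each quad costs the full screen.
def max_fullscreen_quads(fillrate_pixels_per_s, width, height, fps):
    pixels_per_quad = width * height
    budget_per_frame = fillrate_pixels_per_s / fps
    return int(budget_per_frame // pixels_per_quad)
```

At an assumed 10 Gpixel/s and 1920x1080, a few hundred fullscreen layers already pin you at 10 fps, regardless of how much shader throughput the GPU has.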
Shadow volumes also waste fillrate, but AFAIK they use shadow maps today anyway. Still, the latter requires rendering (parts of) the scene to a render-target texture and then doing projective texturing on the world geometry.
So even if you use deferred shading, so that shaders are applied to each non-translucent pixel only once (and you have to do antialiasing manually in that case), you still don't render each pixel only once.
And there are lots of particle effects used in today's games, which means rendering lots of translucent objects. Sparks are no problem, but consider smoke: if you get close, almost every pixel has to be rendered multiple times. There's not much to do about it, except particle trimming, which means approximating the texture with a 2D convex hull. This can save you a lot of fillrate, but it depends on the shape of the smoke texture. You don't want to render fullscreen quads in that case.
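The saving from trimming is easy to quantify: compare the area of the trimmed convex polygon against the area of the full quad. A small sketch (the diamond-shaped hull below is just an example shape, not a real texture):

```python
# Particle trimming payoff: rasterized area of the trimmed convex hull
# versus the original quad. Polygon area via the shoelace formula.
def polygon_area(pts):
    a = 0.0
    n = len(pts)
    for i in range(n):
        x0, y0 = pts[i]
        x1, y1 = pts[(i + 1) % n]
        a += x0 * y1 - x1 * y0
    return abs(a) / 2.0

def fill_saving(quad_w, quad_h, hull):
    # Fraction of the quad's pixels the trimmed polygon avoids touching.
    return 1.0 - polygon_area(hull) / (quad_w * quad_h)
```

A diamond inscribed in a unit quad already halves the fill cost; real smoke textures with tight hulls often do better.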
So it might be worth doing a z-fill pass, because modern GPUs probably use a hierarchical z-buffer, which means they can reject some triangles without having to rasterize them and probe each z-buffer value. But I'm not sure about this one.
Another thing to keep in mind is to use as few state switches as possible (changing textures and so on).
Textures are another thing. I believe today's games can have textures larger than the framebuffer. So you definitely want to use texture atlases and pack smaller textures into bigger ones if possible. Also, to reduce bandwidth, texture compression is applied. Even simple S3TC (DXT) can have good quality at only 4 bits per texel (or more if an alpha channel is needed); compared to 24-bit uncompressed RGB, that in itself is a huge saving and allows for larger textures. Of course this is lossy compression.
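The arithmetic behind that saving is straightforward; a quick sketch (the 1024x1024 size is chosen arbitrarily):

```python
# Memory footprint of a texture at a given bit depth per texel.
# Uncompressed RGB is 24 bits/texel; DXT1 (S3TC) is 4 bits/texel,
# DXT5 (with alpha) is 8 bits/texel.
def texture_bytes(width, height, bits_per_texel):
    return width * height * bits_per_texel // 8
```

So a 1024x1024 texture drops from 3 MB uncompressed to 512 KB in DXT1: a 6x saving in both storage and sampling bandwidth.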
Or you can take the "megatexture" approach, where you have a huge texture (32k x 32k or so) covering the whole geometry. But you have to be able to do realtime streaming and caching in that case.
AFAIK Rage used that, and ET:QW used it for the terrain. Of course this only works well if the camera doesn't move fast, because the fetches are delayed and you can't wait for them to load, so things appear blurry until the desired detail is loaded.
So it's not as simple as it seems. For a forward renderer, it's beneficial to render non-translucent objects sorted front-to-back, because it nicely fills the z-buffer early. But translucent/transparent objects have to be drawn back-to-front.
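That sorting is simple to sketch: opaque objects by increasing distance from the camera (so near ones fill the z-buffer first), translucent ones by decreasing distance (so blending composites correctly). A toy Python version (all object data below is made up):

```python
# Draw-order sketch for a forward renderer.
# objects: list of (name, position, is_translucent).
def draw_order(camera, objects):
    def d2(pos):
        # Squared distance is enough for sorting; skip the sqrt.
        return sum((a - b) ** 2 for a, b in zip(pos, camera))
    opaque = sorted((o for o in objects if not o[2]), key=lambda o: d2(o[1]))
    translucent = sorted((o for o in objects if o[2]), key=lambda o: -d2(o[1]))
    return opaque + translucent  # opaque first, then blended back-to-front
```

In practice you'd sort batches rather than individual objects, and weigh this against minimizing state switches, which pulls in the opposite direction.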
I can't imagine that big engines would not render in batches, or that they would waste bandwidth/fillrate, simply because consoles have much worse GPUs than today's high-end PCs. I don't claim that everyone has everything optimized to the max, but you can't afford to be inefficient on consoles, and doing PC-only titles would be suicide. There were cases where assets/texture quality had to be changed for console ports. But I believe that modern engines already support consoles directly.