Archive for December, 2009

Hardware Geometry Instancing

Tuesday, December 22nd, 2009 by rdb

Recently, I have been working on a little Infinite Terrain demo that explains how to take full advantage of Panda3D’s terrain capabilities. In the process I have been adding various features to Panda3D that would make it easier for me, including many improvements to the Shader Generator. (On a sidenote, I didn’t end up needing any shaders for the terrain.)
Last Monday, I added trees to the terrain. As this is an infinite terrain demo, I needed to add quite an amount of trees. But I found that (even with Panda3D’s flattening capabilities) my GPU quickly let me down after a few thousand trees. So, I realized I needed to add geometry instancing support to Panda3D.
And so I did. It turned out to be quite trivial to implement and only took me an hour or two. The results are quite pleasing! I have managed to render over 100000 trees (and a huge terrain) at the same time at a reasonable framerate! Here’s a screenshot of what it looks like right now:

screenshot-Tue-Dec-22-14-52-54-2009-67

Of course, that WIP scene could use a lot of improvement, but you get my point. And with some proper culling and LOD, I could push the amount of trees even higher.

But doesn’t Panda3D already support instancing?

Currently, Panda3D supports instancing of animated models. That is entirely unrelated to geometry instancing. The existing instancing system only exists to improve performance if you have a lot of animated models, by reducing the amount of vertex displacements that are done by Panda3D’s animation system. Geometry instancing, on the other hand, exists to greatly reduce the amount of data that is passed to the video card. Whether the model is animated or not is irrelevant with the new instancing system.

How does it work?

Before yesterday, if you wanted to create multiple instances of a model, you’d either load the model multiple times or use copyTo. I’ve seen that many people use instanceTo in this case, but that will have no positive effect on performance for static models. The geometry will still be passed many times to the GPU, which is a slow process.
With the new system, you keep just a single copy of the model and call setInstanceCount(n) on it. This means that it will still be passed to the GPU only once, but it will be rendered n times.

You might be wondering, how do I give each node different parameters or a different position? Well, that can be done in the shader. You can access the instance ID in the shader and calculate the position, color, etc. based on that, or simply use a different model projection matrix from an array of transform matrices that you pass to the shader. This allows you to do basically anything. You can send a single sphere to the GPU, passing a 3D texture with a displacement map in each layer, and set the instance count to 1024. That will result in 1024 unique rocks using just one batch call and a very limited amount of uploaded geometry.

This leaves the question of whether this will actually work without use of shaders. The answer is no, I’m afraid. There is an OpenGL extension that allows you to use geometry instancing with the fixed-function pipeline, but very few video cards support it. Because it would be quite complicated to implement that, I decided not to do it.

Note that this is only supported in OpenGL so far. Maybe someone will add support for this to Panda’s DirectX side someday.