Adding Extreme Detail To Polygons  Using Voxel Sets As Input Data by (05 March 1999) 
Return to The Archives 
Introduction

Current 3D hardware is capable of drawing huge amounts
of polygons at high refresh rates. Current 3D games
typically show only hundreds of polygons on the screen
simultaneously, and still no one seems to be asking
why today's games apparently aren't capable of showing
the amount of polygons that are possible according to
the hardware vendors. The reason why 3D games use only a small amount of polygons is this: Each and every polygon requires a huge amount of processing. First, the renderer has to decide which polygons to draw. While a good renderer is perfectly able to determine the set of visible polygons, the amount of work per visible polygon is considerable. After the renderer has decided that a polygon is potentially visible, it needs to be clipped, rotated and finally sent to the accelerator to be rasterized. Because of the amount of processing per polygon, detail is usually added by using detailed textures, effectively faking detail. While this gives rather satisfying results, one can't help but wonder why this is satisfying: Sure, the polygon count has gone up, but not by a factor of 100. 
The General Idea

There is however a way to use the power of an
accelerator, without having to do all the work that
is normally done for each individual polygon. How?
Simple: Postpone adding detail until the very last
moment. Use plain, large polygons in the early
stages of the rendering pipeline: Visibility
determination, rotating, clipping. Then subdivide
them just before rasterization, adding relief
(using a bumpmap, for example). Or rather, make
the subdivision process part of the visualization
routines. That way, subdivision can be omitted for
slower cards or software rendering without
changing the entire renderer. So far, I assume I have not really said anything new. In the rest of this article, I will probably not do so either, but I would like to talk over some ideas here. Subdividing polygons this way introduces some interesting problems, but if we can overcome these, we might be able to use extreme polygon counts, and more important: Have extreme detail in our 3D environments. So, please read on. 
Questions And Potentially Incomplete Answers

Q: How do we subdivide a polygon using a bumpmap
and a texture in real time? A: We don't have to. We could simply precalculate a set of polygons for each polygon/texture/ bumpmap combination, and store it as a set of triangle fans. These could be rendered pretty quickly. Q: But that would require us to have lots of triangle fan sets: One for each polygon/ texture/bumpmap combination... A: Use something similar to a lightmap cache for this, and calculate the fans only when needed. Q: So it has to be done in realtime after all. How do we turn a texture/bumpmap combination into a triangle fan? A: Start with a regular grid with the resolution of the bumpmap. Since we're generating triangles, the number of triangles is: Hbump * Vbump * 2 Then, find triangles that are (almost) in the same plane, and merge them. After all, we only need to add polygons for detailed spots on the bumpmap. Q: How about clipping? Clipping thousands of polygons might get slow... A: Absolutely true. Although I think that most of the tiny triangles will not get clipped, there's quite a huge amount of clipping at the edges of the screen. I have no idea how well an accelerator will handle this. A possible optimization is to add a band of pixels around the rendered view, that is cleared just before the buffers are swapped. this band could be used to draw triangles in without clipping them. Triangles that are completely outside the view area are not rendered at all, triangles that are completely onscreen, but partially in the band are clipped by clearing the band, so now only triangles that are partially off screen, and partially in the view need to be clipped. Depending on the average size of a triangle and the size of the band, this might reduce the number of clips considerably. Q: How about culling? Bumps may be still visible, even when a plane is not facing the camera. A: True again. Consider this situation, a wall section with a round button: __________  _  _  / \  /   \_/  \_ ________  front view side viewObviously, when the 'side view' turns away a bit more from the camera, the bump would still be visible. There is a moment at which the entire 'detail polygon' is invisible however. For the moment, it might be best to find out when, and draw all the polygons until then using zbuffering. We could also test all the triangles of course, but that would move the backface culling process to our rasterizer, and that is exactly what we where trying to prevent. Q: But we still need to rotate all those vertices of the tiny triangles... A: True, but we can apply some interesting optimizations here. First, the range of the coordinates is limited: If the bumpmap is 64x64, then we have 64 possible 'x' values (in texture space), 64 'y' values, and 256 'z' values. For 'z' we may as well use a range of 0..31 to get sufficient height levels. We already have the rotated 3D coordinates of the large detail polygon. These coordinates can now be used to quickly determine the 3D coordinates of the small triangles: First create three temporary lookup tables. One for X (containing 64 coordinates, interpolated from vertex 1 to vertex 2), One for Y (containing 64 coordinates, interpolated from vertex 1 to vertex 3), One for Z. The Z table introduces another problem: Our detail polygon is flat. Well, let's make it fat! So, each detail polygon gets an extra set of vertices, at a fixed distance of the original polygon. This distance is the maximum height that can be stored in the bummap. Now we can also interpolate Z in the third table. Determining the rotated vertex for a triangle is now simply a matter of combining the values in the LOT's: For the triangle vertex that is at grid position (U, V) and height (W), the 3D coordinate is:
So, we have now reduced the cost of rotating all the 3D coordinates to creating 3 LOT's that can be interpolated using only additions, plus 9 lookup's and 6 additions per vertex. That's much better than processing all those triangles in our rendering pipeline. I'm sure this can be made much better still, I've even heard someone say that he could do something like this with a SINGLE lookup and TWO additions per vertex... 
Brain Teasers

So where do we go from here? Well, there are
some other interesting issues involved with
this 'algorithm'. Let me summarize some:

Conclusion

I have presented an algorithm that may or may
not be new, and I have shown that theoretically
this could result in much more detailed graphics.
I have also tried to show that this approach
requires a different way of thinking. The keyword
is: 'Postponing'.
I hope that I gave you something to think about.
If I'm completely wrong, or if you have additions,
suggestions, questions, don't hesitate to contact
me. Keep on coding, Jacco Bikker, a.k.a. "THE PHANTOM" 
Article Series:
