Harmless Algorithms - Issue 04 - A Hybrid Approach to Visibility by Edward Kmett (19 February 2001) |
Return to The Archives |
Introduction
|
This column will discuss an example of a hybrid visibility algorithm from a few-year-old test engine of mine,
which used approaches from the previous columns. Providing the terminology to give the explanation below was the
main thrust of the original Harmless Algorithms columns, but I quite simply never found the time to finish. My
career has taken me far away from 3d graphics. Well, since a number of people have been nagging me about it and
it has been over a year since my last update, here goes. Hybrid Visibility: |
Goals
|
Octree
|
The first piece is to use an Octree to divide space up into manageable pieces. The scene preprocessor just grabs polygons
and dumps them into Octree buckets, recursing until a maximum depth is reached or a reasonable number of polygons per Octree node (around 50) is reached. Now, I could stop here, clip the polygons to the Octree and traverse my scene using a neighbor traversal of the Octree structure, by inserting checking each of the polygons in the node against a C-Buffer and rendering the ones that pass, then inserting all of the nodes into the C-Buffer before checking the faces of the tree against the c-buffer and traversing outwards. The problem is this does not provide me with intra-node polygon culling (No polygon within the Octree node can occlude any other polygon within the Octree node), and I have no way to achieve front-to-back sorting within the node, and my collision detection times are terrible. This would be fine, if I didn't want to deal with collision or shadow casting point lights. |
BSP Cache
|
So, instead I decide to go with miniature BSPs inside of the Octree nodes, and to really play devil's advocate, I don't even
precompile them. Before you panic, please consider that these individual BSPs have very manageable polygon counts (usually 50
or less), and so the nonlinearities of the BSP compilation process are minimized. 70,0003 polgyons for a 70,000 polygon scene in
a Quake-style engine vs 503 performed 1400 times (2000 allowing for shared polygons) differs by several orders of magnitude.
~350 trillion operations vs. 250 million operations is quite a spread, especially when you consider that the 250 million operation
version can be handled in chunks of 125,000 operations. Please note that these numbers are worst case scenarios. Also note that
this is for only a 70,000 polygon scene. As the scene approaches a million polygons, these numbers become far more ridiculous.
(1e18 vs 2.5e9) The polygon soup is tagged with what Octree nodes they belong to and are accessible through lists from the Octree nodes themselves, so when I need to compile a node into the mini-BSP cache, I just pass in each of the polygons known to be in the node to the BSP compiler. The BSP compiler does not bother balancing the tree because the computational overhead of balancing is worse than the pathological cases an unbalanced tree can induce. The compiler generates clipped polygon fragments which are used for visibility determination, but keeps pointers from the fragments to the original polygons. The miniBSP cache is maintained on a LRU (least recently used) basis. You could gain a global speed-up by removing the cache and precompiling the BSPs, but you would lose the ability to perform gross geometry changes by changing the polygons or large-scale objects that occupy a set of nodes and expiring their LRU cache entries. When performing a front-to-back traversal of the node, you walk the BSP inserting fragments into your visibility system (Either C-Buffer or Beam Tree have their own relative merits, depending on display resolution and if you are using anti-aliasing), but if you insert a fragment you render the whole polygon (skipping it of course if you have already inserted another fragment from this same polygon). Rendering the scene could still be done ignoring the BSP, using the approach above if you wish - and may be considerably faster if the BSP fragments too heavily, but the presence of a front-to-back structure is useful for collision detection and complex lighting. If the BSP is too fragmented then the cache hit of walking the polygons in the Octree node twice to render then insert into the fine occlusion culling algorithm is less than the overhead of walking the structure. The real purpose of the BSP is to provide something to lean on for collision detection. BSP collision detection algorithms are among the fastest in the industry and are very elegant and pleasant to code. Other uses for the BSP are real-time CSG, which is useful if you really want to blow holes in walls. CSG of a convex solid against a BSP is actually a surprisingly straight forward process. The problem is the number of polygons will increase dramatically over time if you leave this unchecked, and so it really needs good in-game support and reasoning to be practical. I should stress that characters are not part of the BSP. Their moving polygon count renders them far too complex to reasonably insert into such a structure. |
Point Lights
|
Now we have reasonable collision detection, and a solid scene traversal which will prevent us from seeing things which aren't
there, with safety valves for their more pathological cases. Let there be light. Note that thus far there hasn't been much of a preprocessing step. Dumping polygons into Octree buckets can be done, worst case, as the engine loads. This is a good thing for rapid prototyping. However, we all know that you can't get away with radiosity under such conditions. So, if you want radiosity, you'll have to preprocess it. On the other hand, this will doom any attempt at dynamic geometry in the areas in which you pre-light, because it will not properly adapt to environmental changes. My answer was to go goth and move to harsher lighting. The Beam Tree algorithm from the earlier columns can be used not only for visibility but to determine the visible polygon fragments in a scene from a given point. These fragments happen to directly correspond to the fragments that would be lit from a light source at that point. What is required is a front-to-back traversable scene structure and a Beam Tree per light. If you provide linear fall off for the light source you can terminate within a known distance of the source light, providing caps on the time the lighting algorithm will consume. Also, you can 'render' lights on a cached basis, because the polygon fragments lit by a given point light do not change from frame to frame unless the geometry is moving. If you limit their use to areas of low dynamic geometry and/or limited CSG, then you'll do fine. |
Rendering
|
Summary
|
My primary goal for this column was to show the ways that a number of innocuous algorithms that in-and-of-themselves are not
very useful can be strung together to achieve goals that none of them can achieve alone. For example, a BSP could not handle
the dynamic geometry or even the sheer size of the world. An Octree could not provide decent collision detection or support
for the shadow casting point lights. A Beam Tree is not always the best algorithm for visible surface determination, so we
provide the C-Buffer as a fallback. Harmless February 3, 2001 You can contact the author at the following address: harmless@mich.com You can also check out his web site at: http://www.kmett.com/ This document is (C) 1999 Edward Kmett and may not be reproduced in any way without explicit permission from the author (Edward Kmett). |
Article Series:
|