Geometry Skinning / Blending and Vertex Lighting - Using Programmable Vertex Shaders and DirectX 8.0
by (21 September 2001)
This article is intended for readers who are already familiar with DirectX 7.0 and want to move on to DirectX 8. It is assumed that the reader has knowledge about the graphics pipeline, simple matrix math, flexible vertex formats and other aspects of DirectX programming.
Introduction To DirectX 8.0 Pipeline
Before getting into the DirectX 8 pipeline, let's go back in time and take a quick look at the DirectX 7 pipeline.
Features of the pipeline in DirectX 7.0 and its predecessors:
Features of DirectX 8.0 programmable pipeline:
A vertex shader is the part of the graphics pipeline that gives the user the power to do their own custom transformation and lighting without sacrificing any speed. It takes an input (flexible) vertex, which has properties like position, normal, color and texture co-ordinates, and outputs the vertex in clip space.
What can you do with the vertex shader?
What can you not do with a vertex shader?
Architecture Of Vertex Shaders
The number of registers is not going to remain the same. In fact, the table given above holds good only for the current generation of video cards like the GeForce3. You can query the number of available vertex shader constant registers by calling GetDeviceCaps and reading the MaxVertexShaderConst member of the returned D3DCAPS8 structure.
Each register can hold up to four floats, i.e., 128 bits.
Visualize each register as having x, y, z and w components.
Registers and their names:
The GPU uses SIMD (Single Instruction Multiple Data) technology.
All operations are done on the GPU when using hardware vertex processing (D3DCREATE_HARDWARE_VERTEXPROCESSING). When using software vertex processing (D3DCREATE_SOFTWARE_VERTEXPROCESSING), DirectX simulates the GPU on the CPU using processor-specific optimizations. Obviously, this is not as fast as the GPU. If you are debugging your vertex shader code, you must use software vertex processing.
What happens to the vertex data after it leaves the vertex shader?
About The Vertex Shader Instruction Set And Assembler
Click on any of the instructions to see their details.
nop: No operation
Limitations of some instructions:
Some instructions have limitations on how the registers can be used.
Ex. add r0, c4, c3
will give an error, because the maximum number of constant registers that add can read is one.
But this line will not give an error: add r0, r4, r3
Take advantage of free negation, swizzle, and masks:
Vertex shaders support free negation of source registers. Ex. add r0, -c4, c2
The above instruction means:
r0.x = -c4.x + c2.x
r0.y = -c4.y + c2.y
r0.z = -c4.z + c2.z
r0.w = -c4.w + c2.w
Do not use two instructions where the free negation does the same job in one:
mov r2, -c4
add r0, r2, c2
You can swizzle the components of a source register.
Ex. add r0, c4.yxzy, r3.xywz
The above instruction means:
r0.x = c4.y + r3.x
r0.y = c4.x + r3.y
r0.z = c4.z + r3.w
r0.w = c4.y + r3.z
The destination register can mask which components are written to.
r1 : write all components
r1.x : write only the x component
r1.xw : write only the x and w components
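To make the three features concrete, here is a minimal CPU-side sketch in C++ that models a register as four floats. The names Reg, swizzle, negate and add are made up for this illustration; they are not part of DirectX, and the real hardware does all of this within a single instruction.

```cpp
#include <array>

// A vertex shader register: four floats, indexed x=0, y=1, z=2, w=3.
using Reg = std::array<float, 4>;

// Source swizzle: for each output lane, choose which source lane to read.
// swizzle(c4, 1, 0, 2, 1) models c4.yxzy.
Reg swizzle(const Reg& r, int x, int y, int z, int w) {
    return Reg{r[x], r[y], r[z], r[w]};
}

// Source negation, as in -c4.
Reg negate(const Reg& r) {
    return Reg{-r[0], -r[1], -r[2], -r[3]};
}

// Component-wise add, written to dst only where the write mask is set.
// Mask bit 0 = x, bit 1 = y, bit 2 = z, bit 3 = w; 0xF models "no mask".
void add(Reg& dst, const Reg& a, const Reg& b, unsigned mask = 0xF) {
    for (int i = 0; i < 4; ++i) {
        if (mask & (1u << i)) {
            dst[i] = a[i] + b[i];
        }
    }
}
```

With this model, add(r0, swizzle(c4, 1, 0, 2, 1), swizzle(r3, 0, 1, 3, 2)) corresponds to add r0, c4.yxzy, r3.xywz, and add(r1, negate(c4), r3, 0x9) corresponds to add r1.xw, -c4, r3.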
Vertex Shader Assembler
For assembling a vertex shader, you will need the vertex shader assembly source and its declaration. The declaration describes the (flexible) vertex format.
You can create your vertex shader in two ways (that I know of).
1. Write the assembler as a string with each complete instruction terminated by a newline character. Use the standard DirectX functions to assemble the shader (D3DXAssembleShader) and to create the shader (CreateVertexShader).
This is pretty rigid and does not support macros.
2. Write the assembler code in a text file and assemble it using the NVidia® assembler (available on the NVidia web site).
As you see, this is very flexible and supports macros.
You can assemble your vertex shader code by using nvasm.exe.
Its usage format is nvasm.exe x Input.vsi Input.vso
Tips on writing good Vertex shader code:
Introduction To DirectX Lighting Model
In this section, we shall see how surfaces are shaded based on position, orientation, and characteristics of the surfaces and the light sources.
First, let's begin by discussing different kinds of light sources (of course, the only source of light for us geeks is the monitor) that are in practice today.
This is in increasing order of computation complexity.
We will be discussing only point lights and the diffuse lighting model in this document.
According to Lambert's law, the amount of light seen by the viewer is independent of the viewer's direction and is proportional only to cos θ, where θ is the angle of incidence of the light.
Surfaces that exhibit diffuse (Lambertian) reflection appear equally bright from all viewing angles because they reflect light with equal intensity in all directions.
For a given surface, the brightness depends only on the angle θ between the direction to the light source L and the surface normal N.
Since point lights also have a position, the distance from the surface to the light affects the intensity as well. This is known as distance attenuation.
Light Attenuation for point lights
Typically, the energy from the point light that hits a surface diminishes as the distance increases.
Practically, inverse linear falloff doesn't look good. The inverse squared falloff looks better.
In practice, this also doesn't look good. If light is far away, then its intensity is too low. But, when it is close up to the surface, intensity becomes large leading to visual artifacts.
A decent attenuation equation which can be used to get wider range of effects is:
In the above equation, A0, A1 and A2 are known as the attenuation constants.
As d goes to infinity, attenuation becomes very small but not zero.
By varying the three constants A0, A1, A2, interesting effects can be achieved.
If the light is too close to the surface, then, attenuation becomes very large, making the light too bright. To avoid this, set A0 to 1 or greater.
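Assuming the standard Direct3D point-light attenuation, which is consistent with the three constants described above, the factor for a light at distance d can be written as:

```latex
f_{att} = \frac{1}{A_0 + A_1 d + A_2 d^2}
```

With A0 = 1 and A1 = A2 = 0 the light is not attenuated at all. As long as A0 is at least 1 and A1, A2 are not negative, the denominator never drops below 1, so fatt never exceeds 1.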
All light colors are represented by components in the range 0.0 to 1.0, where 0 is the least intensity and 1 is the maximum intensity.
Every surface has a material property. Of course, the material property can be global to all the surfaces. There are different kinds of material properties like diffuse, specular, ambient, etc. The ambient and diffuse material properties determine how much light the surface reflects. For example, if a surface has a diffuse material property of (1,0,0), corresponding to the red, green and blue components of the light respectively, then this surface reflects only the red component, whereas the green and blue components are absorbed.
Even with the attenuation factor blended into our lighting equation, the maximum amount of light a surface receives is still (N.L).
The final diffuse lighting equation becomes:
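As a rough sketch of how these pieces combine, the C++ function below computes the diffuse color for one point light as material diffuse x light color x (N.L) x fatt. That product is an assumption based on the description above and on the standard Direct3D attenuation form. Vec3, dot and diffusePointLight are names invented for this illustration.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Diffuse color contributed by one point light at one surface point.
// normal must be unit length; attenuation holds (A0, A1, A2).
Vec3 diffusePointLight(Vec3 position, Vec3 normal,
                       Vec3 lightPos, Vec3 lightColor,
                       Vec3 attenuation, Vec3 materialDiffuse) {
    Vec3 toLight = {lightPos.x - position.x,
                    lightPos.y - position.y,
                    lightPos.z - position.z};
    float d = std::sqrt(dot(toLight, toLight));
    // Guard against a light sitting exactly on the surface point.
    float invD = d > 0.0f ? 1.0f / d : 0.0f;

    // N.L, clamped so that surfaces facing away receive no light.
    float nDotL = std::max(0.0f, dot(normal, toLight) * invD);

    // fatt = 1 / (A0 + A1*d + A2*d*d); assumes the denominator is not zero.
    float fatt = 1.0f / (attenuation.x + attenuation.y * d + attenuation.z * d * d);

    float s = nDotL * fatt;
    return {materialDiffuse.x * lightColor.x * s,
            materialDiffuse.y * lightColor.y * s,
            materialDiffuse.z * lightColor.z * s};
}
```

For a light directly above the surface with A0 = 1 and A1 = A2 = 0, the function simply returns the light color filtered by the material.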
Vertex lighting vs. Polygonal lighting
Polygonal lighting: Light intensity is calculated for each surface / polygon.
Vertex lighting: Light intensity is calculated for every vertex. The intensities are then interpolated linearly across the polygon by Gouraud shading.
Let's take some time off to take a quick look at the different kinds of animation systems in practice today.
Let's take a look at what we discussed above, in figures.
The skeletal animation system looks perfect, doesn't it?
Well, nothing is perfect in 3D games. Skeletal animation comes with its own set of problems.
Let's study the drawbacks of the skeletal animation system with the help of some images.
The following image shows the relationship of the skeleton / bone with its mesh. The semi-transparent object is the mesh. As you see, they are tightly coupled. When a bone moves, the vertices attached to it also move.
The following image shows the same relationship as explained above. The only difference is that the mesh is shown in wire frame.
The green line divides the vertices between the first and second bone. Vertices are represented by a small "+" sign. Vertices on the right are marked red, indicating that they belong to the second bone, whereas the vertices on the left are marked white, indicating that they belong to the first bone.
The following figure highlights the main drawback of the skeletal animation system.
The following figure shows how the "stiffness" problem can be solved. A quantum leap, isn't it? But no dinner comes free, and vertex blending is no exception: it adds to the computational cost.
Animation with vertex blending:
Animation without vertex blending:
Everything is moving smoothly with vertex blending. Let's move to the rough part - the math behind vertex blending. Surprisingly, the math is very simple, probably too simple.
The generic blending formula:
vBlend = output vertex.
Vn = nth vertex.
Wn = nth vertex's weight.
For two, three and four weighted matrices, the above formula becomes:
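A sketch of these formulas, assuming that V_n denotes the input vertex transformed by the nth matrix and that the weights sum to one, so that the last weight can be derived instead of stored:

```latex
v_{Blend} = \sum_{n=0}^{N-1} W_n V_n, \qquad \sum_{n=0}^{N-1} W_n = 1

% two matrices
v_{Blend} = W_0 V_0 + (1 - W_0) V_1

% three matrices
v_{Blend} = W_0 V_0 + W_1 V_1 + (1 - W_0 - W_1) V_2

% four matrices
v_{Blend} = W_0 V_0 + W_1 V_1 + W_2 V_2 + (1 - W_0 - W_1 - W_2) V_3
```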
Consider an example:
Typically, the vertex structure for a blending vertex looks like this:
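The sketch below shows one possible layout for such a vertex and a CPU-side version of the two-matrix blend. It is not the structure from the original listing: BlendVertex, its field names and order, and the helper functions are all hypothetical, and the second weight is assumed to be one minus the first.

```cpp
struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };  // m[row][column], row-vector convention (v * M)

// Hypothetical blending vertex; field names and order are illustrative only.
struct BlendVertex {
    float x, y, z;         // position
    float weight;          // weight of the first matrix; the second gets 1 - weight
    int   matrixIndex[2];  // which two matrices influence this vertex
    float nx, ny, nz;      // normal
    float tu, tv;          // texture co-ordinates
};

Mat4 identity() {
    Mat4 r = {};
    for (int i = 0; i < 4; ++i) r.m[i][i] = 1.0f;
    return r;
}

Mat4 translation(float tx, float ty, float tz) {
    Mat4 r = identity();
    r.m[3][0] = tx; r.m[3][1] = ty; r.m[3][2] = tz;
    return r;
}

// Row vector times matrix.
Vec4 transform(Vec4 v, const Mat4& M) {
    return {
        v.x * M.m[0][0] + v.y * M.m[1][0] + v.z * M.m[2][0] + v.w * M.m[3][0],
        v.x * M.m[0][1] + v.y * M.m[1][1] + v.z * M.m[2][1] + v.w * M.m[3][1],
        v.x * M.m[0][2] + v.y * M.m[1][2] + v.z * M.m[2][2] + v.w * M.m[3][2],
        v.x * M.m[0][3] + v.y * M.m[1][3] + v.z * M.m[2][3] + v.w * M.m[3][3]
    };
}

// vBlend = W0 * V0 + (1 - W0) * V1
Vec4 blendPosition(const BlendVertex& v, const Mat4* matrices) {
    Vec4 p = {v.x, v.y, v.z, 1.0f};
    Vec4 v0 = transform(p, matrices[v.matrixIndex[0]]);
    Vec4 v1 = transform(p, matrices[v.matrixIndex[1]]);
    float w0 = v.weight;
    float w1 = 1.0f - v.weight;
    return {w0 * v0.x + w1 * v1.x, w0 * v0.y + w1 * v1.y,
            w0 * v0.z + w1 * v1.z, w0 * v0.w + w1 * v1.w};
}
```

A vertex with weight 1 follows the first matrix rigidly and a vertex with weight 0 follows the second; vertices near the joint get intermediate weights and end up between the two transformed positions.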
How can transformation, lighting and blending be achieved using the vertex shader?
Till now we have only talked about vertex shaders and lighting models. Let's take a look at how they actually work.
Before we start solving the problem, we need to look at the resources that we need. I have decided to use a tessellated plane as my surface, and I have a texture that I can apply to the plane. We shall set up the plane keeping in mind that we also have to blend (skin) it later.
We do not want to come back to setting up the plane again, so we will first finish the complete initialization (position, weights, indices and texture co-ordinates) of the plane and then deal with the problems one by one. As for the lights, I am using two positional (point) lights. I am assuming that the surface material has a diffuse reflection property of (1,1,1), i.e., the surface reflects all three colors with the same intensity.
See Listing 1 for all the global variables, macros and structure declaration.
The plane itself is tessellated to a certain level. This is controlled by the two macros
Of course, the plane has a certain width and height. It is controlled by the following macros
The plane is actually built as shown in the figure below. The pivot is at the origin for both parts of the plane. If the pivot is not at the origin, then some extra work has to be done before rotating the planes.
See Listing 2, which has the source code for the above explanation. You will find some utility functions like InitVertexBuffer and LoadTexture in the Sources Zip file.
Look at the code which loads the vertex shader constants. I am loading the following constants:
1. 4.0f, 22.0f, 1.0f, 0.0f into C0.
2. Four zeroes into C1.
3. Transposed world matrix from C14 to C17.
4. Transposed (View x Projection) matrix from C18 to C21.
5. Light 0 Position into C2.
6. Light 0 Attenuation into C3.
7. Light 0 Color into C4.
8. Light 1 Position into C5.
9. Light 1 Attenuation into C6.
10. Light 1 Color into C7.
11. Transposed rotation matrix for the left plane from C22 to C25.
12. Transposed rotation matrix for the right plane from C26 to C29.
Why so many constants are loaded will be explained as and when they are required.
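One way to keep such a register map readable on the C++ side is to name the registers once and check that the matrix ranges do not overlap. The sketch below assumes each 4x4 matrix takes four consecutive registers, so the world matrix ends at C17, right before (View x Projection) starts at C18. The enum and helper names are invented here; in DirectX 8 these values would be passed as the register argument of SetVertexShaderConstant.

```cpp
// Hypothetical names for the constant registers listed above.
enum ShaderConstant : int {
    C_COMMON            = 0,   // (4.0, 22.0, 1.0, 0.0)
    C_ZERO              = 1,   // four zeroes
    C_LIGHT0_POSITION   = 2,
    C_LIGHT0_ATTENUATION = 3,
    C_LIGHT0_COLOR      = 4,
    C_LIGHT1_POSITION   = 5,
    C_LIGHT1_ATTENUATION = 6,
    C_LIGHT1_COLOR      = 7,
    C_WORLD             = 14,  // first of four registers
    C_VIEW_PROJ         = 18,  // first of four registers
    C_ROT_LEFT          = 22,  // first of four registers
    C_ROT_RIGHT         = 26   // first of four registers
};

constexpr int kRegistersPerMatrix = 4;

// Last register occupied by a matrix that starts at 'first'.
constexpr int lastRegister(int first) {
    return first + kRegistersPerMatrix - 1;
}

// The matrix ranges must be back to back without overlapping.
static_assert(lastRegister(C_WORLD) < C_VIEW_PROJ, "world overlaps view-projection");
static_assert(lastRegister(C_VIEW_PROJ) < C_ROT_LEFT, "view-projection overlaps left rotation");
static_assert(lastRegister(C_ROT_LEFT) < C_ROT_RIGHT, "left rotation overlaps right rotation");
```

Note that the first two values in C0 match this layout: 4.0 is the number of registers per matrix and 22.0 is the register where the rotation matrices start.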
Now we have the plane data ready to be displayed. Let's look at the code which sets up the vertex shader constants and calls DrawPrimitive.
The matrix data should be transposed before loading it into the vertex shader. This is because the vertex shader works on a row basis (one constant register at a time), whereas multiplying a vertex by a DirectX matrix works on a column basis.
Note that only the final matrix should be transposed, i.e., transpose the result of (View x Projection) instead of transposing the view and projection matrices separately and then multiplying them, which is wrong because the transpose of a product reverses the order of its factors. This is explained below.
dp4 is used to multiply the vertex position with the world matrix.
dp4 operates only on the components of a single register.
It does not operate on C14.x, C15.x, C16.x and C17.x simultaneously; it operates on C14.x, C14.y, C14.z and C14.w simultaneously. Hence we need to transpose the matrix.
Now it becomes,
Also note that (View x Projection) matrix is per scene, not per vertex. Hence they are multiplied on the CPU just once per scene (GameLoop). The world (transposed) matrix is loaded seperately, since we need this for lighting calculations.
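The effect of the transpose can be checked on the CPU. The following C++ sketch models a constant register as one row of the loaded matrix and the shader transform as four dp4 instructions. All type and function names are made up for this illustration; Mat4 follows the m[row][column] layout of D3DMATRIX.

```cpp
struct Vec4 { float x, y, z, w; };
struct Mat4 { float m[4][4]; };  // m[row][column], the D3DMATRIX layout

// dp4: four-component dot product of two registers.
float dp4(Vec4 a, Vec4 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

// CPU reference: row vector times matrix (v * M), the Direct3D convention.
Vec4 mulRowVector(Vec4 v, const Mat4& M) {
    return {
        v.x * M.m[0][0] + v.y * M.m[1][0] + v.z * M.m[2][0] + v.w * M.m[3][0],
        v.x * M.m[0][1] + v.y * M.m[1][1] + v.z * M.m[2][1] + v.w * M.m[3][1],
        v.x * M.m[0][2] + v.y * M.m[1][2] + v.z * M.m[2][2] + v.w * M.m[3][2],
        v.x * M.m[0][3] + v.y * M.m[1][3] + v.z * M.m[2][3] + v.w * M.m[3][3]
    };
}

Mat4 transpose(const Mat4& M) {
    Mat4 t = {};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            t.m[r][c] = M.m[c][r];
    return t;
}

Mat4 multiply(const Mat4& A, const Mat4& B) {
    Mat4 p = {};
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            for (int k = 0; k < 4; ++k)
                p.m[r][c] += A.m[r][k] * B.m[k][c];
    return p;
}

// Row i of the loaded matrix, i.e. what one constant register holds.
Vec4 reg(const Mat4& loaded, int i) {
    return {loaded.m[i][0], loaded.m[i][1], loaded.m[i][2], loaded.m[i][3]};
}

// Shader-style transform: one dp4 per constant register.
Vec4 transformWithDp4(Vec4 v, const Mat4& loaded) {
    return {dp4(v, reg(loaded, 0)), dp4(v, reg(loaded, 1)),
            dp4(v, reg(loaded, 2)), dp4(v, reg(loaded, 3))};
}

bool sameMatrix(const Mat4& A, const Mat4& B) {
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            if (A.m[r][c] != B.m[r][c]) return false;
    return true;
}
```

transformWithDp4(v, transpose(M)) gives the same result as mulRowVector(v, M), while loading M untransposed does not. Likewise, transpose(multiply(View, Proj)) equals multiply(transpose(Proj), transpose(View)) but in general not multiply(transpose(View), transpose(Proj)), which is why the matrices must be multiplied first and transposed afterwards.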
Since I am using the NVidia® assembler, I can use macros. I have defined some macros for loading some common constants.
See Listing 3 for the common constants declaration.
1. Transforming the vertex:
First, let's start off by just transforming the plane vertices into clip space and setting their texture co-ordinates.
For our first solution, we need only two constants, i.e., the world matrix and the (View x Projection) matrix (both transposed). You can ignore the loading of the other constants for now. See Listing 4, which has the vertex shader to do just this.
This is the simplest of the shaders. I'm doing just this:
2. Lighting the vertex:
This involves some extra work for lighting the vertices. For this, we need the light properties which are loaded into the appropriate vertex shader constant registers.
See Listing 5, which has the vertex shader to do just this.
I have used a macro NORMALIZE to normalize a vector. It accepts two vectors as parameters. This shader does the following things:
Lighting is calculated in the following manner inside the shader:
R0 has the transformed vertex normal.
R9 has the vertex position in world co-ordinates.
C2 has light 0's position.
C3 has light 0's attenuation.
C4 has light 0's color.
First, calculate the vector between the vertex in its world position and the light position.
Direction vector (R10) = C2 - R9
Normalize R10 such that R11.xyz holds the normalized vector and R10.w holds the squared distance (d*d) between the light and the vertex.
R11.w = linear distance (d) between the light and the vertex.
Find the cosine of the angle between the vertex normal and the newly calculated vertex-to-light direction.
Set up the attenuation equation in R4.
Calculate fatt in R5.x.
Similarly, calculate the attenuation fatt for the other lights (up to four in total) and put them in R5.y, R5.z and R5.w respectively.
Put the (N.L) for the other three lights in R6.y, R6.z and R6.w respectively.
Now R5 has the attenuation for the four lights and R6 has the intensity for the four lights. However, in this sample only two lights are being used. Hence, only the x and y components are valid.
Now you can vectorize light calculations (for two lights) in this way.
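The steps above can be sketched on the CPU as follows. This is not the shader from Listing 5, only a C++ model of the same sequence for two lights; the register names in the comments follow the description above, while the types, the function name, the clamping of N.L to zero and the final saturation to 1.0 are assumptions of this sketch.

```cpp
#include <algorithm>
#include <cmath>

struct Vec3 { float x, y, z; };

struct PointLight {
    Vec3 position;     // C2 / C5
    Vec3 attenuation;  // C3 / C6, holds (A0, A1, A2)
    Vec3 color;        // C4 / C7
};

float dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// worldPos models R9 and worldNormal models R0 (unit length).
Vec3 lightVertex(Vec3 worldPos, Vec3 worldNormal, const PointLight* lights) {
    float nDotL[2];  // R6.x, R6.y
    float fatt[2];   // R5.x, R5.y

    for (int i = 0; i < 2; ++i) {
        // R10 = light position - vertex position
        Vec3 toLight = {lights[i].position.x - worldPos.x,
                        lights[i].position.y - worldPos.y,
                        lights[i].position.z - worldPos.z};
        float d2 = dot(toLight, toLight);          // R10.w, squared distance
        float d = std::sqrt(d2);                   // R11.w, linear distance
        float invD = d > 0.0f ? 1.0f / d : 0.0f;   // guard: light on the vertex
        Vec3 dir = {toLight.x * invD, toLight.y * invD, toLight.z * invD};  // R11.xyz

        nDotL[i] = std::max(0.0f, dot(worldNormal, dir));

        // R4 = (1, d, d*d); dotting it with (A0, A1, A2) gives the denominator.
        Vec3 r4 = {1.0f, d, d2};
        fatt[i] = 1.0f / dot(r4, lights[i].attenuation);
    }

    // Combine: sum of light color * (N.L) * fatt, saturated to 1.0.
    Vec3 color = {0.0f, 0.0f, 0.0f};
    for (int i = 0; i < 2; ++i) {
        float s = nDotL[i] * fatt[i];
        color.x += lights[i].color.x * s;
        color.y += lights[i].color.y * s;
        color.z += lights[i].color.z * s;
    }
    color.x = std::min(color.x, 1.0f);
    color.y = std::min(color.y, 1.0f);
    color.z = std::min(color.z, 1.0f);
    return color;
}
```

In the shader, the two loop iterations fill the x and y components of R5 and R6, so the multiplication of intensity by attenuation for all lights becomes a single mul instruction.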
3. Vertex Blending:
Let's take a look at how blending can be achieved on the vertex shader.
The following three lines from the vertex shader code determine the absolute offset into the matrix array which is stored as constants.
r0 will hold the absolute offsets of the matrices for the four blending weights. Of course, here we are using only two. The rest of the code implements the blending equation.
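The offset calculation presumably uses the first two values loaded into C0 (4.0 and 22.0): each matrix occupies four registers, and the rotation matrices start at C22. Under that assumption, which is inferred from the constant layout rather than taken from the shader listing, the arithmetic looks like this in C++ (the names are hypothetical):

```cpp
// Values assumed to come from C0 = (4.0, 22.0, 1.0, 0.0).
constexpr float kRegistersPerMatrix  = 4.0f;   // c0.x
constexpr float kFirstMatrixRegister = 22.0f;  // c0.y

// Shader-style multiply-add: matrixIndex * c0.x + c0.y, then converted to
// an integer register number, as when the result is moved into a0.x.
int matrixRegister(float matrixIndex) {
    float offset = matrixIndex * kRegistersPerMatrix + kFirstMatrixRegister;
    return static_cast<int>(offset);
}
```

With this layout, matrix index 0 maps to C22 (the left plane's rotation) and matrix index 1 maps to C26 (the right plane's rotation).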
4. Mesh without blending:
This vertex shader just transforms the plane by applying their respective rotations. It neither blends nor lights it.
5. Blending, with light:
This vertex shader blends the mesh and also lights it. For lighting, even the vertex normals have to be blended.
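Blending a normal works like blending a position, except that only the rotation part of each matrix applies and the result has to be normalized again, because a weighted sum of two unit vectors is generally shorter than one. A minimal C++ sketch, assuming two matrices, pure rotations and a second weight of one minus the first (all names are invented for this illustration):

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };
struct Mat3 { float m[3][3]; };  // m[row][column], row-vector convention (n * R)

// Row vector times 3x3 rotation matrix.
Vec3 rotate(Vec3 n, const Mat3& R) {
    return {
        n.x * R.m[0][0] + n.y * R.m[1][0] + n.z * R.m[2][0],
        n.x * R.m[0][1] + n.y * R.m[1][1] + n.z * R.m[2][1],
        n.x * R.m[0][2] + n.y * R.m[1][2] + n.z * R.m[2][2]
    };
}

// nBlend = normalize(W0 * (n * R0) + (1 - W0) * (n * R1))
Vec3 blendNormal(Vec3 n, float w0, const Mat3& R0, const Mat3& R1) {
    Vec3 a = rotate(n, R0);
    Vec3 b = rotate(n, R1);
    float w1 = 1.0f - w0;
    Vec3 s = {w0 * a.x + w1 * b.x, w0 * a.y + w1 * b.y, w0 * a.z + w1 * b.z};

    float len = std::sqrt(s.x * s.x + s.y * s.y + s.z * s.z);
    if (len > 0.0f) {  // the two rotated normals may cancel each other out
        s.x /= len; s.y /= len; s.z /= len;
    }
    return s;
}
```

The blended, renormalized normal is then used in the (N.L) calculation in place of the plain transformed normal.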
Download the demo and corresponding tutorial source code packages here:
Notes on using the PlaneBlend.exe.
"A" to go forward.
"Z" to go back.
"Up Arrow" and "Down Arrow" to pitch.
"Left Arrow" and "Right Arrow" to yaw.
F1 and F2 to rotate the left plane on Y axis.
F3 and F4 to rotate the right plane on Y axis.
Press one of the following keys to change the shader.
1 = Basic transformation. (The plane rotations do not affect this)
2 = Basic transformation with lighting. (The plane rotations do not affect this)
3 = Vertex blending with no light.
4 = No vertex blending and no light.
5 = Vertex blending with light.
The static sphere-like object that you see is the light source. The light object takes (only) its color from the light source itself.
Choose [Properties...] from "Light" menu to change light settings. You can use [Ctrl+L] shortcut key for this.
Choose [Change display mode...] from "Display Options" menu to change display settings. You can use [Ctrl+D] shortcut key to do this.
Choose [Light...] from "Display Options" menu to toggle light on the world. This option will be saved when you exit the application.
Changing Light properties: Press [Ctrl+L] to invoke the light menu.
In full screen mode, sometimes the dialog box is not displayed and the action stops. Press escape once and try again.
You have full control over the light properties.
You can select a light by using the mouse. Move the camera until the light object (sphere) is in view. Click and hold the left mouse button down and form a rectangle around the light source.
When selected, the light source appears in wireframe. You can also see three lines indicating the three axes. The red line is the X-axis, the green line is the Y-axis and the blue line is the Z-axis.
When a light is selected and you choose "Light Properties", the properties for the currently selected light are shown.
When selected, you can move the light by pressing the following keys on your keyboard. Make sure NUMLOCK is on.
Numpad4 and Numpad6 to move along X-axis.
Numpad8 and Numpad2 to move along Y-axis.
Numpad7 and Numpad1 to move along Z-axis.
The light properties are standard DirectX light properties like attenuation, position and color.
All lights are positional (point) in nature.
What does the future hold for 3D games with the power of the current and future generation graphics cards and features like programmable pipe line?
The introduction of DirectX 8 and graphics chipsets like the GeForce3 has been like an oasis in the desert for game developers. Finally we have something so exciting, powerful and mind-blowing that we can start thinking about (and making) games that are movie-like. What can you expect from future games using these technologies?
The GPU is exciting, isn't it? But sometimes even the GPU might get clogged, since we are offloading all the calculations from the CPU onto the GPU. It's not a very good idea to let the GPU do everything, so it's up to the game designers to strike the right balance between the CPU and the GPU.
Now the CPU has become the bottleneck as far as speed and performance are concerned. No CPU can match the speed and performance of the GPU. It's a good idea to get the right balance on the hardware front too: a Pentium I MMX 300MHz with a TNT graphics card is much better balanced than the same CPU with a GeForce3.
I would like to thank Richard Huddy of NVidia whose articles on vertex shaders inspired me to compile this article. Some diagrams/images w.r.t vertex shaders are originally from his article.
Some Information about the author:
Keshav is a programmer at Dhruva Interactive, India's pioneer in games development. He was part of the team which ported the Infogrames title Mission: Impossible to the PC. He has been an integral part of Dhruva's in-house game engine R&D efforts and has worked extensively on the engine's Character System. He is currently researching multiplayer engine components. You can see his work at the Gallery at www.dhruva.com
Prior to joining Dhruva in 1998, Keshav worked on embedded systems using VxWorks.
Keshav can be contacted at firstname.lastname@example.org