Submitted by , posted on 26 January 2005

Image Description, by

I have just finished building a ray tracer that runs almost entirely on the GPU of my GeForce FX 5650, an NV3X part (that is, no Shader Model 3.0). OpenGL, along with RenderTexture[1], is used to interface with the graphics card. Cg is used to program the GPU.

For a ray tracer to run on the GPU, the algorithm has to fit the computational model used by the GPU. This can be done by mapping the algorithm to the stream programming model. The mapping I used was described, and also implemented, by Timothy Purcell in his Ph.D. dissertation[2].
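As a rough illustration of the stream programming model (not the Cg code of the ray tracer itself), the sketch below applies one kernel to every element of a stream and repeats the pass until all elements are finished. Each output element depends only on its own input element plus read-only lookup data, which is what lets a GPU process all elements in parallel. The names `run_pass`, `advance` and `run_until_done`, and the toy kernel, are assumptions made up for this example.

```python
def run_pass(kernel, stream, lookup):
    """Apply one kernel to every stream element independently.

    Each output depends only on the matching input element and on
    read-only lookup data (the role textures play on the GPU).
    """
    return [kernel(elem, lookup) for elem in stream]


def advance(elem, lookup):
    """Toy kernel (hypothetical): move a value forward until it reaches a limit."""
    position, done = elem
    if done:
        return elem  # finished elements pass through unchanged
    position += lookup["step"]
    return (position, position >= lookup["limit"])


def run_until_done(stream, lookup, max_passes=100):
    """Run passes, swapping input and output buffers, until every element is done."""
    src = list(stream)
    passes = 0
    while passes < max_passes and not all(done for _, done in src):
        dst = run_pass(advance, src, lookup)  # write into the other buffer
        src = dst                             # the output becomes the next input
        passes += 1
    return src, passes
```

On the GPU, the two buffers would be textures that swap roles between passes, and the loop would be driven from the CPU.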

The ray tracer reads scenes in my own format, usually exported from 3D Studio MAX using a custom exporter, and builds a uniform grid on the CPU. The grid is then uploaded to the graphics card as textures along with other geometry information and processing is handed over to the graphics processor.
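A minimal sketch of building a uniform grid on the CPU could look like the following. It bins each triangle into every cell overlapped by the triangle's bounding box, then flattens the result into plain arrays of the kind that could be uploaded as textures. The flat layout (`cell_start`, `cell_count`, `tri_list`) and the function name are assumptions for illustration, not the format actually used by the ray tracer.

```python
def build_uniform_grid(triangles, bounds_min, bounds_max, resolution):
    """Bin triangles into a uniform grid and flatten it into arrays.

    triangles:  list of triangles, each a sequence of three (x, y, z) vertices
    resolution: (nx, ny, nz) number of cells along each axis
    Returns (cell_start, cell_count, tri_list), where the triangles of cell i
    are tri_list[cell_start[i] : cell_start[i] + cell_count[i]].
    """
    nx, ny, nz = resolution
    cell_size = [(bounds_max[a] - bounds_min[a]) / resolution[a] for a in range(3)]

    def cell_coord(value, axis):
        # Map a world coordinate to a cell index, clamped to the grid.
        c = int((value - bounds_min[axis]) / cell_size[axis])
        return min(max(c, 0), resolution[axis] - 1)

    cells = [[] for _ in range(nx * ny * nz)]
    for tri_index, tri in enumerate(triangles):
        # Conservative binning: use the triangle's axis-aligned bounding box.
        lo = [cell_coord(min(v[a] for v in tri), a) for a in range(3)]
        hi = [cell_coord(max(v[a] for v in tri), a) for a in range(3)]
        for z in range(lo[2], hi[2] + 1):
            for y in range(lo[1], hi[1] + 1):
                for x in range(lo[0], hi[0] + 1):
                    cells[x + nx * (y + ny * z)].append(tri_index)

    # Flatten the per-cell lists into texture-like arrays.
    cell_start, cell_count, tri_list = [], [], []
    for cell in cells:
        cell_start.append(len(tri_list))
        cell_count.append(len(cell))
        tri_list.extend(cell)
    return cell_start, cell_count, tri_list
```

Bounding-box binning can register a triangle in a cell it does not actually touch. That only costs extra intersection tests later and does not affect correctness.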

The GPU generates eye rays, which are traversed through the grid until they hit a triangle or leave the grid. When a triangle is hit, the hit point is shaded according to material and normal parameters. To keep track of how far each ray is in the pipeline, a state vector is kept for each ray. The state vector, among other things, indicates whether the ray is traversing the grid, is to be checked for intersection with triangles, or is to be shaded.
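The per-ray state idea can be sketched as a small state machine that is advanced by one step per pass. The version below steps through the grid with a 3D DDA, which is a common way to walk a uniform grid; whether it matches the traversal kernel used here in every detail is an assumption. The actual ray-triangle test is replaced by a hypothetical `hit_test` callback, shading is a placeholder colour, and the grid is assumed to have its minimum corner at the origin with the ray starting inside it and having a non-zero direction.

```python
import math

TRAVERSE, INTERSECT, SHADE, DONE = range(4)


def make_ray(origin, direction, cell_size):
    """Build the initial state vector for a ray starting inside the grid."""
    cell, step, t_max, t_delta = [], [], [], []
    for o, d, s in zip(origin, direction, cell_size):
        c = math.floor(o / s)
        cell.append(c)
        if d > 0:
            step.append(1)
            t_max.append(((c + 1) * s - o) / d)  # distance to next boundary
            t_delta.append(s / d)                # distance to cross one cell
        elif d < 0:
            step.append(-1)
            t_max.append((c * s - o) / d)
            t_delta.append(-s / d)
        else:
            step.append(0)
            t_max.append(math.inf)
            t_delta.append(math.inf)
    return {"state": INTERSECT, "cell": cell, "step": step,
            "t_max": t_max, "t_delta": t_delta, "colour": None}


def advance(ray, resolution, occupied, hit_test):
    """One pass over one ray: do the work for its current state."""
    ray = dict(ray, cell=list(ray["cell"]), t_max=list(ray["t_max"]))
    if ray["state"] == TRAVERSE:
        # 3D DDA step: cross the nearest cell boundary.
        axis = ray["t_max"].index(min(ray["t_max"]))
        ray["cell"][axis] += ray["step"][axis]
        ray["t_max"][axis] += ray["t_delta"][axis]
        if not all(0 <= c < n for c, n in zip(ray["cell"], resolution)):
            ray["state"] = DONE                  # left the grid: a miss
        elif tuple(ray["cell"]) in occupied:
            ray["state"] = INTERSECT
    elif ray["state"] == INTERSECT:
        cell = tuple(ray["cell"])
        hit = cell in occupied and hit_test(cell)  # stand-in for triangle tests
        ray["state"] = SHADE if hit else TRAVERSE
    elif ray["state"] == SHADE:
        ray["colour"] = (1.0, 1.0, 1.0)          # placeholder shading
        ray["state"] = DONE
    return ray


def trace(rays, resolution, occupied, hit_test, max_passes=1000):
    """Advance every ray once per pass until all rays are DONE."""
    passes = 0
    while passes < max_passes and any(r["state"] != DONE for r in rays):
        rays = [advance(r, resolution, occupied, hit_test) for r in rays]
        passes += 1
    return rays, passes
```

In this sketch every ray is touched in every pass, even when it is already finished, which hints at why skipping finished rays matters for speed on real hardware.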

The ray tracer was written in a span of two weeks and is therefore simpler than it could have been, but it nevertheless demonstrates the technique explained by Purcell. It would be simple to add levels of recursion, producing a path tracer, and it could also be extended with a photon mapper[3].

It is also very slow compared to what the CPU would manage. This is partly due to the lack of multiple render targets on my GPU (causing me to make _many_ context switches between texture render targets) and partly due to the fact that no early-z culling technique is used. Good speed was never really achieved, and the implementation should be looked at as something interesting to play with rather than something new and exciting (after all, I just implemented what was described in [2], which has been done before).

The document at gives an extensive overview of the implementation. For others thinking about implementing the algorithm, it might prove useful (or a complete waste of time, what do I know? :) )

\Thomas Mølhave





Copyright 1999-2008 (C) FLIPCODE.COM and/or the original content author(s). All rights reserved.