salaark/CUDA-Rasterizer

 
 

CUDA Rasterizer

[Image: SSAA]

Introduction

My GPU rasterizer produces real-time, interactive renders of complex geometry. In general, rasterization involves checking the pixels inside each triangle's screen-space bounding box and using barycentric coordinates to determine coverage, color, and depth for each fragment. The resulting fragment buffer is then lit using Lambert shading. Scene data is loaded from the glTF file format and rendered using the pipeline outlined below.
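To make the bounding-box and barycentric test concrete, here is a minimal CPU-side C++ sketch. It is not the repository's CUDA kernel: the names `Vec2`, `Bary`, `barycentric`, and `rasterizeTriangle` are hypothetical, and sampling at pixel centers with a "smaller depth wins" rule are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Vec2 { float x, y; };

// Twice the signed area of triangle (a, b, c); the sign encodes winding.
inline float signedArea2(Vec2 a, Vec2 b, Vec2 c) {
    return (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
}

struct Bary { float w0, w1, w2; bool inside; };

// Barycentric weights of p with respect to (a, b, c).
// Dividing by the signed area makes the test independent of winding.
inline Bary barycentric(Vec2 a, Vec2 b, Vec2 c, Vec2 p) {
    const float area = signedArea2(a, b, c);
    if (area == 0.0f) return {0.0f, 0.0f, 0.0f, false};  // degenerate triangle
    const float w0 = signedArea2(b, c, p) / area;
    const float w1 = signedArea2(c, a, p) / area;
    const float w2 = signedArea2(a, b, p) / area;
    return {w0, w1, w2, w0 >= 0.0f && w1 >= 0.0f && w2 >= 0.0f};
}

// Visits every pixel in the triangle's bounding box (clamped to the screen),
// tests the pixel center, and keeps the nearest interpolated depth.
// Returns the number of pixels whose depth was updated.
int rasterizeTriangle(const Vec2 v[3], const float z[3],
                      int width, int height, std::vector<float>& depth) {
    assert(depth.size() == static_cast<std::size_t>(width) * height);
    const int minX = std::max(0, (int)std::floor(std::min({v[0].x, v[1].x, v[2].x})));
    const int maxX = std::min(width - 1, (int)std::ceil(std::max({v[0].x, v[1].x, v[2].x})));
    const int minY = std::max(0, (int)std::floor(std::min({v[0].y, v[1].y, v[2].y})));
    const int maxY = std::min(height - 1, (int)std::ceil(std::max({v[0].y, v[1].y, v[2].y})));

    int updated = 0;
    for (int y = minY; y <= maxY; ++y) {
        for (int x = minX; x <= maxX; ++x) {
            const Vec2 p{x + 0.5f, y + 0.5f};  // pixel center (assumption)
            const Bary b = barycentric(v[0], v[1], v[2], p);
            if (!b.inside) continue;
            const float zp = b.w0 * z[0] + b.w1 * z[1] + b.w2 * z[2];
            float& stored = depth[static_cast<std::size_t>(y) * width + x];
            if (zp < stored) { stored = zp; ++updated; }
        }
    }
    return updated;
}
```

On the GPU, the same loop would typically run once per triangle in its own thread, which is why the depth write needs the atomic handling listed in the pipeline.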

Pipeline

  1. Vertex shading
  2. Primitive assembly with support for triangles read from buffers of index and vertex data
  3. Scanline rasterization
  4. Fragment shading with lighting scheme
  5. Depth buffer for storing and depth testing fragments
  6. Fragment-to-depth-buffer writing (with atomics for race avoidance)
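The race in step 6 arises when several triangles cover the same pixel and write depth concurrently. The sketch below shows one common way to avoid it, an atomic minimum on an integer-encoded depth, written here with `std::atomic` as a CPU stand-in for CUDA's `atomicMin`. The helper names `quantizeDepth`, `atomicMinInt`, and `depthTest`, the 24-bit depth scale, and the assumption that depth lies in [0, 1] are all illustrative and not taken from the repository.

```cpp
#include <atomic>
#include <cassert>
#include <climits>

// Maps a depth in [0, 1] to an integer so it can be compared atomically.
// The 24-bit scale is an arbitrary choice for this sketch.
inline int quantizeDepth(float z) {
    return static_cast<int>(z * 16777216.0f);
}

// Atomically stores min(target, candidate) and returns the value seen
// before the update, mirroring the return convention of CUDA's atomicMin.
inline int atomicMinInt(std::atomic<int>& target, int candidate) {
    int current = target.load();
    // On failure, compare_exchange_weak reloads `current`, so the loop
    // retries until the candidate is stored or is no longer smaller.
    while (candidate < current &&
           !target.compare_exchange_weak(current, candidate)) {
    }
    return current;
}

// Returns true when this fragment became the nearest one for the pixel.
inline bool depthTest(std::atomic<int>& pixelDepth, float z) {
    const int q = quantizeDepth(z);
    return atomicMinInt(pixelDepth, q) > q;
}
```

A pixel's depth would start at `INT_MAX`. Each fragment then calls `depthTest`, and the smallest depth ends up stored regardless of the order in which threads arrive.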

Analysis

SSAA

[Chart: SSAA FPS]

Super-sample anti-aliasing (SSAA) renders a higher-resolution image and then samples groups of pixels when reducing it to the output resolution. The technique comes at a clear performance cost, as shown in this chart. The test scene was a duck filling almost half of the screen. SSAAx2 renders 4x as many pixels. One optimization was to downsample before transferring from the fragment buffer to the frame buffer, which allows a smaller frame buffer and passes less data to the "sendImageToPBO" kernel. This could be optimized further by passing less data into the fragment buffer.
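The downsampling step can be sketched on the CPU as a box filter that averages each `factor` x `factor` block of high-resolution pixels into one output pixel. Plain averaging is an assumption about how the samples are combined, and the `Color` struct and `downsample` function are hypothetical names for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Color { float r, g, b; };

// Reduces a (outW * factor) x (outH * factor) image to outW x outH by
// averaging each factor x factor block (a box filter).
std::vector<Color> downsample(const std::vector<Color>& hi,
                              int outW, int outH, int factor) {
    assert(factor > 0);
    const std::size_t hiW = static_cast<std::size_t>(outW) * factor;
    const std::size_t hiH = static_cast<std::size_t>(outH) * factor;
    assert(hi.size() == hiW * hiH);

    const float inv = 1.0f / static_cast<float>(factor * factor);
    std::vector<Color> out(static_cast<std::size_t>(outW) * outH);
    for (int y = 0; y < outH; ++y) {
        for (int x = 0; x < outW; ++x) {
            Color sum{0.0f, 0.0f, 0.0f};
            for (int sy = 0; sy < factor; ++sy) {
                for (int sx = 0; sx < factor; ++sx) {
                    const std::size_t row = static_cast<std::size_t>(y) * factor + sy;
                    const std::size_t col = static_cast<std::size_t>(x) * factor + sx;
                    const Color& s = hi[row * hiW + col];
                    sum.r += s.r; sum.g += s.g; sum.b += s.b;
                }
            }
            out[static_cast<std::size_t>(y) * outW + x] =
                {sum.r * inv, sum.g * inv, sum.b * inv};
        }
    }
    return out;
}
```

With `factor = 2` each output pixel reads four samples, which matches the 4x pixel count noted for SSAAx2.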

[Chart: SSAA FPS]

Another feature is bilinear UV interpolation for texture lookups, which reduces artifacts and improves texture quality. It comes at a very small performance cost, as shown in the chart. To keep it fast, the algorithm does minimal computation beyond the interpolation itself (multiplying only at the end, with few local variables). Shared memory might speed up the memory accesses, though the low performance cost may not merit further optimization. The rasterizer also interpolates vertex colors at negligible performance cost, since the color only needs to be stored and interpolated once using barycentric coordinates.
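For reference, a bilinear texture lookup can be sketched in C++ as below. This is an illustration rather than the repository's kernel: the single-channel `Texture` struct, the `sampleBilinear` name, the mapping of UV in [0, 1] onto texel indices 0 to size - 1, and clamping at the edges are all assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Single-channel texture for brevity; a color texture would repeat the
// same interpolation per channel.
struct Texture {
    int width, height;
    std::vector<float> texels;  // row-major, width * height entries
    float at(int x, int y) const {
        return texels[static_cast<std::size_t>(y) * width + x];
    }
};

inline float lerp(float a, float b, float t) { return a + (b - a) * t; }

// Blends the four texels surrounding (u, v), weighted by the fractional
// position of the sample between them.
float sampleBilinear(const Texture& tex, float u, float v) {
    assert(tex.width > 0 && tex.height > 0);
    const float x = u * (tex.width - 1);
    const float y = v * (tex.height - 1);
    const int x0 = std::min(std::max((int)std::floor(x), 0), tex.width - 1);
    const int y0 = std::min(std::max((int)std::floor(y), 0), tex.height - 1);
    const int x1 = std::min(x0 + 1, tex.width - 1);   // clamp at right edge
    const int y1 = std::min(y0 + 1, tex.height - 1);  // clamp at bottom edge
    const float fx = x - x0;
    const float fy = y - y0;
    const float top = lerp(tex.at(x0, y0), tex.at(x1, y0), fx);
    const float bottom = lerp(tex.at(x0, y1), tex.at(x1, y1), fx);
    return lerp(top, bottom, fy);
}
```

Each lookup costs four texel reads and three interpolations, which fits the small overhead described above.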
