• Publications

Widmer, Sven; Pajak, David; Schulz, André; Pulli, Kari; Kautz, Jan; Goesele, Michael; Luebke, David

An Adaptive Acceleration Structure for Screen-space Ray Tracing

2015

Proceedings of the 7th Conference on High-Performance Graphics 2015

High-Performance Graphics (HPG) <7, 2015, Los Angeles, CA, USA>

We propose an efficient acceleration structure for real-time screen-space ray tracing. The hybrid data structure represents the scene geometry by combining a bounding volume hierarchy with local planar approximations. This enables fast empty-space skipping while tracing and yields exact intersection points for the planar approximation. In combination with an occlusion-aware ray traversal, our algorithm can quickly trace even multiple depth layers. Compared to prior work, our technique improves the accuracy of the results, is more general, and allows for advanced image transformations, as all pixels can cast rays in arbitrary directions. We demonstrate real-time performance for several applications, including depth-of-field rendering, stereo warping, and screen-space ray-traced reflections.
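
To illustrate why a planar approximation admits an exact intersection point, here is a minimal sketch of a ray-plane test. It is not taken from the paper: the function name `intersect_plane`, the plane representation `dot(normal, x) = offset`, and the tolerance `eps` are all assumptions made for illustration.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def intersect_plane(origin: Vec3, direction: Vec3,
                    normal: Vec3, offset: float,
                    eps: float = 1e-9) -> Optional[float]:
    """Return t such that origin + t * direction lies on the plane
    dot(normal, x) = offset, or None if there is no hit in front of the ray.

    Hypothetical helper: the plane stands in for one local planar
    approximation of the scene geometry.
    """
    denom = dot(normal, direction)
    if abs(denom) < eps:
        # Ray is (nearly) parallel to the plane: no single hit point.
        return None
    t = (offset - dot(normal, origin)) / denom
    # Only accept intersections in front of the ray origin.
    return t if t >= 0.0 else None
```

Because the hit parameter is obtained in closed form, no iterative stepping along the ray is needed once a ray reaches a planar patch; how the paper's traversal selects the patch is not shown here.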

Wodniok, Dominik; Schulz, André; Widmer, Sven; Goesele, Michael

Analysis of Cache Behavior and Performance of Different BVH Memory Layouts for Tracing Incoherent Rays

2013

EG PGV 2013

Eurographics Symposium on Parallel Graphics and Visualization (EGPGV) <13, 2013, Girona, Spain>

With CPUs moving towards many-core architectures and GPUs becoming more general-purpose architectures, path tracing can now be well parallelized on commodity hardware. While parallelization is trivial in theory, properties of real hardware make efficient parallelization difficult, especially when tracing incoherent rays. We investigate how different bounding volume hierarchy (BVH) and node memory layouts, as well as storing the BVH in different memory areas, affect the ray tracing performance of a GPU path tracer. We optimize the BVH layout using information gathered in a pre-processing pass, applying a number of different BVH reordering techniques. Depending on the memory area and scene complexity, we achieve moderate speedups.
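
As a rough illustration of what changing a BVH memory layout means, the sketch below rewrites a node array into depth-first or breadth-first order and remaps the child indices. These two orders are generic textbook layouts chosen for illustration, not necessarily the reordering techniques evaluated in the paper; the node representation and the names `traversal_order` and `relayout` are assumptions.

```python
from collections import deque
from typing import List, Optional, Tuple

# Assumed node format: (left_child_index, right_child_index).
# Leaves are stored as (None, None).
Node = Tuple[Optional[int], Optional[int]]


def traversal_order(nodes: List[Node], root: int,
                    breadth_first: bool) -> List[int]:
    """Return the old node indices in the order they should appear in memory."""
    order: List[int] = []
    pending = deque([root])
    while pending:
        # Queue behaviour gives breadth-first, stack behaviour depth-first.
        idx = pending.popleft() if breadth_first else pending.pop()
        order.append(idx)
        children = [c for c in nodes[idx] if c is not None]
        # For depth-first, push the right child first so the left child
        # is popped (and therefore placed) directly after its parent.
        pending.extend(children if breadth_first else reversed(children))
    return order


def relayout(nodes: List[Node], root: int, breadth_first: bool) -> List[Node]:
    """Build a new node array in the chosen order with remapped child indices."""
    order = traversal_order(nodes, root, breadth_first)
    new_index = {old: new for new, old in enumerate(order)}

    def remap(child: Optional[int]) -> Optional[int]:
        return None if child is None else new_index[child]

    return [(remap(nodes[old][0]), remap(nodes[old][1])) for old in order]
```

In the depth-first layout a parent and its left child are adjacent in memory, while the breadth-first layout keeps siblings adjacent; which of these (if either) benefits cache behavior depends on the hardware and the ray distribution.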

Schulz, André; Wodniok, Dominik; Widmer, Sven; Goesele, Michael

Extended Data Collection: Analysis of Cache Behavior and Performance of Different BVH Memory Layouts for Tracing Incoherent Rays

2013

This technical report complements the paper "Analysis of Cache Behavior and Performance of Different BVH Memory Layouts for Tracing Incoherent Rays" by Wodniok et al., published in the proceedings of the Eurographics Symposium on Parallel Graphics and Visualization. Please see the main paper for details. The purpose of this report is to publish the complete data collection on which the paper is based, gathered on the NVIDIA Kepler architecture, plus additional data collected on the NVIDIA Fermi architecture.

Schulz, André; Goesele, Michael [Reviewer]

Improving Cache usage of Tracing Incoherent Rays on GPUs

2012

Darmstadt, TU, Diploma thesis, 2012

Path tracing and related global illumination techniques create beautiful photorealistic images, but computing these images is expensive. The process can be sped up by parallelization, as it is embarrassingly parallel. With recent developments in CPU technology moving towards many-core architectures and GPUs becoming more general-purpose architectures, path tracing can now be parallelized on commodity hardware. Unfortunately, while the parallelization is trivial in theory, in practice hardware details make it more difficult, especially for tracing incoherent rays. This thesis investigates how different bounding volume hierarchy (BVH) and node memory layouts, as well as accessing the BVH in different memory areas, affect the ray tracing performance of a path tracer on a many-core, wide-SIMD architecture by NVIDIA, the Tesla C2070. Furthermore, we optimize the BVH layout using information gathered in a pre-processing pass, which feeds a number of different BVH reordering techniques. Depending on the memory area and complexity of the scene, we achieve a speedup ranging from negligible to moderate.