1 post published by Kostas Anagnostou during August 2025| Interplay of Light
Drill deep into a GPU’s architecture and at its heart you will find a large number of SIMD units whose purpose is to read data, perform some vector or scalar ALU (VALU or SALU) operation on i…| Interplay of Light
GPUs make work parallelism very easy by design: each drawcall/dispatch shader instruction operates on batches of vertices, pixels, threads in general at the same time automatically. On the other ha…| Interplay of Light
Mesh shaders, introduced back in 2018 as an NVidia Turing feature and later as an AMD RDNA2 feature, are an evolution of the geometry pipeline which removes a number of fixed function units like the Input Assembler and Tessellator as well as the Vertex shader/Domain Shader/Geometry Shader stages and replaces them with a simpler, programmable pipeline […]| Interplay of Light
“Low-level thinking in high-level shading languages” (Emil Persson, 2013), along with its followup “Low-level Shader Optimization for Next-Gen and DX11”, is in my top 3 most influential presentations, one that changed the way I think about shader programming in general (since I know you are wondering, the other 2 are Naty Hoffman’s Physically Based Shading […]| Interplay of Light
Recently I started exploring ReSTIR, using mainly the Gentle Introduction to ReSTIR Siggraph course and the original paper. I began with direct illumination (ReSTIR DI), to quickly set it up and get something working. ReSTIR is a very interesting technique that gives great results but there is a lot of Maths behind it that might […]| Interplay of Light
In the previous blog post I discussed how raytracing can be used to achieve order independent transparency (OIT) for some types of transparencies and how it compares to other OIT methods like per pixel linked lists and Multi-layer Alpha blending (MLAB). The basic idea, since DXR doesn’t support distance sorted traversal of the BVH, was […]| Interplay of Light
I had a good question through Twitter DMs about what occupancy is and why it is important for shader performance, so I am expanding my answer into a quick blog post. First some context: GPUs, while ru…| Interplay of Light
A few days ago I posted a screenshot of the long shader ISA code produced by the RGA compiler for a single atan2() instruction. The post got quite a lot of engagement and it felt like a lot of peopl…| Interplay of Light
In the previous blog post I described a simple workgraph implementation of a hybrid shadowing system. It was based on a tile classification system with 3 levels (or nodes in workgraph parlance), on…| Interplay of Light
Workgraphs is a new feature added recently to DirectX12 with hardware support from NVidia and AMD. It aims to enable a GPU to produce and consume work without involving the CPU in dispatching that …| Interplay of Light
About a year ago I reviewed a number of Order Independent Transparency (OIT) techniques (part 1, part 2, part 3), each achieving a different combination of performance, quality and memory requirem…| Interplay of Light
With recent GPUs and shader models there is good support for 16 bit floating point numbers and operations in shaders. On paper, the main advantages of the fp16 representation are that it allows p…| Interplay of Light
It is common knowledge that removing unnecessary work is a crucial mechanism for achieving good performance on the GPU. We routinely create lists of visible model instances, for example using frustum…| Interplay of Light
Today I set out to replace the old SSR implementation in the toy engine with AMD’s FidelityFX one, but in the end I got distracted and spent the day studying how it works instead. This…| Interplay of Light
In the previous blog post we discussed how to use a per-pixel linked list (PPLL) to implement order independent transparency and how the unbounded nature of overlapping transparent surfaces can be …| Interplay of Light