segunda-feira, 19 de agosto de 2013

Implementing the Diffusion DoF

As explained previously, the diffusion DoF algorithm can be separated in the following steps:


  • Calculate the matrix coefficients (a, b, c) based on scene CoC for each row of the image.
  • Solve the tridiagonal system for each row.
  • Calculate the matrix coefficients (a, b, c) based on scene CoC for each column of the image.
  • Solve the  tridiagonal system for each column.

Calculating the coefficients

From the previous post one know that:

a = -s_{i-1}
b = 1 + s_{i-1} + s_i
c = -s_i

The index is used because the CoC changes between pixels.
And with some math explained here one know that:

s_i = CoC_i^2

Solving the system

Solving the system is the more complicated step. As you probably noticed, solving it in CPU with the TDMA algorithm is easy but this algorithm is serial, thus we have to find a more appropriated algorithm.
The algorithm I choose was the Cyclic Reduction. This algorithm, given a tridiagonal system, reduces the original system to a new one with half the variables. Therefore one can recursively reduce it until one reaches a system with fewer variables that can be solved fast. After that one can perform a back substitution in the system of n variables using the result from system n/2 until one reaches the final solution.

Supose one have the following system:

a_1x_1 + b_1x_2 + c_1x_3 = y_1
a_2x_2 + b_2x_3 + c_2x_4 = y_2
a_3x_3 + b_3x_4 + c_3x_5 = y_3


let:
k_1 = - \frac{a_2}{b_1}
k_2 = - \frac{c_2}{b_3}

Then multiply eq 1 by k1 , eq 3 by k2 and sum the three eqs:


(a_1k_1)x_1 + (c_1k_1 + b_2 + a_3k_2)x_3 + (c_3k_2)x_5 = y_1k_1 + y_2 + y_3 k_2

So now one have a reduced system also tridiagonal and with only odd indices. One can recursively apply this technique until reach a system with only one or two unknowns.
Thus the new coefficients are:

a_1^{'} = a_1 k_1
b_{1}^{'} = c_{1} k_{1} + b_{2} + a_{3} k_{2}
c_{1}^{'} = c_{3} k_{2}
y_{1}^{'} = y_{1} k_{1} + y_{2} + y_{3} k_{2}


Then after reducing and solving the odd indices one can perform the back substitution solving the even indices:



a_1x_1 + b_1x_2 + c_1x_3 = y_1
x_2 = \frac{(y_1 - a_1x_1 - c_1x_3)}{b_1}

With this algorithm one can reduce in parallel all odd indices and solve in parallel all even indices.
Notice that one can also reduce both odd and even indices at the same time, getting two systems with half unknowns, with this method in each step one doubles the number of systems and reduces the unknowns by half, this is called PCR (Parallel Cyclic Reduction).

So this algorithm is pretty simple but its implementation using fragment shaders was painful.
As we are solving a system with CR, any deviation in the values can cause a huge difference in the result because the error propagates in each step. Thus the use of texture coordinates to access the data lead to some problems:
  • A small offset in the texcoord causes interpolation between neighborhood pixels leading to wrong values.
  • We must treat  access to out of bounds texture values.
  • In back substitution stage we must 'process' only even pixels and copy odd ones.
Then when I'm writing code, always passes a mistake or another, and you can't easily debug shaders, or even notice a small deviation in pixel colors (in that case the problem variables). Summing all that, it took a whole week and some days to get everything working properly. But in the end it was rewarding to see the result.

 
Notice that the video quality isn't very good, it looks much better live.

sexta-feira, 9 de agosto de 2013

DoF techniques

This week I started to research about DoF (Depth of Field) techniques, besides the classical gaussian blur I found 2 interesting techniques: the hexagonal DoF with bokeh from Frostbite2 and the Diffusion DoF used in Metro2033.
For the first, I was able to implement the hexagonal blur, but I yet didn't figured out how to compose the final image based on the blured image and the per-pixel CoC (Circle of Confusion). Simply interpolating between the original image and the blurred based on the CoC didn't looks good, and also there is the bleeding of in-focus/out-of-focus pixels colors. Here are some images (notice that I'm faking hdr to make brighter spots more perceptible)


Diffusion DOF

On the other hand the Diffusion DoF, in a elegantly way, uses the heat diffusion equation to implement the DoF effect. This method treat the image as a medium with each pixel color being its heat. Considering the pixel CoC as the heat conductivity of the pixel, we can simulate the heat diffusion  "spreading" the color among the pixels. Following we have the heat equation:

\frac{\partial u}{\partial t} = \beta \left(\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}\right)

Using the Alternate Direction Implicit (ADI) method we alternate the directions (x,y) solving two one dimensional problem at each time step. This 2D ADI method is unconditionally stable, this means that we can use any time step we want.
Simplifying the notation we have:

u_j^k = u_j^{k+1} + \frac{\beta \Delta t}{\Delta x^2}\( u_{j+1}^{k+1} - 2u_j^{k+1} + u_{j-1}^{k+1}\)
u_j^k = \(1+2s\)u_j^{k+1} - s\(u_{j+1}^{k+1} + u_{j-1}^{k+1}\)
where: s = \frac{\beta \Delta t}{\Delta x^2} , and u_{j}^{k} is the color in position j on time k.
 As the new value of u in the time k+1 depends on its adjacents values, we need to solve a system where:
\left( \begin{array}{cccccccc} b & c & & & & & & 0 \\ a & b & c & & & & & \\ & & & & & & & \\ & & & (...) & & & & \\ & & & & & & & \\ & & & & a & b & c & \\ & & & & & a & b & c\\ 0 & & & & & & a & b\\ \end{array} \right) \times \left( \begin{array}{c} u_0^{k+1} \\ u_1^{k+1} \\ \\ (...) \\ \\ \\ u_{n-1}^{k+1} \\ u_n^{k+1} \\ \end{array} \right) = \left( \begin{array}{c} u_0^k \\ u_1^{k} \\ \\ (...) \\ \\ \\ u_{n-1}^{k} \\ u_n^{k+1} \\ \end{array} \right)
with:
a = -s
b = 1 + 2s
c = -s

For this kind of matrix (tridiagonal)  we have fast methods to to solve the system, in my CPU implementation, I used the TDMA (also known as the Thomas algorithm).
Therefore we first have to calculate the diffusion in x direction and them in the y direction. Notice that for each row/column, we need to solve one system.

Before going to implement it in CrystalSpace, I implemented  it first in CPU to have a reference and here are the result:

Original image

CoC image

First step(diffusion in x direction)

Second step(diffusion in y direction)

You can download the code here.