### Review of CG method: IV

I have shown that in the method of conjugate directions, the search directions are in some sense orthogonal to each other:

$d_i^TAd_j=0,\quad i\neq j$

This makes more sense when the whole thing is considered in a new coordinate system, obtained by applying the transform $L$, where $L$ is the upper-triangular factor of $A$'s Cholesky decomposition, $A=L^TL$:

$d_i^TAd_j=0\Longleftrightarrow (Ld_i)^T(Ld_j)=0\Longleftrightarrow Ld_i\perp Ld_j$
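As a quick numerical check of this equivalence, here is a NumPy sketch (the matrix and vectors are arbitrary illustrative examples, not from the text) that builds a small SPD matrix, constructs two $A$-orthogonal vectors, and verifies that their images under $L$ are orthogonal in the ordinary sense:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 4x4 SPD test matrix: A = M^T M + I is symmetric positive definite.
M = rng.standard_normal((4, 4))
A = M.T @ M + np.eye(4)

# numpy returns the lower-triangular factor C with A = C C^T;
# the upper-triangular factor used in the text is L = C^T, so that A = L^T L.
L = np.linalg.cholesky(A).T

# Two A-orthogonal vectors: pick d0, then A-orthogonalize v against it.
d0 = rng.standard_normal(4)
v = rng.standard_normal(4)
d1 = v - (d0 @ A @ v) / (d0 @ A @ d0) * d0

print(d0 @ A @ d1)          # ~0: d0 and d1 are A-orthogonal
print((L @ d0) @ (L @ d1))  # ~0: L d0 and L d1 are orthogonal in the usual sense
```

Both printed values agree up to round-off, since $(Ld_i)^T(Ld_j)=d_i^TL^TLd_j=d_i^TAd_j$.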

A similar conclusion is that $d_i$ is $A$-orthogonal to the error $e_{i+1}$, which implies that $d_i$ is orthogonal to the residual $r_{i+1}$:

$d_i^Tr_{i+1}=0\Rightarrow d_i^Tr_j=0,\quad i<j$

This condition implies that one can obtain each direction $d_i$ by processing the residual $r_i$ with Gram-Schmidt conjugation, and this is the conjugate gradient (CG) method. The advantage of CG is that, instead of fixing a whole set of directions $d_i$ at the beginning, each $r_i$ is obtained along with the iteration, because the process of Gram-Schmidt conjugation is "triangular", i.e., each quantity is determined only by its predecessors. In particular, starting from $r_0=b-Ax_0$ (so the first step is the same as in the steepest descent method), the subsequent search directions $d_i$ are determined by

$d_i=r_i+\sum_{k=0}^{i-1}\alpha_{ik}d_k$

The coefficients $\alpha_{ik}$ are obtained from the conjugacy condition (for $j<i$):

$d_i^TAd_j=0\Longrightarrow r_i^TAd_j+\sum_{k=0}^{i-1}\alpha_{ik}d_k^TAd_j=r_i^TAd_j+\alpha_{ij}d_j^TAd_j=0$

Then we have $\alpha_{ij}=-r_i^TAd_j/(d_j^TAd_j)$
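As a sanity check of this construction, the following NumPy sketch (the system is a hypothetical random SPD example) runs the full Gram-Schmidt conjugation, orthogonalizing each residual against *all* previous directions with the $\alpha_{ik}$ formula just derived, and then prints the quantities $r_i^TAd_k$ for $k<i-1$, which turn out to vanish:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical small SPD system (illustrative only).
M = rng.standard_normal((5, 5))
A = M.T @ M + np.eye(5)
b = rng.standard_normal(5)

x = np.zeros(5)
r = b - A @ x            # r_0 = b - A x_0
ds, rs = [], []

for i in range(5):
    rs.append(r.copy())
    # Full Gram-Schmidt conjugation: d_i = r_i + sum_k alpha_ik d_k,
    # with alpha_ik = -r_i^T A d_k / (d_k^T A d_k).
    d = r.copy()
    for dk in ds:
        alpha = -(r @ A @ dk) / (dk @ A @ dk)
        d += alpha * dk
    ds.append(d)
    l = (r @ r) / (d @ A @ d)   # exact line-search step length
    x = x + l * d
    r = r - l * A @ d

# r_i^T A d_k vanishes for k < i-1, so only alpha_{i,i-1} survives.
i = 4
for k in range(i - 1):
    print(abs(rs[i] @ A @ ds[k]))  # ~0
```

This is exactly the simplification exploited below: only the coefficient coupling $d_i$ to the most recent direction $d_{i-1}$ is nonzero.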

In order to simplify this expression, remember that during this process of generating $d_i$ from $r_i$, for any $i$,

$\text{span}\{d_0,d_1,\cdots,d_i\}=\text{span}\{r_0,r_1,\cdots,r_i\}$

so $d_i^Tr_j=0\Longrightarrow r_i^Tr_j=0,\quad i\neq j$

The Gram-Schmidt process also indicates that $d_i^Tr_i=r_i^Tr_i$ (prove it). The above two identities can be used to simplify $\alpha_{ij}$ and to prove that

$\alpha_{ij}=r_i^Tr_i/(l_{i-1}d_{i-1}^TAd_{i-1}),\quad j=i-1$

$\alpha_{ij}=0,\quad j<i-1$

and, since the exact line search gives $l_{i-1}d_{i-1}^TAd_{i-1}=r_{i-1}^Tr_{i-1}$, eventually

$\alpha_{i,i-1}=r_i^Tr_i/(r_{i-1}^Tr_{i-1})$

Finally, when $d_i,x_i,r_i$ are known, a complete iteration consists of

$l_i=r_i^Tr_i/d_i^TAd_i\longrightarrow x_{i+1}=x_i+l_id_i\longrightarrow r_{i+1}=r_i-l_iAd_i$

$\alpha_{i+1}\equiv\alpha_{i+1,i}=r_{i+1}^Tr_{i+1}/(r_i^Tr_i)\longrightarrow d_{i+1}=r_{i+1}+\alpha_{i+1}d_i$
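Putting the pieces together, here is a minimal NumPy sketch of the full CG iteration (the test system is a hypothetical random SPD matrix; variable names follow the equations above):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical SPD system A x = b (illustrative, not from the text).
n = 50
M = rng.standard_normal((n, n))
A = M.T @ M + n * np.eye(n)
b = rng.standard_normal(n)

def cg(A, b, x0=None, tol=1e-10, max_iter=None):
    """Conjugate gradient iteration, following the update formulas above."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x          # r_0 = b - A x_0
    d = r.copy()           # d_0 = r_0: first step is steepest descent
    for _ in range(max_iter or len(b)):
        Ad = A @ d
        l = (r @ r) / (d @ Ad)            # step length l_i
        x = x + l * d                     # x_{i+1} = x_i + l_i d_i
        r_new = r - l * Ad                # r_{i+1} = r_i - l_i A d_i
        if np.linalg.norm(r_new) < tol:
            return x
        alpha = (r_new @ r_new) / (r @ r) # alpha_{i+1} = r_{i+1}^T r_{i+1} / r_i^T r_i
        d = r_new + alpha * d             # d_{i+1} = r_{i+1} + alpha_{i+1} d_i
        r = r_new
    return x

x = cg(A, b, max_iter=10 * n)
print(np.linalg.norm(A @ x - b))  # small residual: x solves the system
```

Note that each iteration needs only one matrix-vector product $Ad_i$, reused for both the step length and the residual update, and that only the most recent direction has to be stored, precisely because of the "triangular" structure derived above.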