FEM, CFD, CAE, and all that.

## Tag: Math

### Cantor set by Mathematica

Essentially one can not really show the complete Cantor set, here by its definition is what’s left after first n remove of “one-third” open sets. The Mathematica script is then:

Cantor[x__, n_] :=
Nest[Join[Partition[#[[All, 1]], 1],
Partition[#[[All, 1]] + (#[[All, 2]] - #[[All, 1]])/3, 1],
2] \[Union]
Join[Partition[#[[All, 1]] + 2 (#[[All, 2]] - #[[All, 1]])/3, 1],
Partition[#[[All, 2]], 1], 2] &, x, n]

Now

Cantor[{{0, 1}}, 2]

gives

{{0, 1/9}, {2/9, 1/3}, {2/3, 7/9}, {8/9, 1}}

where the first argument is the interval to be sliced, and the second argument gives the level (iteration times) of generation. The output is the intervals left after the remove action. Try

Cantor[{{0, 1}}, 3]

we have

{{0, 1/27}, {2/27, 1/9}, {2/9, 7/27}, {8/27, 1/3}, {2/3, 19/27}, {20/
27, 7/9}, {8/9, 25/27}, {26/27, 1}}

and

Cantor[{{0, 1}}, 4]

{{0, 1/81}, {2/81, 1/27}, {2/27, 7/81}, {8/81, 1/9}, {2/9, 19/
81}, {20/81, 7/27}, {8/27, 25/81}, {26/81, 1/3}, {2/3, 55/81}, {56/
81, 19/27}, {20/27, 61/81}, {62/81, 7/9}, {8/9, 73/81}, {74/81, 25/
27}, {26/27, 79/81}, {80/81, 1}}

and

Cantor[{{0, 1}}, 5]

{{0, 1/243}, {2/243, 1/81}, {2/81, 7/243}, {8/243, 1/27}, {2/27, 19/
243}, {20/243, 7/81}, {8/81, 25/243}, {26/243, 1/9}, {2/9, 55/
243}, {56/243, 19/81}, {20/81, 61/243}, {62/243, 7/27}, {8/27, 73/
243}, {74/243, 25/81}, {26/81, 79/243}, {80/243, 1/3}, {2/3, 163/
243}, {164/243, 55/81}, {56/81, 169/243}, {170/243, 19/27}, {20/27,
181/243}, {182/243, 61/81}, {62/81, 187/243}, {188/243, 7/9}, {8/9,
217/243}, {218/243, 73/81}, {74/81, 223/243}, {224/243, 25/27}, {26/
27, 235/243}, {236/243, 79/81}, {80/81, 241/243}, {242/243, 1}}

Plotting could be done using

PlotCantor[{x1_, x2_}] :=
Plot[x = 1, {x, x1, x2}, Filling -> Bottom,
PlotRange -> {{0, 1}, {0, 1}}, Axes -> {True, False}]
Show[PlotCantor /@ Cantor[{{0, 1}}, 2]]
Show[PlotCantor /@ Cantor[{{0, 1}}, 4]]

### Whydomath.org

Whydomath.org was launched recently by SIAM. Based on nodes, it contains many interesting short, multimedia pieces dedicated to promote public math understanding. It’s fun to read between endless code writing and derivation. To be noticed, there is one node for tsunami on the way….

### review of CG method:III

This Mathematica script generates the following picture indicating a typical behavior of steepest descent method: the zigzag in two perpendicular directions. In fact, in 2D this is the only case because one has only two choices, determined by starting direction, as I showed at the beginning. It is natural to ask whether one can use only two step with single change of direction to reach the minimization point. This introduces conjugate direction method.

Assume $x\in R^n$ and the iteration starts with direction $d_0$. The method of conjugate direction tries to reach a point along $d_i$ with condition that in the rest of iteration descent along $\pm d_o$ direction would never happen again, so does for any other following steps. To make sure this happen, we can require that the iterations after $d_i$ always are performed on the subspace perpendicular to $d_i$. Remembering that exact solution $x^{\ast}$ should always be in the subspace in which the coming iterations are to be performed, we come up this scheme:

$x_{i+1}=x_i+l_id_i, \quad d_i\perp (x_{i+1}-x^{\ast})=e_{i+1}=e_i+l_id_i\Longleftrightarrow d_i^T(e_i+l_id_i)=0$

If we view the above as two equations with three unknowns ($x_{i+1},l_i,x^{\ast}$), we know that it’s not going to work. However, since $A$ is SPD matrix, by Cholesky decomposition we have $A=L^TL$ in which $L$ is an upper-triagonal matrix. Then if we consider the orthogonality between $Ld_i$ and $Le_{i+1}$, instead of $d_i$ and $e_{i+1}$, we have

$Ld_i\perp Le_{i+1}\Rightarrow (Ld_i)^T Le_{i+1}=0\Rightarrow d_i^TL^TLe_{i+1}=d_i^TAe_{i+1}=d_i^TA(e_i+l_id_i)=d_i^T(Ax_i-b+Al_id_i)=d_i^T(-r_i+Al_id_i)=0$

Finally we have

$l_i=d_i^Tr_i/d_i^TAd_i$

Of course this can be obtained by using the $\frac{df(x_{i+1})}{d l_i}=0$ requirement as well, but derivation I used here gives better geometric explanation. In order to utilize the fact that we do not know $e_i$ but we know $r_i=b-Ax_i=-Ae_i$, we impose the orthogonality requirement $d_i\perp e_{i+1}$ in the space transformed by $L$. By now one can see that to consider $r_i$ as the effect of $A$ applied on $e_i$ can often serve the similar purpose of $e_i$ the error itself, that’s why sometimes we call $r_i$ a similar name: residual.

The orthogonalilty condition $Ld_i\perp Le_{i+1}\Longleftrightarrow d_i^TAe_{i+1}$ is also called $A$-orthogonality. And by $d_i$ is $A$-orthogonal to the following $e_j$ one can easily see that $d_i$ is $A$-orthogonal to the following $d_j$. Since now we have the formular for $l_i$, with a set of $A$-orthogonal $d_j, j=0,1,\cdots,n-1$ we will be done, and this set can be found by Gram-Schimdt conjugation, a $A$-orthogonal version of Gram-Schimdt orthogonazation process. With those tools in hand, it is not hard to show that this conjugate direction method, whose name comes from that the directions $d_j$ are $A$-orthogonal, or, $A$-conjugate to each other, can indeed reach $x^{\ast}$ within $n$ steps. In fact,  the goal in each interation in conjugate direction method achieved is to eliminate one dimension of the space in which the solution lies and confine the rest of search to a lower dimension hyperplane.

### review of CG method:II

Now let’s examine the convergence property of steepest descent method. For this purpose, define the error against exact solution $x^{\ast}$ at each iteration step

$e_i\equiv x_i-x^{\ast}$

By iteration scheme there is

$e_{i+1}=e_i+l_ir_i$

Since at every step matrix $A$ would be multiplied, and the iterative property of a matrix is largely determined by its eigenvalues, one’s first reaction should be look at the decomposition of the vector space under discussion into its standard forms by

$e_i=\sum_j{\xi_{(i)j}v_j}$

where $\{v_j\}$ is the orthonormal basis of eigenvector space of $A$. By this defitino, there are

$r_i=b-Ax_i=Ax^{\ast}-Ax_i=-Ae_i$

$e_{i+1}=e_i+l_ir_i=e_i+\frac{r_i^Tr_i}{r_i^TAr_i}r_i=e_i+\frac{\sum_j\xi_{(i)j}^2\lambda_j^2}{\sum_j\xi_{(i)j}^2\lambda_j^3}r_i$

In which $\lambda_j$ is eigenvalue corresponding to $v_j$. One can see if the normal/spectral condition number $\kappa(A)\equiv\lambda_{\max}(A)/\lambda_{\min}(A)=0$, i.e. all the eigenvalues of $A$ are same, the error at next step immediately drops to zero. Geometricly this simply means contours are spheres and the steepest descent direction is toward the centre. Generally, there is

$e_{i+1}=(I-l_iA)e_i\Rightarrow \|e_{i+1}\|_A^2=\|e_i\|_A^2\omega_i^2$

where the norm is energy norm induced by $A$

$\|x\|_A=\sqrt{\frac{1}{2}x^TAx}$

and $\omega_i$ (subscript emphasizes the dependence on each step) is determined by spectral condition number $\kappa(A)$ and slope $\mu_i=\xi_{(i)\max}/\xi_{(i)\min}$:

$\omega_i^2=1-\frac{(\kappa^2+\mu_i^2)^2}{(\kappa+\mu_i^2)(\kappa^3+\mu_i^2)}$

Picture below shows the dependence of $\omega$ to two variables. One can see that the slowest convergence happens when both condition number and slope are large. $A$ is referred to as ill-conditioned when $\kappa$ is large.