We consider a fixed point problem for function f:
Banach fixed-point theorem: Let (X, d) be a non-empty complete metric space with a contraction mapping f : X → X. Then f admits a unique fixed point x✶ in X, namely, f(x✶) = x✶. Furthermore, x✶ can be found as follows: start with an arbitrary element x₀ ∈ X and define a sequence {xk}k∈ℕ by xk+1 = f(xk), for k = 0, 1, 2, …. Then \[ \lim_{k\to \infty} {\bf x}_k = {\bf x}^{\ast} . \]
Upon adding and subtracting the identity matrix, we can reduce this problem to fixed point form \[ \mathbf{x} = \mathbf{T}\,\mathbf{x}, \qquad \mathbf{T}\,\mathbf{x} = \left( \mathbf{I} - \mathbf{A} \right) \mathbf{x} + \mathbf{b} . \] If T is a contraction map (e.g., ∥I − A∥ < 1 in some norm), then Banach's theorem ensures that the recurrence \[ \mathbf{x}^{(k+1)} = \mathbf{T}\,\mathbf{x}^{(k)} , \qquad k=0,1,2,\ldots , \] converges to the unique fixed point x✶
, which solves the system. For instance, \[ \mathbf{A} = \begin{bmatrix} 0.5 & 0.2 \\ 0.1 & 0.4 \end{bmatrix} , \qquad \mathbf{b} = \begin{pmatrix} 1 \\ 2 \end{pmatrix} . \] The fixed point formulation for this case reads: \[ T(\mathbf{x} ) = \left( \mathbf{I} - \mathbf{A} \right) \mathbf{x} + \mathbf{b} . \] Then \[ T(\mathbf{x} ) = \begin{bmatrix} 0.5 & -0.2 \\ -0.1 & 0.6 \end{bmatrix} \mathbf{x} + \begin{pmatrix} 1 \\ 2 \end{pmatrix} . \] Now we check that T is a contraction; indeed, I − A has spectral radius 0.7 < 1, and in the ℓ∞ norm \[ \left\| \mathbf{I} - \mathbf{A} \right\|_{\infty} = \max \left\{ 0.5 + 0.2 , \ 0.1 + 0.6 \right\} = 0.7 < 1 . \]
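As a quick numerical check, the iteration x⁽ᵏ⁺¹⁾ = (I − A)x⁽ᵏ⁾ + b can be run in plain Python (a minimal sketch; the helper `T` and the starting vector are our own choices, and the limit (0, 5) is the exact solution of A x = b for this A and b):

```python
# Fixed-point iteration x_{k+1} = (I - A) x_k + b for the 2x2 example above.
A = [[0.5, 0.2], [0.1, 0.4]]
b = [1.0, 2.0]

def T(x):
    # T x = (I - A) x + b, written out componentwise
    return [(1 - A[0][0]) * x[0] - A[0][1] * x[1] + b[0],
            -A[1][0] * x[0] + (1 - A[1][1]) * x[1] + b[1]]

x = [0.0, 0.0]                      # arbitrary starting vector
for k in range(100):
    x = T(x)

# The iterates approach the unique solution of A x = b, namely (0, 5).
print(x)  # approximately [0.0, 5.0]
```

Since ∥I − A∥∞ = 0.7, each step shrinks the error by at least a factor 0.7, so 100 iterations reduce it below machine precision.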
Example 1B: Let us consider the Fredholm-type integral equation \[ u(x) = \int_0^1 K(x,y) \,u(y) \,{\text d}y + f(x) , \] where K(x, y) is a continuous kernel and f ∈ ℭ([0, 1]). We seek a function u ∈ ℭ([0, 1]) satisfying this integral equation.
We define an operator T on the Banach space ℭ([0, 1]) by \[ \left( T\, u \right) (x) = \int_0^1 K(x,y) \,u(y) \,{\text d}y + f(x) . \] If the kernel K satisfies a norm condition such as \[ \sup_{x \in [0, 1]} \int_0^1 \left\vert K(x,y) \right\vert {\text d}y \le L < 1, \] then T is a contraction map. Banach's theorem guarantees a unique fixed point, which solves the integral equation.
In particular, \[ u(x) = \int_0^1 xy\,u(y)\,{\text d} y + \sin (x). \] This equation is linear in u, and the kernel K(x, y) = xy satisfies \[ \sup_{0 \le x \le 1} \int_0^1 \left\vert xy\right\vert {\text d} y = \sup_{0 \le x \le 1} x\,\int_0^1 \left\vert y\right\vert {\text d} y = \sup_{0 \le x \le 1} x \cdot \frac{1}{2} = \frac{1}{2} . \] So T is a contraction on ℭ([0, 1]) with the sup norm, and Banach's theorem guarantees a unique solution in ℭ([0, 1]).
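Since this equation is linear, its solution is u(x) = c x + sin x with c = ∫₀¹ y u(y) dy, and substituting back gives c = c/3 + (sin 1 − cos 1), i.e., c = (3/2)(sin 1 − cos 1). A short sketch of the Picard iteration uₖ₊₁ = T uₖ on a uniform grid confirms this (the grid size and trapezoidal quadrature are our own discretization choices):

```python
import math

# Picard iteration u_{k+1}(x) = ∫₀¹ x·y·u_k(y) dy + sin(x), discretized with
# the trapezoidal rule on a uniform grid (grid size n is our own choice).
n = 201
xs = [i / (n - 1) for i in range(n)]
h = 1.0 / (n - 1)

def trapz(vals):
    # trapezoidal rule on the uniform grid
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

u = [0.0] * n                        # start from the zero function
for _ in range(60):
    m = trapz([xs[j] * u[j] for j in range(n)])   # ∫₀¹ y u(y) dy
    u = [x * m + math.sin(x) for x in xs]

# Exact solution is u(x) = c x + sin(x) with c = (3/2)(sin 1 − cos 1);
# compare the iterate with it at x = 1.
c = 1.5 * (math.sin(1) - math.cos(1))
print(abs(u[-1] - (c + math.sin(1))))   # small (discretization error only)
```

The contraction constant is ½, so the iteration error vanishes quickly; what remains is only the quadrature error of the trapezoidal rule.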
Example 1C: Let (X, d) be a complete metric space, and suppose we have a transformation T : X → X that models a geometric process---say, an iterated function system (IFS) used in fractal generation.
If each map in the IFS is a contraction, then the composition T is also a contraction. Banach's theorem ensures the existence of a unique fixed point x✶ ∈ X, which is the attractor of the system.
The classic construction of the Cantor set or the Sierpiński triangle relies on this principle. The fixed point is not a number, but a compact subset of the plane---beautifully illustrating how Banach's theorem governs geometric convergence.
In particular, let us take X = ℝ² with the Euclidean metric. We define T : ℝ² → ℝ² by \[ T(x, y) = \left( \frac{1}{2}\, x + 1 , \ \frac{1}{2}\,y \right) . \] To check that T is a contraction, for any two points (x₁, y₁) and (x₂, y₂) from ℝ² we have \[ \| T\left( x_1 , y_1 \right) - T\left( x_2 , y_2 \right) \| = \left\| \left( \frac{1}{2} \left( x_1 - x_2 \right) , \frac{1}{2} \left( y_1 - y_2 \right) \right) \right\| = \frac{1}{2} \left\| \left( x_1 - x_2 , \ y_1 - y_2 \right) \right\| . \] So T is a contraction with constant ½, and the unique fixed point is \[ \left( x, y \right) = \left( \frac{1}{2}\,x + 1 , \ \frac{1}{2}\,y \right) \quad \Rightarrow \quad x = 2, \ y = 0. \] This example models a geometric transformation that shrinks and shifts points toward a fixed location---like a simple affine IFS step. ■
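Iterating this affine map from any starting point illustrates the convergence directly (the starting point below is an arbitrary choice):

```python
# Iterating the affine contraction T(x, y) = (x/2 + 1, y/2) from an arbitrary
# starting point; by Banach's theorem the orbit converges to the fixed point (2, 0).
x, y = -7.0, 13.0
for _ in range(60):
    x, y = 0.5 * x + 1.0, 0.5 * y

print(round(x, 6), round(y, 6))   # 2.0 0.0
```

Each step halves the distance to (2, 0), so 60 iterations bring the orbit within machine precision of the fixed point.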
For future consideration, we need to exploit the so-called Jordan normal form of matrix H, namely, H = S−1JS, where S is an invertible matrix and J is a block diagonal matrix consisting of the Jordan blocks, \[ \mathbf{J} = \begin{bmatrix} \fbox{J_1} &&& \\ & \fbox{J_2} && \\ &&\ddots& \\ &&& \fbox{J_m} \end{bmatrix} , \] and each Jordan block has the form: \[ \mathbf{J}_i = \begin{bmatrix} \lambda_i & 1 &0&\cdots & 0 \\ 0 & \lambda_i & 1 & \cdots & 0 \\ &&&\ddots& \\ 0&0&0& \cdots & \lambda_i \end{bmatrix} \in \mathbb{R}^{n_i \times n_i} . \]
We split every Jordan block into the sum of a diagonal matrix and a nilpotent matrix, namely, Ji = λiI + P. Notice that Ps = 0 for some positive integer s ≤ ni.
Then every integer power of H can be written as \[ \mathbf{H}^k = \mathbf{S}^{-1} \mathbf{J}^k \mathbf{S} , \qquad k = 1, 2, \ldots . \]
Theorem 2: Let H be an n-by-n matrix. Then Hkz → 0 for arbitrary column vector z if and only if each eigenvalue of H is less than 1 in absolute value.
Hkz diverges for some vectors z when ρ(H) > 1. Here ρ(H) is the largest modulus of the eigenvalues of the matrix H; this number ρ(H) is called the spectral radius of matrix H.
If u is complex, then u = x + ⅉy for some real vectors x, y. But then at least one of the sequences (Hkx) or (Hky) does not tend to zero: if both did, then also Hku = Hkx + ⅉHky → 0, and this contradicts \( \displaystyle \quad \| \mathbf{H}^k \mathbf{u} \|_{\infty} > \gamma . \)
Now let ρ(H) < 1, and assume for simplicity that H possesses n linearly independent eigenvectors (uj) such that H uj = λjuj. Linear independence means that every z ∈ ℝn×1 can be expressed as a linear combination of the eigenvectors, i.e., there exist constants cj ∈ ℂ such that \( \displaystyle \quad \mathbf{z} = \sum_{1\le j \le n} c_j \mathbf{u}_j . \quad \) Hence, \[ \mathbf{H}^k \mathbf{z} = \sum_{j=1}^n c_j \lambda_j^k \mathbf{u}_j , \] and since |λj| ≤ ρ(H) < 1, we have \( \displaystyle \quad \lim_{k\to\infty} \mathbf{H}^k \mathbf{z} = 0 , \quad \) as required.
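A small numerical illustration of this decay (the matrix H and vector z below are our own illustrative choices; H has two distinct eigenvalues, hence independent eigenvectors):

```python
# Numerical illustration of Theorem 2: with ρ(H) < 1 the powers H^k z
# shrink to zero.  H below has eigenvalues ≈ 0.73 and ≈ −0.03 (our choice).
H = [[0.5, 0.4], [0.3, 0.2]]

def matvec(M, v):
    # 2x2 matrix-vector product, written out componentwise
    return [M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1]]

z = [1.0, -2.0]
for k in range(200):
    z = matvec(H, z)

print(max(abs(z[0]), abs(z[1])))   # essentially 0
```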
To complete the proof in the general case, we need to exploit the so-called Jordan normal form of matrix H, namely, H = S−1JS, where S is an invertible matrix and J is a block diagonal matrix consisting of the Jordan blocks, \[ \mathbf{J} = \begin{bmatrix} \fbox{J_1} &&& \\ & \fbox{J_2} && \\ &&\ddots& \\ &&& \fbox{J_m} \end{bmatrix} , \] \[ \mathbf{J}_i = \begin{bmatrix} \lambda_i & 1 &0&\cdots & 0 \\ 0 & \lambda_i & 1 & \cdots & 0 \\ &&&\ddots& \\ 0&0&0& \cdots & \lambda_i \end{bmatrix} \in \mathbb{R}^{n_i \times n_i} . \] To prove that Jki → 0 for |λi| < 1, we split the Jordan block into the sum of a diagonal matrix and a nilpotent matrix, namely, Ji = λiI + P. Notice that Ps = 0 for some positive integer s ≤ ni. Then we evaluate the terms of the binomial expansion \[ \left( \lambda_i \mathbf{I} + \mathbf{P} \right)^k = \sum_{j=0}^{n_i -1} \binom{k}{j} \lambda_i^{k - j} \mathbf{P}^j . \] Since the binomial coefficients are polynomials in k, their products with λik−j tend to zero as k → ∞ because |λi| < 1.
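The competition between the polynomial binomial coefficient and the geometric factor λᵏ⁻ʲ can be seen numerically; here is a sketch for a single Jordan block (block size 3 and λ = 0.9 are our own choices):

```python
from math import comb

# Entries of the k-th power of a single 3x3 Jordan block J = λI + P:
# (λI + P)^k = Σ_j C(k, j) λ^{k-j} P^j, with P^3 = 0, so only j = 0, 1, 2 survive.
lam = 0.9

def jordan_power_entry(k, j):
    # entry on the j-th superdiagonal of J^k
    return comb(k, j) * lam ** (k - j)

# The polynomial growth of C(k, j) wins at first, but λ^{k-j} eventually dominates:
print(jordan_power_entry(40, 2))    # transient growth: ≈ 14
print(jordan_power_entry(400, 2))   # eventual decay: ≈ 5e-14
```

Note the transient growth before decay: powers of a non-diagonalizable matrix with ρ < 1 can grow substantially before they tend to zero.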
Corollary 1: For a square matrix H, Hkz converges to some limit vector for every choice of z if and only if all eigenvalues of H are less than 1 in modulus, except possibly the eigenvalue λ = 1, which must be semisimple (all of its Jordan blocks have size 1).
Corollary 2: Let x✶ be a solution of A x = b satisfying x✶ = H x✶ + v. Suppose that we are given an iterative scheme: \[ \mathbf{x}^{(k+1)} = \mathbf{H}\,\mathbf{x}^{(k)} + \mathbf{v} , \qquad \mathbf{x}^{(0)}, \ \mathbf{v} \in \mathbb{R}^{n\times 1} . \] Then x(k) → x✶ for any choice of x(0) if and only if ρ(H) < 1.
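A sketch of Corollary 2 in action (H, v, and the starting vector below are our own illustrative choices; H has eigenvalues 0.5 and 0.2, so ρ(H) = 0.5 < 1, and the fixed point of x = Hx + v is (2, 2)):

```python
# Corollary 2 in action: x^{(k+1)} = H x^{(k)} + v with ρ(H) = 0.5 < 1.
H = [[0.4, 0.1], [0.2, 0.3]]
v = [1.0, 1.0]

x = [100.0, -50.0]                  # deliberately far-off starting vector
for _ in range(100):
    x = [H[0][0] * x[0] + H[0][1] * x[1] + v[0],
         H[1][0] * x[0] + H[1][1] * x[1] + v[1]]

# The iterates converge to the unique solution of (I - H) x = v, namely (2, 2).
print(x)   # approximately [2.0, 2.0]
```

The same limit is reached from any starting vector, exactly as the corollary asserts.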
Lemma 1: Let A be a singular square matrix having eigenvector x corresponding to the zero eigenvalue. For any (nonsingular) preconditioner S, splitting A = S + (A − S) generates matrix B = I − S−1A that has the same eigenvector x corresponding to eigenvalue λ = 1.
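Lemma 1 is easy to verify numerically; a minimal sketch (the singular matrix A and preconditioner S = 2I are our own hypothetical choices):

```python
# Lemma 1 check: A is singular with A x = 0 for x = (1, 1); for the
# preconditioner S = 2I, the matrix B = I − S⁻¹A fixes the same x.
A = [[1.0, -1.0], [-1.0, 1.0]]
x = [1.0, 1.0]                      # eigenvector of A for the zero eigenvalue

# S = 2I, so S⁻¹A = A / 2 and B = I − A/2
B = [[1 - A[0][0] / 2, -A[0][1] / 2],
     [-A[1][0] / 2, 1 - A[1][1] / 2]]

Bx = [B[0][0] * x[0] + B[0][1] * x[1],
      B[1][0] * x[0] + B[1][1] * x[1]]
print(Bx)   # [1.0, 1.0] — x is an eigenvector of B with eigenvalue 1
```

This is why a singular A dooms the corresponding iteration: by Corollary 2, B having eigenvalue 1 with this eigenvector prevents ρ(B) < 1.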
Theorem 3 (V. Dobrushkin, 2025): Let H be a square matrix with a simple eigenvalue λ₁ = 1 corresponding to the eigenvector u₁, and suppose all other eigenvalues of H have modulus strictly less than 1. If β is a basis of ℝn×1 that includes u₁, then any vector z in ℝn×1 admits a unique decomposition z = c₁u₁ + ⋯, where the remaining terms lie in the invariant subspace associated with the eigenvalues of modulus less than 1. Under repeated application of H, the influence of these subdominant components vanishes, and the iterates converge: \( \displaystyle \quad \lim_{k\to\infty} \mathbf{H}^k \mathbf{z} = c_1 \mathbf{u}_1. \)
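A numerical sketch of Theorem 3 (the matrix H below is our own example: its eigenvalues are 1, with eigenvector u₁ = (3, 1), and 0.6):

```python
# Theorem 3 demo: H has simple eigenvalue 1 with eigenvector u₁ = (3, 1);
# the other eigenvalue, 0.6, lies inside the unit circle, so H^k z
# converges to a multiple of u₁.
H = [[0.9, 0.3], [0.1, 0.7]]

z = [1.0, 1.0]
for _ in range(100):
    z = [H[0][0] * z[0] + H[0][1] * z[1],
         H[1][0] * z[0] + H[1][1] * z[1]]

# Column sums of H are 1, so z[0] + z[1] is preserved by the iteration;
# the limit is c₁ u₁ = 0.5 · (3, 1) = (1.5, 0.5) for this z.
print(z)   # approximately [1.5, 0.5]
```

The subdominant component decays like 0.6ᵏ, so after 100 steps only the u₁-component survives.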
To complete the proof in the general case, we need to exploit the so-called Jordan normal form of matrix {\bf H}, namely, $\mathbf{H} = \mathbf{S}^{-1}\mathbf{J}\mathbf{S}$, where $\mathbf{J}$ is a block diagonal matrix consisting of the Jordan blocks. The rest of the proof is straightforward and follows \cite{FW60}; it is based on the property that $\mathbf{J}_i^k \to 0$ for $i=2,3,\ldots , m$.
- Darve, E. and Wootters, M., Numerical Linear Algebra with Julia, SIAM, Philadelphia, 2021.
- Forsythe, G.E. and Wasow, W.R., Finite Difference Methods for Partial Differential Equations, John Wiley & Sons, New York, 1960.
- Loehr, N., Advanced Linear Algebra, CRC Press, 2014.
- Watkins, D.S., Fundamentals of Matrix Computations, 3rd edition, Wiley, 2010.
- White, R.E., Computational Linear Algebra with Applications and MATLAB Computations, CRC Press, Boca Raton, FL, 2023. doi: 10.1201/9781003304128
