Three faces of FEM

by Yi Zhang

The Finite Element Method (FEM) has acquired several interpretations over decades of development. Suppose we are solving a PDE of the form

L(u)=0

where L is a partial differential operator and u=u(x) is the unknown function. FEM can be interpreted in the following ways (let me know if you have other answers):

  • Rayleigh-Ritz
  • Galerkin method
  • Method of Weighted Residuals (MWR)

The first one, the Rayleigh-Ritz method, is actually what inspired structural engineers to devise FEM. The connection between this method and FEM is the variational formulation of the PDE, which works as a two-way street: the PDE is the Euler equation of a certain functional of u(x), and conversely the PDE can be solved by searching for the minimum of that functional (usually some form of energy). The Rayleigh-Ritz method addresses the PDE problem by assuming that u has the form

u(x)=\sum_{i=1}^N u_i \Phi_i(x)

and then substituting this expansion into the PDE's functional formulation, so that the coefficients u_i are obtained by solving a minimization problem. The original Rayleigh-Ritz method uses functions with global support as \Phi_i(x). A textbook example is the boundary value problem for the modes of a beam, in which we can take the \Phi_i(x) to be sinusoidal functions. The \Phi_i(x) are called trial functions.
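As a rough illustration of the procedure, here is a minimal Python sketch. The model problem is an assumption chosen for simplicity rather than the beam example: -u''=f on (0,1) with u(0)=u(1)=0, whose energy functional is J(u)=\int_0^1 (u'^2/2 - f u) dx. With \Phi_i(x)=\sin(i\pi x), J becomes a quadratic function of the coefficients, and setting its gradient to zero gives a linear system. The function names are hypothetical.

```python
import numpy as np

def rayleigh_ritz_sine(f, n_modes=5, n_quad=200):
    """Rayleigh-Ritz coefficients for -u'' = f on (0, 1), u(0) = u(1) = 0,
    using globally supported trial functions Phi_i(x) = sin(i*pi*x)."""
    # Gauss-Legendre nodes/weights mapped from [-1, 1] to [0, 1]
    xq, wq = np.polynomial.legendre.leggauss(n_quad)
    x = 0.5 * (xq + 1.0)
    w = 0.5 * wq

    k = np.arange(1, n_modes + 1)[:, None] * np.pi  # wavenumbers, shape (N, 1)
    phi = np.sin(k * x)                             # Phi_i at quadrature points
    dphi = k * np.cos(k * x)                        # Phi_i' at quadrature points

    # J(u_1..u_N) = 0.5 * u^T K u - b^T u, so the minimizer solves K u = b
    K = (dphi * w) @ dphi.T   # K_ij = integral of Phi_i' * Phi_j'
    b = (phi * w) @ f(x)      # b_i  = integral of f * Phi_i
    return np.linalg.solve(K, b)

def evaluate(coeffs, x):
    """Evaluate the expansion sum_i u_i * sin(i*pi*x) at the points x."""
    k = np.arange(1, len(coeffs) + 1)[:, None] * np.pi
    return coeffs @ np.sin(k * np.atleast_1d(x))
```

For f=1 the exact solution is x(1-x)/2, and the coefficients returned by `rayleigh_ritz_sine(lambda x: np.ones_like(x))` should be close to 4/(i\pi)^3 for odd i and zero for even i.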

To obtain a variational formulation of a PDE in a general way, and at the same time relax some of the regularity requirements on the solution, the Galerkin method is used. The so-called test functions are chosen with whatever regularity is needed (usually infinitely differentiable). In the Galerkin procedure, test functions \Psi_i(x) are applied to L(u)=0 and the integral formulation is obtained. If L is linear and self-adjoint, and the test functions are the same as the trial functions, this integral formulation coincides with the one obtained from the minimization problem in the Rayleigh-Ritz method. Here we have introduced two sets of functions: trial functions, which define the finite-dimensional approximation of u(x), and test functions, which define how closely (accurately) the PDE is satisfied. The name “test” can be understood in the following way. How close a quantity v is to zero can be measured by testing it against other quantities:

(v, w)=0

where w is a known quantity used for testing and (\cdot,\cdot) is a suitable inner product. Conceptually, the larger the set of w for which this equation holds, the closer v is to zero. And if it holds for every w, then v must itself be zero, since choosing w=v gives (v,v)=0.

So we can see that the space of test functions \{\Psi_i\} determines the extent, and in turn the accuracy, to which the PDE L(u)=0 is enforced by requiring

(\Psi_i, L(u))=0
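To see how this integral form relaxes the regularity required of u, take as an assumed model problem L(u)=-u''-f on (0,1), with test functions that vanish at both endpoints. Integrating by parts once gives:

```latex
\begin{aligned}
(\Psi_i, L(u)) &= \int_0^1 \Psi_i \,(-u'' - f)\,dx \\
               &= \int_0^1 \Psi_i'\,u'\,dx - \big[\Psi_i\,u'\big]_0^1 - \int_0^1 \Psi_i\,f\,dx \\
               &= \int_0^1 \Psi_i'\,u'\,dx - \int_0^1 \Psi_i\,f\,dx = 0
\end{aligned}
```

Only the first derivative of u appears in the last line, so u no longer needs to be twice differentiable. One derivative has been moved onto \Psi_i.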

A subcategory of the Galerkin method is the Bubnov-Galerkin method, in which the test functions and trial functions are the same. Otherwise, it is called the Petrov-Galerkin method. Historically, trial functions are also commonly referred to as shape functions.
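A Bubnov-Galerkin scheme with locally supported piecewise-linear "hat" functions as both trial and test functions is the classical 1D finite element. The sketch below assumes the model problem -u''=f on (0,1) with u(0)=u(1)=0 and a uniform mesh. The function name is hypothetical, and the load vector uses the simple approximation b_i = h f(x_i), which is exact only for constant f.

```python
import numpy as np

def galerkin_hat_poisson(f, n_elems=8):
    """Bubnov-Galerkin with hat functions for -u'' = f on (0, 1), u(0) = u(1) = 0.
    Weak form: integral of u' * Psi_i' = integral of f * Psi_i, with Psi_i = Phi_i."""
    h = 1.0 / n_elems
    nodes = np.linspace(0.0, 1.0, n_elems + 1)
    n = n_elems - 1  # number of interior nodes (unknowns)

    # Stiffness matrix K_ij = integral of Phi_i' * Phi_j': tridiagonal (-1, 2, -1) / h
    K = (2.0 * np.eye(n)
         - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h

    # Load vector, approximated as h * f(x_i) (exact when f is constant)
    b = h * f(nodes[1:-1])

    u = np.zeros(n_elems + 1)  # boundary values stay zero
    u[1:-1] = np.linalg.solve(K, b)
    return nodes, u
```

Because each hat function overlaps only its neighbors, the matrix is sparse (tridiagonal here), in contrast to the dense matrices that globally supported trial functions generally produce.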

Another abstraction of FEM is through MWR. Since all numerical methods for solving PDEs rely on moving from an infinite-dimensional solution space to a finite-dimensional one, every such method introduces error. So instead of L(u) being exactly zero, the numerical approximation \tilde{u}(x) of u(x) leaves a residual R(x):

L(\tilde{u})=R(x)

If the numerical solution is written as an expansion in trial functions, the coefficients (coordinates) u_i can be obtained by constraining R. In light of the testing idea above, R can be required to be zero in some sense, specifically,

(\Psi_i, R)=0

for some \Psi_i and inner product (\cdot,\cdot). This can be viewed as putting weight on the locations picked out by \Psi_i: the larger \Psi_i is at a given location, the more strongly R is constrained there. For this reason the \Psi_i are also called weight functions.

Though the Galerkin method and MWR look the same mathematically, they place the emphasis differently. In the former, we are specifically looking for an integral form, with shifting regularity onto the test functions in mind, while in MWR we focus on minimizing the residual, and the resulting integral form is just the mathematical “trick” played with certain weight functions. In fact, by using different weight functions in MWR, we can recover other numerical discretizations. Besides the Bubnov-Galerkin and Petrov-Galerkin methods, \Psi_i=\delta(x-x_i) gives the collocation method, \Psi_i=1 within a certain cell and 0 otherwise gives the finite volume method (FVM), and \Psi_i=\partial R/\partial u_i gives the least squares method.
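The effect of the weight functions can be tried out with a small Python sketch. Everything specific here is an assumption for illustration: the model problem L(u)=-u''-f on (0,1) with u(0)=u(1)=0, the polynomial trial functions \Phi_i=x^i(1-x), the equally spaced collocation points and cells, and the function names. Each choice of \Psi_j turns (\Psi_j, R)=0 into a linear system for the u_i.

```python
import numpy as np

_XQ, _WQ = np.polynomial.legendre.leggauss(20)

def integrate(g, a=0.0, b=1.0):
    """Gauss-Legendre quadrature of g over [a, b]; g maps points to (..., Q) arrays."""
    x = 0.5 * (b - a) * _XQ + 0.5 * (a + b)
    return (g(x) * (0.5 * (b - a) * _WQ)).sum(axis=-1)

def mwr_coefficients(f, method, n=3):
    """Solve (Psi_j, R) = 0 with R = sum_i u_i * L(Phi_i) - f and L(Phi) = -Phi''."""
    i = np.arange(1, n + 1)[:, None]

    def phi(x):    # trial functions Phi_i = x^i * (1 - x), shape (n, len(x))
        x = np.asarray(x, dtype=float)[None, :]
        return x**i * (1.0 - x)

    def L_phi(x):  # -Phi_i'' = i(i+1) x^(i-1) - i(i-1) x^(i-2), shape (n, len(x))
        x = np.asarray(x, dtype=float)[None, :]
        return i * (i + 1) * x**(i - 1) - i * (i - 1) * x**np.maximum(i - 2, 0)

    if method == "collocation":        # Psi_j = delta(x - x_j)
        xc = np.arange(1, n + 1) / (n + 1)
        A, b = L_phi(xc).T, f(xc)
    elif method == "subdomain":        # Psi_j = 1 on cell j, 0 elsewhere (FVM-like)
        edges = np.linspace(0.0, 1.0, n + 1)
        cells = list(zip(edges[:-1], edges[1:]))
        A = np.stack([integrate(L_phi, lo, hi) for lo, hi in cells])
        b = np.array([integrate(f, lo, hi) for lo, hi in cells])
    elif method in ("galerkin", "least_squares"):
        # Bubnov-Galerkin: Psi_j = Phi_j; least squares: Psi_j = dR/du_j = L(Phi_j)
        psi = phi if method == "galerkin" else L_phi
        A = integrate(lambda x: psi(x)[:, None, :] * L_phi(x)[None, :, :])
        b = integrate(lambda x: psi(x) * f(x))
    else:
        raise ValueError(f"unknown method: {method}")
    return np.linalg.solve(A, b)  # A_ji = (Psi_j, L(Phi_i)), b_j = (Psi_j, f)
```

For f=1 the exact solution x(1-x)/2 lies in the trial space, so all four choices return the same coefficients. For a right-hand side such as `lambda x: np.sin(np.pi * x)` the residual cannot vanish identically, and the four weightings generally give slightly different approximations.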