Rayleigh-Ritz method

{{subpages}}
In [[quantum mechanics]], the '''Rayleigh-Ritz method''', also known as the '''linear variation method''', is a method to obtain (approximate) solutions of the time-independent [[Schrödinger equation]]. In [[numerical analysis]], it is a method of solving differential equations with boundary conditions and eigenvalue problems. In the latter field it is sometimes called the Rayleigh-Ritz-Galerkin procedure.
==The method==
We will describe the method from a quantum mechanical point of view and consider as the basic problem to be solved the eigenvalue problem of a Hamilton operator ''H'', which is [[Hermitian]] and contains second derivatives. The results can fairly easily be transferred to similar non-quantum mechanical eigenvalue problems and to certain differential equations. The essential point of the Rayleigh-Ritz method is the rephrasing of the original problem as an equivalent problem: the determination of stationary points (usually the minimum) of a certain [[functional]].
 
So, our purpose is to find an approximate method for solving the eigenvalue equation
:<math>
H\Phi = E \Phi\,.
</math>
We will show that this equation can be replaced by the problem of finding a stationary point of  the expectation value of  ''H'' with respect to &Phi;
:<math>
E[\Phi] \equiv \frac{\langle \Phi | H | \Phi \rangle}{ \langle \Phi |\Phi \rangle}.
</math>
Note that this is a map of the function &Phi; onto the number ''E''[&Phi;], i.e., we have here a functional. The [[bra-ket notation]] implies the integration over a configuration space, which in electronic structure theory usually is <math>\scriptstyle \mathbb{R}^{3N} </math>, with ''N'' being the number of particles of the system described by the Hamiltonian ''H''. We assume boundary conditions on &Phi; and integration limits such that ''E''[&Phi;] is finite and the Hamiltonian ''H'' is Hermitian; in that case ''E''[&Phi;] is a real number. The boundary condition in electronic structure theory is the vanishing of &Phi; at infinite interparticle distances, and the bra-ket then corresponds to a 3''N''-fold integral over all of <math>\scriptstyle \mathbb{R}^{3N} </math>. Bounded configuration spaces with periodic boundary conditions also occur in quantum mechanics; the variational Rayleigh-Ritz theory outlined below applies to those cases as well.
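For instance, for an ''N''-particle system in the position representation the numerator of ''E''[&Phi;] stands for the explicit 3''N''-fold integral
:<math>
\langle \Phi | H | \Phi \rangle = \int_{\mathbb{R}^{3N}} \Phi^*(\mathbf{r}_1,\ldots,\mathbf{r}_N)\, H\, \Phi(\mathbf{r}_1,\ldots,\mathbf{r}_N)\; d\mathbf{r}_1\cdots d\mathbf{r}_N ,
</math>
and the denominator is the analogous integral with ''H'' replaced by unity.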
===The variational principle===
Most non-relativistic Hamiltonians in quantum mechanics have a lower bound, that is, there is a finite number ''B'' such that
:<math>
E[\Phi] > B \,
</math>
for all admissible &Phi;. The function &Phi; is admissible when the functional ''E''[&Phi;] is well-defined (for instance &Phi; must be twice differentiable) and &Phi; has the correct boundary conditions. We assume from now on that ''H'' has a lower bound.


The greatest lower bound ''B'' is the lowest eigenvalue ''E''<sub>0</sub> of ''H''. To show this, we assume that the eigenvalues and eigenvectors of ''H'' are known and, moreover, that the eigenvectors are orthonormal and complete (form an orthonormal basis of Hilbert space),
:<math>
H\phi_k = E_k \phi_k, \quad E_0 < E_1 \le E_2 \le \cdots,
</math>
where for convenience's sake we assume the ground state to be non-degenerate. Completeness and orthonormality give the resolution of the identity
:<math>
1 = \sum_{k=0} | \phi_k\rangle \langle \phi_k | .
</math>
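Acting with this resolution on an arbitrary admissible &Phi; expands it in the eigenvectors,
:<math>
\Phi = \sum_{k=0} | \phi_k\rangle \langle \phi_k | \Phi \rangle .
</math>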
Inserting this resolution into the functional at appropriate places, and using that the Hamilton matrix in the basis of eigenvectors is diagonal with ''E''<sub>''k''</sub> on the diagonal, yields
:<math>
E[\Phi] = \frac{\sum_k \langle\Phi|\phi_k\rangle E_k \langle \phi_k |\Phi\rangle}
{\sum_k \langle\Phi|\phi_k\rangle \langle \phi_k |\Phi\rangle} =
\frac{\sum_k |c_k|^2  E_k } {\sum_k |c_k|^2} \ge \frac{E_0\sum_k |c_k|^2 } {\sum_k |c_k|^2} = E_0,
</math>
where <math>\scriptstyle c_k \equiv \langle \phi_k |\Phi\rangle </math>.
Note further that  
:<math>
E[\phi_0] = E_0\, ,
</math>
that is, the lowest possible expectation value of ''H'' is the lowest eigenvalue of ''H''; this lower bound is attained when the expectation value is computed with the lowest eigenvector. This result is referred to as the ''variational principle''.
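The variational principle is easy to check numerically in a finite-dimensional setting. The following Python sketch (an illustration added here, not part of the original derivation; it assumes that NumPy is available) builds a random Hermitian matrix in the role of ''H'' and verifies that the expectation value computed with an arbitrary trial vector never falls below the lowest eigenvalue.
<pre>
import numpy as np

rng = np.random.default_rng(0)

# Random Hermitian matrix playing the role of the Hamiltonian H.
n = 6
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
H = (A + A.conj().T) / 2

E0 = np.linalg.eigvalsh(H)[0]        # lowest eigenvalue of H

# Arbitrary (unnormalized) trial vector Phi; E[Phi] is its Rayleigh quotient.
Phi = rng.standard_normal(n) + 1j * rng.standard_normal(n)
E_Phi = (Phi.conj() @ H @ Phi).real / (Phi.conj() @ Phi).real

print(E_Phi >= E0)                   # always True: E[Phi] >= E0
</pre>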


===Matrix eigenvalue problem===
It will be convenient to rewrite the functional in the Lagrange form with an undetermined multiplier. That is, we redefine the functional as
:<math>
E[\Phi] = \langle \Phi | H | \Phi \rangle - \lambda \langle \Phi | \Phi \rangle,
</math>
where &lambda; is an undetermined multiplier. Taking now an arbitrary expansion set &chi;<sub>''k''</sub> and writing
:<math>
\Phi = \sum_{k=1}^M  c_k \chi_k
</math>
we get upon expansion the expression for the functional
:<math>
E[\Phi] = \sum_{k, m=1}^M \left[ c_k^* c_m H_{km} - \lambda c_k^* c_m S_{km} \right]
= \mathbf{c}^\dagger \mathbf{H} \mathbf{c} - \lambda \mathbf{c}^\dagger \mathbf{S} \mathbf{c}
</math>
with the ''M'' &times; ''M'' matrices
:<math>
H_{km} \equiv \langle \chi_k | H | \chi_m \rangle, \quad\hbox{and}\quad S_{km} \equiv \langle \chi_k | \chi_m \rangle.
</math>
 
Stationary points&mdash;such as minima&mdash;of the functional are obtained from equating derivatives to zero,
:<math>
\frac{\partial E[\Phi]}{\partial c_k^*} = 0 \quad \hbox{and}\quad \frac{\partial E[\Phi]}{\partial c_k} = 0.
</math>
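Written out for a single value of ''k'', the first of these conditions reads
:<math>
\frac{\partial E[\Phi]}{\partial c_k^*} = \sum_{m=1}^M \left( H_{km} - \lambda S_{km}\right) c_m = 0 .
</math>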
Letting ''k'' run from 1 to ''M'', these equations give rise to
:<math>
\mathbf{H} \mathbf{c} = \lambda \mathbf{S} \mathbf{c} \quad \hbox{and}\quad
\mathbf{c}^\dagger \mathbf{H} = \lambda \mathbf{c}^\dagger \mathbf{S},
</math>
which, however, contain the same information, because the matrices '''H''' and '''S''' are Hermitian and &lambda; is therefore real, so that the second equation is simply the Hermitian conjugate of the first. Hence, a stationary point of the functional is obtained by solving
the ''generalized matrix eigenvalue equation'',
:<math>
\mathbf{H} \mathbf{c} = E \mathbf{S} \mathbf{c} \qquad\qquad\qquad\qquad\qquad\qquad (1)
</math>
where we changed the notation from &lambda; to  ''E''.
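In numerical practice equation (1) is handed to a standard routine for the generalized Hermitian eigenvalue problem. The following Python sketch (illustrative only; it assumes NumPy and SciPy) builds a random symmetric matrix '''H''' and a random positive-definite overlap matrix '''S''', solves equation (1) with <code>scipy.linalg.eigh</code>, and checks the lowest solution.
<pre>
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(1)
M = 5

# Random real symmetric "Hamiltonian" matrix H.
A = rng.standard_normal((M, M))
H = (A + A.T) / 2

# Random overlap matrix S, shifted to make it safely positive definite.
B = rng.standard_normal((M, M))
S = B @ B.T + M * np.eye(M)

# Solve H c = E S c; eigenvalues are returned in ascending order.
E, C = eigh(H, S)
c0 = C[:, 0]                                  # lowest eigenvector

print(np.allclose(H @ c0, E[0] * S @ c0))     # True: equation (1) holds
</pre>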
 
We now make two assumptions: the set &chi;<sub>''k''</sub> is complete, and it is orthonormal. (The second assumption is non-essential, but it simplifies the notation.) In an orthonormal basis the following simple expression holds for the expansion coefficients,
:<math>
c_k = \langle \chi_k | \Phi \rangle
</math>
In general, completeness of the expansion requires ''M'' to be infinite.
Introduce the resolution of the identity
:<math>
1 =\sum_{k=1}^M |\chi_k\rangle \langle \chi_k|
</math>
into the component form of equation (1); this gives
:<math>
\sum_k  \langle \chi_m|H|\chi_k\rangle \langle \chi_k|\Phi \rangle =
E \sum_k \delta_{mk} \langle \chi_k|\Phi \rangle \quad \Longrightarrow
\quad \langle \chi_m|H|\Phi \rangle = E\, \langle \chi_m|\Phi \rangle, \quad m=1,2,\ldots, M.
</math>
Multiplying on the left by <math>\scriptstyle |\chi_m\rangle</math>, summing over ''m'', and using the resolution of the identity once more, we find the result that we set out to prove:
The determination of stationary points of the functional ''E''[&Phi;] in a complete basis is equivalent to solving
:<math>
H\Phi = E \Phi.
</math>
Ritz's important insight was that a function expanded in a non-complete basis &chi;<sub>''k''</sub> that minimizes the expectation value of ''H'' gives an approximation to the lowest eigenvector of ''H''.
 
Reiterating and summarizing the above: the Rayleigh-Ritz variational method starts by choosing an expansion basis &chi;<sub>''k''</sub> of dimension ''M''. This expansion is inserted into the expectation value of the Hamiltonian, whereupon the solution of the generalized matrix eigenvalue equation (1) yields stationary points (usually minima). The lowest eigenvector of this ''M''-dimensional matrix problem approximates the lowest exact solution of the Schrödinger equation. Clearly, the success of the method relies to a large extent on the quality of the expansion basis.
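As a concrete (hypothetical) illustration of the whole procedure, the following Python sketch applies the Rayleigh-Ritz method to a particle in a one-dimensional box on [0, 1], with ''H'' = &minus;(1/2) d<sup>2</sup>/d''x''<sup>2</sup> and boundary conditions &Phi;(0) = &Phi;(1) = 0, using the simple polynomial basis &chi;<sub>''k''</sub>(''x'') = ''x''<sup>''k''</sup>(1 &minus; ''x''). The example is not part of the original article and assumes NumPy and SciPy; the exact ground-state energy of this problem is &pi;<sup>2</sup>/2 &asymp; 4.9348.
<pre>
import numpy as np
from scipy.integrate import quad
from scipy.linalg import eigh

M = 5                                    # number of basis functions

def chi(k, x):
    """Basis function x^k (1 - x); it vanishes at both walls."""
    return x**k * (1.0 - x)

def dchi(k, x):
    """First derivative of chi_k(x)."""
    return k * x**(k - 1) * (1.0 - x) - x**k

H = np.zeros((M, M))
S = np.zeros((M, M))
for i in range(M):
    for j in range(M):
        k, m = i + 1, j + 1
        # Overlap matrix element S_km: integral of chi_k * chi_m over [0, 1].
        S[i, j] = quad(lambda x: chi(k, x) * chi(m, x), 0.0, 1.0)[0]
        # Kinetic-energy matrix element; after integration by parts it equals
        # one half of the integral of chi_k' * chi_m' over [0, 1].
        H[i, j] = 0.5 * quad(lambda x: dchi(k, x) * dchi(m, x), 0.0, 1.0)[0]

E, C = eigh(H, S)                        # generalized eigenvalue equation (1)
print(E[0], np.pi**2 / 2)                # both are approximately 4.9348
</pre>
Even this small basis reproduces the exact ground-state energy closely, illustrating the remark above that the success of the method hinges on the quality of the expansion basis.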
 
==History==
In the older quantum mechanics literature the method is known as the Ritz method, named after the mathematical physicist [[Walter Ritz]],[1] who first devised it. In prewar quantum mechanics texts it was customary to follow the highly influential book by Courant and Hilbert,[2] who were contemporaries of Ritz, and to write of the Ritz procedure (''Ritzsches Verfahren''). It is parenthetically amusing to note that the majority of these old quantum mechanics texts quote the wrong year, 1909 instead of 1908, an error first made in the Courant-Hilbert treatise.

In the numerical analysis literature one usually prefixes the name of Lord Rayleigh to the method, and lately this has become common in quantum mechanics, too. Leissa[3] recently became intrigued by the name and, after reading the original sources, discovered that the methods of the two workers differ considerably, although Rayleigh himself believed[4] that the methods were very similar and that his own method predated that of Ritz by several decades. However, according to Leissa's convincing conclusion, Rayleigh was mistaken and the method now known as the Rayleigh-Ritz method is solely due to Ritz. Leissa states: ''Therefore, the present writer concludes that Rayleigh's name should not be attached to the Ritz method; that is, the Rayleigh–Ritz method is an improper designation.''

==References==

1. W. Ritz, Über eine neue Methode zur Lösung gewisser Variationsprobleme der mathematischen Physik [On a new method for the solution of certain variational problems of mathematical physics], ''Journal für reine und angewandte Mathematik'', vol. 135, pp. 1–61 (1908)
2. R. Courant and D. Hilbert, ''Methoden der mathematischen Physik'' (two volumes), Springer Verlag, Berlin (1968)
3. A. W. Leissa, The historical bases of the Rayleigh and Ritz methods, ''Journal of Sound and Vibration'', vol. 287, pp. 961–978 (2005)
4. Lord Rayleigh, On the calculation of Chladni's figures for a square plate, ''Philosophical Magazine'', Sixth Series, vol. 22, pp. 225–229 (1911)

(To be continued)