As discussed briefly in Section 4.1, the problem we face when searching for
stationary values of the action given in equation (4.1) is one of optimization,
i.e. locating the stationary values numerically. In the literature treating the
AVP, only minimum and saddle points have been searched for, indicating that the
action has no maximum points. I will therefore only discuss minimum points (both
global and local) and saddle points in this thesis. The choice of optimizing
methods reflects this; with the exception of the Newton-Raphson method, all of
the methods search for stationary values in a *descent direction*, which
automatically excludes any possible maximum points. The Newton-Raphson method is
capable of locating any stationary point, but, as we will see in the next
chapter, only minimum and saddle points turn out to be solutions of the action.

A general description of several optimizing methods is given in this section; in the next chapter they will be evaluated by applying them to a system of galaxies. The methods will be judged on their effectiveness and on their ability to locate different solutions. The reason for this rather detailed testing is simply that, as of today, no method is known to be better than any other for general use, and it is hard to tell just by looking at the problem which method is best suited. A description of the methods presented can be found in textbooks such as [Press, Walsh, DS, GMW, Bert].

In the general treatment given in the remaining parts of this chapter, the function whose minima we seek is referred to as $f(\mathbf{x})$, where $\mathbf{x} = (x_1, \ldots, x_n)$ are the unknown variables. An unconstrained optimization procedure starts by choosing a starting point, i.e. an initial guess $\mathbf{x}_0$ for the values of the unknown parameters in $f(\mathbf{x})$. A substantial amount of computing time can be saved, and the chance of actually finding minima increased, by choosing $\mathbf{x}_0$ with some care. In practice this means using whatever information is available on the behavior of $f$, so that our initial guess is not too far from the stationary point(s).

Once the initial point is chosen, we must make two decisions before the next point can be generated: (i) we must pick a direction along which the next point is to be chosen, and (ii) we must decide on a step size to be taken in that chosen direction. We then have the following iterative picture

$$\mathbf{x}_{k+1} = \mathbf{x}_k + \lambda_k \mathbf{d}_k ,$$

where $\mathbf{d}_k$ is the direction and $\lambda_k$ is the step
size. The different optimization methods presented differ in the choice of
$\mathbf{d}_k$ and $\lambda_k$. Roughly speaking, we can classify these methods
into three categories:

(1) methods using only evaluations of the function $f$ itself;
(2) methods that in addition use the first derivatives (the gradient) of $f$;
(3) methods that also make use of the second derivatives (the Hessian) of $f$.

The last two categories are often also classified as gradient methods. In choosing a method, accuracy and especially computing time are of the essence. Category (1) will not be considered here; it is generally accepted that one should make use of first-order derivatives if they are available and not too elaborate to calculate. Category (3) will generally produce points sufficiently close to the minimum point in the fewest steps, but that does not necessarily mean that such methods are the most efficient. The computational cost of computing and handling the second derivatives can be so substantial that evaluating the first derivative alone at several points would take less computing time. Methods of category (2) would then be a preferable choice.
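The iterative picture above can be sketched in code. The following is a minimal steepest-descent example of a category (2) method, assuming NumPy; the function name `descend` and all numerical values are illustrative choices, not taken from the thesis:

```python
import numpy as np

def descend(grad, x0, lam=0.1, tol=1e-8, max_iter=1000):
    """Generic iteration x_{k+1} = x_k + lambda_k * d_k, here with the
    steepest-descent choice d_k = -grad f(x_k) and a fixed step size."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        d = -grad(x)                  # descent direction from first derivatives
        if np.linalg.norm(d) < tol:   # gradient ~ 0: stationary point reached
            break
        x = x + lam * d               # take the step
    return x

# Example: f(x, y) = (x - 1)^2 + 2 y^2 has its minimum at (1, 0).
x_min = descend(lambda x: np.array([2 * (x[0] - 1), 4 * x[1]]), [0.0, 3.0])
```

Because the direction is always a descent direction, an iteration like this can only converge to a minimum (or, in degenerate cases, a saddle), never to a maximum, which is the exclusion mentioned at the start of the section.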

For illustrative purposes, a quadratic function will be used in the presentation of these
optimizing methods, partly because of its simplicity, but mainly because several of these
methods were originally designed to solve problems equivalent to minimizing a quadratic
function. The *n*-dimensional quadratic function used here is of the form

$$f(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T \mathbf{A}\mathbf{x} - \mathbf{b}^T\mathbf{x} + c ,$$

where $\mathbf{A}$ is a positive definite $n \times n$ matrix, $\mathbf{x}$ and $\mathbf{b}$
are vectors, and $c$ is a scalar constant. The illustrations presented are 2-dimensional
representations of the quadratic, where the contours connect points at which the function
value is the same.

Let us now take a closer look at some of these optimizing methods, starting with category (2).
