Nonlinear Opt.: Classical gradient method 1

Theory

The gradient method is a common approach for solving nonlinear optimization models. For a maximization problem, the gradient vector of the objective function f gives the direction of steepest ascent at the current point x. Starting from a feasible solution x^0, the method moves iteratively in this direction until it reaches a point at which the function no longer increases, or until the boundary of the feasible region is hit. There the next iteration starts with a new evaluation of the gradient; if the gradient points out of the feasible region, a feasible ascent direction is determined instead. For a convex optimization model, this approach reaches the optimal solution to arbitrary precision; otherwise the gradient method usually yields only a local optimum. For a minimization problem the same idea is applied with the negative gradient (steepest descent), as in the example below.
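As a rough illustration (not part of the original article), here is a minimal Python sketch of this iteration for an unconstrained minimization problem; the step length is kept constant here, whereas the example below determines it by an exact line search:

```python
import numpy as np

def gradient_method(grad_f, x0, alpha=0.1, tol=1e-6, max_iter=1000):
    """Minimal fixed-step gradient method for unconstrained minimization.

    grad_f : function returning the gradient of f at a point
    x0     : starting point
    alpha  : constant step length (a line search could be used instead)
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:   # gradient (almost) zero: stop
            break
        x = x - alpha * g             # move against the gradient (descent);
                                      # for maximization move along +g instead
    return x
```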

Example

We now work through a complete numerical example of how to use the gradient method in nonlinear optimization. Suppose we are given a function f of two variables x_1 and x_2, defined as f(x_1, x_2) = x_1^2 + 2x_1 + x_2^2 + 4, and we want to minimize it. We use the point (2, 1) as the initial guess for the optimal solution; that is, the initial values are x_1 = 2 and x_2 = 1.
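In code, the example data can be written down directly (a small sketch, assuming the objective function as stated above):

```python
def f(x1, x2):
    """Objective function of the example."""
    return x1**2 + 2*x1 + x2**2 + 4

x0 = (2, 1)      # initial point
print(f(*x0))    # 13, the objective value at the starting point
```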

Iteration 1:

To form the gradient, the partial derivatives must be calculated. Differentiating f with respect to x_1 and x_2 gives \partial f/\partial x_1 = 2x_1 + 2 (x_2 is treated as a constant) and \partial f/\partial x_2 = 2x_2 (x_1 is treated as a constant). Together, these partial derivatives give the gradient of the function; inserting the values x_1 = 2 and x_2 = 1, we get \nabla f = 6i + 2j
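The partial derivatives and the gradient value can be checked quickly, for instance with sympy (just a verification sketch for the function above):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
f = x1**2 + 2*x1 + x2**2 + 4

grad = [sp.diff(f, x1), sp.diff(f, x2)]        # [2*x1 + 2, 2*x2]
print([g.subs({x1: 2, x2: 1}) for g in grad])  # [6, 2]
```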


To go on, we have to decide in which direction to move and with which step length. The gradient gives the direction of the greatest rate of increase, but since we are looking for a decrease (minimization), we have to use the negative gradient.

So our formula looks like this:

x^{k+1} = x^k - \alpha_k \cdot \nabla f

(where k is the iteration counter, \alpha_k is the step length of our move and the gradient \nabla f gives the direction.)
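Expressed as code, a single update of this formula might look as follows (purely illustrative; the names are arbitrary):

```python
import numpy as np

def gradient_step(x, grad, alpha_k):
    # one update x^{k+1} = x^k - alpha_k * (gradient of f at x^k)
    return np.asarray(x, dtype=float) - alpha_k * np.asarray(grad, dtype=float)
```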

At the first step, k is zero, and therefore our x^0 = \binom{2}{1}, which is our initial value, is used as the starting point. Our gradient at this point is \nabla f = \binom{6}{2}. By inserting into the formula we get: x^1 = \binom{2}{1} - \alpha_k \binom{6}{2} = \binom{2 - 6\alpha_k}{1 - 2\alpha_k}

To determine the step length \alpha_k, we insert these terms into our objective function f. We get: f(x^1) = 40\alpha_k^2 - 40\alpha_k + 13

As we are looking for a minimum, we take the derivative of that function with respect to \alpha_k and set it equal to zero: 80\alpha_k - 40 = 0

This leads us to the step length \alpha_k = \frac{1}{2}. When we now put all those pieces together, we get our new point x^1 = \binom{2 - \frac{1}{2}\cdot 6}{1 - \frac{1}{2}\cdot 2} = \binom{-1}{0}. This completes the first iteration. To see whether we are already at the final point, which is our minimum, we simply compute the 'new' gradient at the new point x^1 = \binom{-1}{0}. By doing this, we see that the gradient is zero, so we have already reached the end of the process. If this were not the case, we would have to carry out another iteration:

x^2 = \binom{-1}{0} - \alpha_k \binom{\text{new}}{\text{gradient}}
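The line-search calculation of the first iteration can be reproduced symbolically, for example with sympy (a small sketch based on the objective function above):

```python
import sympy as sp

alpha = sp.symbols('alpha')
x1, x2 = sp.symbols('x1 x2')
f = x1**2 + 2*x1 + x2**2 + 4

# point along the negative gradient: x^1 = (2 - 6*alpha, 1 - 2*alpha)
phi = f.subs({x1: 2 - 6*alpha, x2: 1 - 2*alpha})
print(sp.expand(phi))                        # 40*alpha**2 - 40*alpha + 13

alpha_star = sp.solve(sp.diff(phi, alpha), alpha)[0]
print(alpha_star)                            # 1/2
print((2 - 6*alpha_star, 1 - 2*alpha_star))  # (-1, 0), the new point x^1
```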


When the gradient becomes zero, we know that the solution found is already optimal.
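Putting all the pieces together, the whole procedure for this example can be run numerically with an exact line search; the following is a sketch (the closed-form step length \alpha_k = \nabla f^T \nabla f / (\nabla f^T H \nabla f) for a quadratic objective with Hessian H is an added assumption, not taken from the article):

```python
import numpy as np

def grad_f(x):
    # gradient of f(x1, x2) = x1^2 + 2*x1 + x2^2 + 4
    return np.array([2*x[0] + 2, 2*x[1]])

H = np.array([[2.0, 0.0], [0.0, 2.0]])    # constant Hessian of f

x = np.array([2.0, 1.0])                  # initial point x^0 = (2, 1)
while np.linalg.norm(grad_f(x)) > 1e-8:   # stop when the gradient is (almost) zero
    g = grad_f(x)
    alpha = g.dot(g) / g.dot(H @ g)       # exact line-search step for a quadratic f
    x = x - alpha * g

print(x)   # [-1.  0.], reached after a single iteration with alpha = 1/2
```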
