'''Karush-Kuhn-Tucker Theorem'''

== Introduction ==
The Karush-Kuhn-Tucker (KKT) Theorem is a central result of nonlinear programming (NLP). It builds on Lagrangian optimization, but additionally takes inequality constraints into account. The approach tests whether a (given) point is optimal with respect to a nonlinear objective function: satisfying the KKT conditions is a necessary condition for a solution of an NLP to be optimal.

== KKT Conditions ==
The six KKT conditions are based on the Lagrangian function of a general maximization (or minimization) problem:

<math>L(x,\lambda) = f(x_1 , \ldots , x_n ) - \sum_{i=1}^{k} \lambda_i \cdot g_i (x_1,\ldots,x_n)</math>, where <math>f(x_1,\ldots,x_n)</math> is the objective function and <math>g_i (x_1,\ldots,x_n) \leq 0</math> are the constraints in standard form.

The algebraic sign in front of <math>\sum_{i=1}^{k} \lambda_i \cdot g_i (x_1,\ldots,x_n)</math> depends on the type of problem: for a maximization problem it is "−", for a minimization problem "+".
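
As a worked instance of this sign rule, the Lagrangian of a minimization problem reads:

<math>L(x,\lambda) = f(x_1 , \ldots , x_n ) + \sum_{i=1}^{k} \lambda_i \cdot g_i (x_1,\ldots,x_n)</math>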

The first KKT condition involves the partial derivative of <math>L(x,\lambda)</math> with respect to each <math>x_i</math>; collected over all <math>i</math>, these derivatives form the gradient <math>\nabla_x L(x,\lambda)</math>. For a maximization problem the condition is the inequality:

<math>1) \quad \frac{\partial L}{\partial x_i} = \frac{\partial f}{\partial x_i} - \sum_{j=1}^{k} \lambda_j \cdot \frac{\partial g_j}{\partial x_i} \leq 0</math>

If the objective function has to be minimized instead, only the sign of <math>\frac{\partial f}{\partial x_i}</math> changes, because maximizing <math>f(x)</math> is equivalent to minimizing <math>-f(x)</math>. This is easy to see by reflecting the objective function across the abscissa, as the graphic below illustrates. The restrictions (in this case <math>x \leq 1</math>) do not change.

[[Datei:Graph_constraint.jpg]]

The second KKT condition is the complementary slackness condition for the variables: for each <math>i</math>, either <math>x_i = 0</math> or the partial derivative <math>\frac{\partial L}{\partial x_i}</math> must vanish.

<math>2) \quad x_i \cdot \frac{\partial L}{\partial x_i} = 0</math>

The third and fourth conditions are analogous to the first and second, but this time with respect to <math>\lambda</math>:

<math>3) \quad \frac{\partial L}{\partial \lambda_j} = -g_j(x_1,\ldots,x_n) \geq 0</math>

This is simply feasibility: <math>g_j(x_1,\ldots,x_n) \leq 0</math>. Analogously to the previous change, the sign of <math>g_j</math> flips when one faces a minimization problem.

<math>4) \quad \lambda_j \cdot \frac{\partial L}{\partial \lambda_j} = 0</math>

And finally the non-negativity constraints:

<math>5) \quad x_i \geq 0 , \quad i = 1,\ldots,n</math>

<math>6) \quad \lambda_j \geq 0 , \quad j = 1,\ldots,k</math>
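
The six conditions can be generated mechanically from the Lagrangian. The following sketch does this symbolically with SymPy for the problem solved in the example below; the use of SymPy and all variable names are illustrative assumptions, not part of the original article.

<syntaxhighlight lang="python">
import sympy as sp

# Two variables and one multiplier for the example problem below
x1, x2, lam = sp.symbols('x1 x2 lambda', real=True)

f = x1**2 + x2**2 + x2 - 1   # objective function, to be maximized
g = x1**2 + x2**2 - 1        # constraint in standard form g <= 0

L = f - lam * g              # Lagrangian of the maximization problem

# Condition 1): each dL/dx_i must be <= 0
grad_x = [sp.diff(L, v) for v in (x1, x2)]

# Condition 3): dL/dlambda = -g must be >= 0
dL_dlam = sp.diff(L, lam)

print(grad_x)   # [2*x1 - 2*lambda*x1, 2*x2 + 1 - 2*lambda*x2] (up to term order)
print(dL_dlam)  # 1 - x1**2 - x2**2 (up to term order)
</syntaxhighlight>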

== Examples ==
Maximize the objective function <math>f(x_1,x_2) = x_1^2 + x_2^2 + x_2 - 1</math> under the constraint <math>h(x_1,x_2) = x_1^2 + x_2^2 \leq 1</math>, i.e. <math>g(x_1,x_2) = x_1^2 + x_2^2 - 1 \leq 0</math>.
The corresponding Lagrangian function is:

<math>L(x_1,x_2,\lambda) = x_1^2 + x_2^2 + x_2 - 1 - \lambda (x_1^2 + x_2^2 - 1)</math>

<math>1) \quad
\begin{bmatrix}
2x_1 - 2\lambda x_1\\
2x_2 + 1 - 2\lambda x_2
\end{bmatrix} \leq 0
</math>

<math>2) \quad x_1 \cdot (2x_1 - 2\lambda x_1) = 0 , \quad x_2 \cdot (2x_2 + 1 - 2\lambda x_2) = 0</math>

<math>3) \quad x_1^2 + x_2^2 - 1 \leq 0</math>

<math>4) \quad \lambda \cdot (x_1^2 + x_2^2 - 1) = 0</math>

<math>5) \quad x_1 , x_2 \geq 0</math>

<math>6) \quad \lambda \geq 0</math>

The first equation in <math>2)</math>, <math>2x_1^2 (1 - \lambda) = 0</math>, allows two cases: either <math>x_1 = 0</math> or <math>\lambda = 1</math>.
Setting <math>\lambda = 1</math> reduces the second row of <math>1)</math> to <math>1 \leq 0</math>, which is obviously impossible. So <math>x_1</math> must be 0.

Setting <math>x_1 = 0</math> in <math>4)</math> gives <math>\lambda (x_2^2 - 1) = 0</math>. This leaves two possibilities: <math>\lambda = 0</math> or <math>x_2 = \pm 1</math>. <math>\lambda = 0</math> is ruled out, because the second row of <math>1)</math> would then require <math>2x_2 + 1 \leq 0</math>, contradicting restriction <math>5)</math>.

Case 1: <math>x_2 = 1</math>
Substituting into the second equation of <math>2)</math> gives <math>3 - 2\lambda = 0</math>, hence <math>\lambda = \frac{3}{2}</math>.

Case 2: <math>x_2 = -1</math>
This solution does not satisfy restriction <math>5)</math>.

So there is exactly one point that satisfies all restrictions. Since the feasible set is compact, a maximum exists and must satisfy the (necessary) KKT conditions, so this point is optimal:
<math>x_1 = 0 , x_2 = 1</math> and <math>\lambda = \frac{3}{2}</math>, which yields <math>f(0,1) = 1</math>.
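
As a quick sanity check, the following sketch plugs this point into all six conditions numerically. NumPy and the variable names are assumptions made for illustration; the article itself prescribes no code.

<syntaxhighlight lang="python">
import numpy as np

# Candidate KKT point found above
x1, x2, lam = 0.0, 1.0, 1.5

dL_dx = np.array([2*x1 - 2*lam*x1,        # dL/dx1
                  2*x2 + 1 - 2*lam*x2])   # dL/dx2
g = x1**2 + x2**2 - 1                     # constraint in standard form

assert np.all(dL_dx <= 0)                          # condition 1)
assert np.allclose(np.array([x1, x2]) * dL_dx, 0)  # condition 2)
assert g <= 0                                      # condition 3)
assert np.isclose(lam * g, 0)                      # condition 4)
assert x1 >= 0 and x2 >= 0                         # condition 5)
assert lam >= 0                                    # condition 6)
print("All six KKT conditions hold at (0, 1) with lambda = 3/2.")
</syntaxhighlight>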

== Further Information ==
If <math>f(x_1,\ldots,x_n)</math> is concave and all constraint functions are convex, the satisfaction of all KKT conditions is necessary and sufficient for a global optimum, provided the Slater condition is satisfied too. This condition demands that there exists an <math>x \in \R_{+}^n</math> such that <math>g(x) < 0</math>, i.e. there is a point strictly inside the solution space. Note that <math>h_i(x_1,\ldots,x_n)</math> denotes the original constraint; to keep things simple, these restrictions are rewritten from general inequalities into the standard form <math>g_i(x_1,\ldots,x_n) \leq 0</math>.
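
For instance, in the example above the Slater condition is satisfied: the point <math>(0,0) \in \R_{+}^2</math> gives <math>g(0,0) = 0^2 + 0^2 - 1 = -1 < 0</math>.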