This is often the definition of convexity. It you are being asked to prove it, though, you are using a different definition. Which one are you using? In particular, do you assume your function is continuus? Is your definition what you get if you put t=1/2 in the above formula?
2007-03-26 00:40:23
·
answer #1
·
answered by mathematician 7
·
1⤊
0⤋
The idea behind a convex function is that the region on or above the graph (which is called the epigraph) must be a convex set, meaning that we should be able to take any two points in the epigraph and connect them with a straight line segment without leaving the epigraph. Now, suppose we choose any two points in the epigraph (call them (x_1, y_1) and (x_2, y_2)). Since these points are on or above the graph, y_1â¥f(x_1) and y_2â¥f(x_2). It is easy to see (and not too difficult to prove formally, just parameterize the segments) that if we reduce the y-coordinate of (x_1, y_1) to (x_1, f(x_1)) and likewise move (x_2, y_2) down to (x_2, f(x_2)), then every point on the segment from (x_1, f(x_1)) to (x_2, f(x_2)) will be lower than or equal to than the point with the same x-coordinate on the segment from (x_1, y_1) to (x_2, y_2) (strictly lower, in fact, unless they share an endpoint, and even then every other point will be lower unless they are the same segment), so if every point on the segment from (x_1, f(x_1)) to (x_2, f(x_2)) is in the set of points on or above the function, so too must every point on the segment from (x_1, f(x_1)) to (x_2, f(x_2)). Thus, if every segment between two points of the graph lies entirely within the epigraph, so too must every segment between any two points of the epigraph. The converse is also true, since points of the graph are part of the epigraph. Therefore, a function is convex iff every segment between two points of the graph lies entirely within the epigraph.
To see how this relates to the definition you are given, let us imagine how we might parameterize the segment between (x_1, f(x_1)) and (x_2, f(x_2)). Choosing a standard interval [0, 1] for the parameterization, we could represent this line segment as:
x=x_1 + (x_2-x_1)t
y=f(x_1) + (f(x_2) - f(x_1))t
Now, the defining characteristic is that every point on this line segment must be greater than or equal to the value of the function itself. So yâ¥f(x). This means:
f(x_1 + (x_2-x_1)t) ⤠f(x_1) + (f(x_2) - f(x_1))t
The function is therefore convex if and only if this relation holds for every choice of x_1, x_2, and t. A little algebra shows this is equivalent to:
f(t x_2 + (1 - t) x_1) ⤠t f(x_2) + (1 - t) f(x_1)
Now, which names we assign to the variables are arbitrary, so renaming x_2 as x and x_1 as y, we obtain the formula you posted (minus the typo near the end):
f(t x + (1 - t) y) ⤠t f(x) + (1 - t) f(y) for all x, y, t (t in [0, 1])
Well, okay, it's not _quite_ equivalent. Aside from the typo, your formula also has t in (0, 1) instead of [0, 1] and the inequality is strict. This is actually a stronger condition than is needed to ensure the convexity of the epigraph. If the stronger condition, that is f(t x_2 + (1 - t) x_1) < t f(x_2) + (1 - t) f(x_1) for all x_1, x_2 in the domain of the function and t in (0, 1), then the function is called strictly convex.
Edit: OP was having difficulty reading the subscripts, so I replaced them with underscores.
2007-03-26 08:29:06
·
answer #2
·
answered by Pascal 7
·
0⤊
1⤋