Heuristic Optimization Algorithms - Part II

Now that we know optimization algorithms are in fact search algorithms, why don’t we just calculate every value the function can take and search that set for the maximum or minimum? That is actually a good enough approach if your input space (the domain) is limited, or has only a few points.

But what if the domain were the entire 2-D plane, which is infinite, or simply too large? Calculating the function at every possible input is certainly not feasible there.

But do we need to search the entire space in such cases?

Why don’t we first consider an example?

What is the minimum of the function x^2 - 5?

Now wait. Before you skip this post, dismissing it as too simple to be intellectually challenging, I want you to write down the steps you took before the answer popped into your head.

The usual calculus method for finding this value says we should take the following steps (worked out below):

  1. Differentiate the function with respect to the unknown.
  2. Equate the derivative to zero to solve for x.
  3. Substitute this value back into the function to get the optimum (minimum/maximum).
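
Worked through for our example x^2 - 5: the derivative is f'(x) = 2x; setting 2x = 0 gives x = 0; and substituting back, f(0) = -5. So the minimum value is -5, attained at x = 0.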

So what have we done here? The 1-D space (the domain of this function) obviously has infinitely many points. Why didn’t we consider all of them?

The reason is that we used a property of the function that lets us shrink the search set from infinitely many points to just a few. Those few points are the inputs x where the tangent is parallel to the x-axis, or in other words, where the gradient is zero.

But there are a few things we took for granted while performing these steps:

  1. We either knew the gradient of this function or it was easy to calculate. But what if the function didn’t have just one or two variables? What if it had many unknowns and lots of terms?
  2. You actually knew the function whose optimum you wanted to find. Not every real-life application lets you ‘logically’ write down a function or equation for the thing you want to optimize.

Okay, this last point may not make much sense to you right now, but look up real-life applications where optimization heuristics such as swarm algorithms or evolutionary algorithms have been used, and you will see that they were not always “explicitly” optimizing a mathematical function. Of course there is some mathematical formulation underlying the whole thing, but it is not readily visible.

So what would you do in such cases? I know what you are thinking: there are so many techniques and tools out there to solve these complex functions, why bother?

Well, you are right, as long as you don’t mind the cost and time they use up. These are precisely the correct/classical approaches we spoke about in the last post.

Of course there are tools out there, but if they were perfect, the world would be a happier place.

Sometimes solving these complex functions involves solving ordinary or partial differential equations, which takes a lot of time. And that is assuming a correct/classical approach for the function even exists. Sometimes the tools themselves employ heuristics, so we can’t really call them the “correct” approach at all.

Still not convinced? Try minimizing this function then (by hand…*evil laugh)

f(x) = -a * exp( -b * sqrt( (1/d) * sum_{i=1..d} x_i^2 ) ) - exp( (1/d) * sum_{i=1..d} cos(c * x_i) ) + a + exp(1)

(the well-known Ackley test function)

Recommended parameter values are a = 20, b = 0.2, and c = 2π (d is the number of dimensions).
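
To make the formula above concrete, here is a minimal Python sketch of it. To be clear, this is my own illustration: the function name ackley, the use of NumPy, and the default arguments are assumptions on my part, chosen to match the recommended values.

```python
import numpy as np

def ackley(x, a=20.0, b=0.2, c=2 * np.pi):
    """Ackley-style test function evaluated at a point x in d dimensions."""
    x = np.asarray(x, dtype=float)
    d = x.size
    sum_sq = np.sum(x ** 2)            # sum of x_i^2
    sum_cos = np.sum(np.cos(c * x))    # sum of cos(c * x_i)
    return (-a * np.exp(-b * np.sqrt(sum_sq / d))
            - np.exp(sum_cos / d)
            + a + np.e)

# The global minimum is 0, attained at x = (0, 0, ..., 0):
print(ackley(np.zeros(5)))        # ~0.0 (up to floating-point error)
print(ackley([1.0, -2.0, 0.5]))   # some larger value
```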

So let’s not digress any more. What if I told you that you do not need to know the gradient of the function? And when I say you don’t need the gradient, I also mean you don’t need to know the function itself at all; you can make do with input-value pairs. The exact formulation need not be known, only the function’s values at the inputs you try.

Heuristic optimization algorithms have a lot to offer in this regard. Some of them, such as evolutionary algorithms and swarm algorithms, can perform this optimization (search) without ever knowing the gradient of the function.
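
As a deliberately crude illustration of that idea (and not any particular algorithm from this post), here is a random-search sketch that only ever asks for function values. The helper name random_search and its bounds/n_samples parameters are made up for this example:

```python
import numpy as np

def random_search(f, bounds, n_samples=10_000, seed=0):
    """Minimize f using nothing but function evaluations at random inputs.

    bounds is a list of (low, high) pairs, one per dimension.
    Returns (best_input, best_value).
    """
    rng = np.random.default_rng(seed)
    lows = np.array([lo for lo, _ in bounds], dtype=float)
    highs = np.array([hi for _, hi in bounds], dtype=float)

    best_x, best_val = None, np.inf
    for _ in range(n_samples):
        x = rng.uniform(lows, highs)   # pick a candidate input
        val = f(x)                     # all we ever need: the value of f at x
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Example: minimize x^2 - 5 over [-10, 10] without ever writing down a derivative.
best_x, best_val = random_search(lambda x: x[0] ** 2 - 5, [(-10, 10)])
print(best_x, best_val)   # best_x close to 0, best_val close to -5
```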

Optimization heuristics get you to an answer that may not be perfectly accurate, but they get there much faster. It is a speed-versus-accuracy trade-off you choose to make: in most cases you do not need the exact answer, and something close enough will do.

“Heuristic” itself means something that may not be perfect but is good enough for immediate goals. So such algorithms usually come without a proof of correctness. But in most cases, with sufficient restrictions on the search space, these algorithms perform very well.

Yes, if you want a perfectly accurate answer and computation time and cost mean nothing to you, there are many linear and non-linear optimization techniques available that employ sophisticated mathematical machinery. But as I said, most applications simply can’t afford the computation time, or do not require such accuracy.

For example, your biometric face-recognition app has to optimize matching scores before it decides whether yours is the ‘genuine’ face or not. Are you willing to wait an hour before gaining access?

So in this post we have covered the following ideas:

  1. Difference between a classical approach and a heuristic approach.
  2. When, and why, a classical approach is infeasible.
  3. When to use a heuristic optimization algorithm and what trade-offs we are willing to make.

Lots of research has been done in this field and it will surely continue to be a hot topic in mathematics. In the next post, we will talk about some of this research and discuss some assumptions we make while using these heuristic algorithms.

And by the way, Particle Swarm Optimization takes a second or less to find the minimum of the function above.
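
For the curious, here is a bare-bones particle swarm sketch of my own, not the implementation behind that timing claim. It reuses the ackley function from the earlier sketch, and the swarm size, inertia weight w, and acceleration coefficients c1/c2 are illustrative assumptions:

```python
import numpy as np

def pso_minimize(f, dim, bounds=(-32.768, 32.768), n_particles=30,
                 n_iters=300, w=0.7, c1=1.5, c2=1.5, seed=0):
    """A very small particle swarm optimizer for a scalar function f."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pos = rng.uniform(lo, hi, size=(n_particles, dim))  # particle positions
    vel = np.zeros_like(pos)                            # particle velocities
    pbest = pos.copy()                                  # each particle's best position so far
    pbest_val = np.array([f(p) for p in pos])           # ...and its value
    g_idx = np.argmin(pbest_val)
    gbest, gbest_val = pbest[g_idx].copy(), pbest_val[g_idx]

    for _ in range(n_iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        # Velocity update: inertia + pull toward personal best + pull toward global best.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([f(p) for p in pos])
        improved = vals < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        g_idx = np.argmin(pbest_val)
        if pbest_val[g_idx] < gbest_val:
            gbest, gbest_val = pbest[g_idx].copy(), pbest_val[g_idx]
    return gbest, gbest_val

# Minimize the 5-dimensional Ackley-style function from the earlier sketch.
best_x, best_val = pso_minimize(ackley, dim=5)
print(best_x, best_val)   # typically lands close to the all-zeros optimum, value near 0
```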

Heuristic Optimization Algorithms - Basics

In computer science, artificial intelligence, and mathematical optimization, a heuristic is a technique designed for solving a problem more quickly when classic methods are too slow, or for finding an approximate solution when classic methods fail to find any exact solution. This is achieved by trading optimality, completeness, accuracy, or precision for speed. In a way, it can be considered a shortcut.

-Wikipedia

I certainly can’t give a better definition than Wikipedia for the word “heuristic”. In essence, an approach or method is called heuristic if it seems good enough and gives you a close-enough solution to your problem, but you cannot prove that it gives you the best possible solution every time.

The plus side is that it is faster than the correct/classical approach; the downside is that it is not as accurate. In many cases a correct/classical approach may not even exist.

The aim of this post is to introduce you to the field of heuristic optimization algorithms, which, put crudely, are algorithms used to optimize, i.e. to find the minimum or maximum of a function.

Each function has a domain and a range. The domain is the set of all possible inputs to the function; the range is the set of all values the function can take. Now, there can be only one maximum value and one minimum value in this range.

So we can say that this minimum/maximum value (let’s just call it the optimum from now on) occurs only at certain point(s) of the domain. That is, the function takes its optimum value only on some subset of all the possible inputs in the domain. If the function is one-to-one, there is exactly one point corresponding to the optimum; otherwise there can be many.

Now that we have established this relation between the optimum of the function and its domain, does that give you a hint on how to optimize a function? Surely, if you had a small number of possible inputs, i.e. a small domain, you would just calculate all possible values of the function and report the maximum or minimum of that set.

So in essence, this optimization problem can be called a search problem: you calculate all possible values, search that set for the maximum/minimum, and call it the optimum.
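
Here is a tiny toy sketch of that brute-force view (my own example, not from the original post), with a domain small enough to enumerate completely:

```python
# Toy example: the domain is small enough to enumerate completely,
# so "optimizing" f really is just searching a finite set of values.
def f(x):
    return x ** 2 - 5

domain = [-3, -2, -1, 0, 1, 2, 3]        # every possible input
values = {x: f(x) for x in domain}       # every input-value pair
best_x = min(values, key=values.get)     # search the set for the smallest value
print(best_x, values[best_x])            # -> 0 -5
```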

This characterization of optimization problems as search problems is crucial. Any problem that can be phrased as “find the best value from the set { }” is essentially a search problem.

Now, how you approach this search problem is a different matter altogether. But since the optimum has to occur somewhere in the function space, you are in fact searching for it.

Now with this post, you have an idea of

  1. What a heuristic algorithm is
  2. What a heuristic optimization algorithm is
  3. Why optimization problems are essentially search problems.

With these ideas in our arsenal we can move on to bigger and better ones. In the next post we will delve further into heuristic optimization algorithms and try to understand them from the perspective of computational ease, real-life applications, and so on.

If you found this post insightful, please press the like button; the comment section below is open for any queries or reviews. You can also e-mail me at sai.virus@gmail.com.