Mail
Contact form
Login
Login

Staracle®
Evolutionary Algorithm Excursion

Hi there, this is an introduction to evolutionary algorithms that I wrote a few years back when I was writing my dissertation at the University of Stuttgart. I still find it quite interesting, and since I put quite some effort into it, I decided to re-publish it here. I hope you enjoy it!

An Introduction to Evolutionary Algorithms

In the past decades, evolutionary algorithms have attracted increasing interest for the solution of optimization problems. In particular, this applies to combinatorial optimization problems, which would otherwise be hard to tackle by exact methods. Note that many combinatorial optimization problems which can be tackled by evolutionary algorithms also allow for other search heuristics, such as simulated annealing, tabu search, or ant colony approaches.
 
Evolutionary algorithms belong to the class of nature-inspired algorithms. They are based on the evolution theory in nature, whose modern foundations were laid by Charles Darwin (1809-1882), but also by other less famous research such as Jean-Baptiste de Lamarck (1744-1829). According to the general understanding of evolution theory, life forms undergo a constant (self-)optimization while trying to adapt to their changing habitat. For example, polar bears have adapted to their cold environment by developing a special kind of fur where every hair functions as a small optical fiber conducting sun light to the skin for heating purposes. This process takes place during a long time period by developing new generations of individuals (also referred to as solutions), where every new generation contains the offsprings of the individuals of the previous generation. All individuals of one generation are referred to as population. Over the course of the generations, the individuals develop new characteristics, which may or may not be favorable for their survival. This leads to one of the basic principles of evolution theory, namely the survival of the fittest. When moving from one generation to the next, a natural selection takes place. Individuals that are well adapted to the environment (having a good fitness) are more likely to survive and reproduce than individuals that are badly adapted (having a bad fitness). The beneficial characteristics of the surviving individuals are passed on to their offsprings. New characteristics are introduced by random mutations that may occur during the reproduction, thus creating offsprings with a better or worse fitness. Over the course of time, this leads to a constant adaptation and optimization of a species to its environment.

Physical Appearance

The physical appearance, properties, and characteristics of an individual are determined by its genetic information. It consists of a certain number of genes, which may take on certain values. The possible values of a gene are referred to as alleles. The genetic information can be regarded as coded information, which is called {\em genotype}. In contrast, the decoded information yields the actual observed physical appearance and characteristics of an individual. This is referred to as phenotype.

The Evolution Process

Several factors govern the course of the evolution process. The {\em mutation} determines how genes are mutated from one generation to the next. A mutation in nature always targets the genotype, i.e., the mutation varies the genes of an offspring. Some of these variations will then alter the phenotype of the new individual, which has an impact on the second important factor, namely the selection. The selection process determines which individuals are chosen to reproduce for the following generation. In nature, it is likely that individuals with a high fitness will mate, but also individuals with a low fitness value have a chance of creating offsprings. The third major factor is the recombination, often referred to as crossover. It determines how the genomes of two individuals are merged during the mating process.
 
Besides these important factors, a large number of other factors influence the evolution. In nature, the phenomenon of co-evolution plays a significant role. It refers to the mutual dependence of different species, such as for example in predator--prey relationships. Furthermore, the inner structure of a population is important. Natural populations are often divided into smaller sub-populations (called demes), separated by rivers or mountain ridges. A limited exchange of genetic information takes place by migration of individuals from one sub-population to the other.
 
It is obvious that the evolution as described above involves a large amount of randomness. This applies, among others, to the mutation, the crossover, and the selection. Therefore, the offsprings of a certain generation are random to a certain degree. However, this randomness is channelized due to the inheritance of features from one generation to the next. Colloquially, one could therefore describe the evolution process as structured probing of the possible solution space.

Appliction to Real World Problems

When applying evolutionary algorithms to real-world optimization problems, several tasks have to be performed. First, a suitable representation of real-world solutions has to be found. This goes along with finding a method for determining the fitness of an individual, where the fitness is usually a real number. Furthermore, the {\em genetic operators} have to be defined, such as selection, mutation, and crossover. Alike, the management of the population has to be decided on. Other considerations may be necessary, depending on the problem and the particular instance of the evolutionary algorithm.

The Flavours of Evolutionary Algorithms

Within the overall field of evolutionary algorithms, three main schools of thought have developed in the early years. These are genetic algorithms, evolution strategies, and evolutionary programming. Initially, these schools were developed independently by relatively small research groups with no significant exchange among them. This has changed since the 1980ies, during which the individual schools were united under the all-embracing term of evolutionary algorithms. Since then, the differences have become somewhat diffuse. Interestingly, this can be regarded as one example of sub-populations and migration. In the following, the three main streams are introduced and characterized.
Classification of evolutionary algorithms
Genetic Algorithms (GAs) date back to the works of John Holland in the 1960ies. Classically, the representation of real-world solutions is done by means of bit strings, i.e., a binary solution coding is used. The bit strings are the genomes of the individuals, hence GAs operate on the level of the genotype. The representation is not limited to bit strings, and many applications suggest a more "natural" representation, for example by arrays of integer or real numbers. There are several standard genetic operators that are more or less suitable depending on the specific problem. Selection is done in a non-extinctive way, meaning that all individuals of a generation have the chance of producing offsprings. The main search operator to traverse the possible solution space is the crossover operator. Mutation is only a background operator whose major responsibility is to prevent the algorithm from being trapped in a local optimum.
 
A special form of genetic algorithms is genetic programming. It is used for program induction, which describes the process of automatic computer program generation. The induction is based on given training data, and the program shall solve a particular problem. The initial population is a set of randomly generated programs. The original work by John Koza is based on the programming language LISP, which has some favorable properties for genetic programming. For example, John Koza evolved a LISP program to generate random numbers. However, genetic programming was also used in conjunction with more common programming languages. Nordin and his fellow-researchers in 1994 used this technique. Machine code programs were automatically generated which were able to distinguish between Swedish nouns and verbs based on their spelling. They showed the superior performance of their approach compared to neural networks.
In contrast to genetic algorithms, which operate on the level of the genotype, Evolution Strategies operate on the level of the phenotype. Evolution strategies thus have a higher degree of abstraction and are often used to optimize continuous decision variables. The representation of solutions is always an array of real numbers. In contrast to genetic algorithms, the selection is deterministic and extinctive, meaning that individuals with a bad fitness value have no chance of producing offsprings. The main search operator is the mutation, which is done by adding normal distributed random numbers to the solution array elements. The standard deviation of the random numbers can be adjusted adaptively during the run time of the algorithm. Besides the mutation operation, the crossover is also used as a second important operator.
Originally, Evolutionary Programming was used to generate artificially intelligent finite state machines. These state machines were used for the prediction of time series. Given a series of elements from a finite alphabet, they should predict the next element. The only search operator is mutation, crossovers are not considered. The mutation step-size, which is a real-numbered random variable like in evolution strategies, depends on the parent's fitness value. As the algorithm approaches the optimum, the mutation step-size becomes smaller. %Self-adaptation may %be introduced by calculating the mutation step-size based on a vector %of adapting standard deviations for each individual. The selection process is stochastic and extinctive.

Summary

As mentioned before, the borderlines between these originally separate schools of thinking have become diffuse. Still, there are some distinct features characterizing each school. However, when applying an evolutionary algorithm to a particular real world problem, it is not beneficial to focus on only one of them. Modular design principle have been proposed, where specific evolutionary algorithms are created by taking building blocks from possibly different schools. These building blocks comprise population models, solution codings, concepts for handling constraints, search operators, selection operators, and modules for statistical evaluation.

References

Further references will be added later.