Neuroevolution

We have already talked about evolutionary algorithms and neural networks. Today’s topic is neuroevolution, i.e. the evolution of neural networks. The topic is currently becoming highly relevant again due to the development of deep learning methods: neuroevolution algorithms have recently been extended into algorithms that search for neural network architectures.

The field of neuroevolution can be divided into three subfields according to what the algorithms evolve - the evolution of network weights, the evolution of topology, and the evolution of both at the same time.

Evolution of network weights

The evolution of neural network weights is the simplest variant from the point of view of encoding an individual. We assume here that the architecture of the neural network is fixed, so the number of its weights is fixed as well, and the whole of neuroevolution reduces to a problem of continuous optimization. Most often, simple evolution strategies are used in this area - a random number drawn from a normal distribution with zero mean and small variance (the same for all weights) is added to each gene of an individual. More complex evolution strategies, such as CMA-ES, are typically not used much, because the overhead of adapting their strategy parameters grows quickly with the number of weights.
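
As a sketch of how simple this setup can be, here is a minimal illustrative evolution strategy, assuming the individual is a flat NumPy vector of all the weights and `fitness` is any callable that evaluates the network (e.g., total reward over a few episodes); all names are made up for this example:

```python
import numpy as np

rng = np.random.default_rng(0)

def mutate(weights, sigma=0.05):
    # Add zero-mean Gaussian noise with the same small standard
    # deviation to every gene (weight) of the individual.
    return weights + rng.normal(0.0, sigma, size=weights.shape)

def evolve_weights(fitness, dim, pop_size=50, n_parents=10, generations=100):
    # Plain evolution strategy: keep the best individuals and refill
    # the population with mutated copies of randomly chosen parents.
    pop = [rng.normal(0.0, 1.0, size=dim) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:n_parents]
        pop = parents + [mutate(parents[rng.integers(n_parents)])
                         for _ in range(pop_size - n_parents)]
    return max(pop, key=fitness)
```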

Why use evolution instead of gradient methods? Gradient methods are often faster for supervised learning, e.g. for training classification networks. But evolution has several significant advantages in reinforcement learning. One of them is much better parallelization and the ability to use many CPU cores - in reinforcement learning, the main bottleneck is the evaluation of the policy in the environment, which can take a long time compared to the time needed to compute the gradients and update the weights. Moreover, this evaluation often cannot be done on a GPU. Evolution strategies parallelize very well, and setups where an evolution strategy runs on hundreds of CPU cores are not exceptional.
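
A minimal sketch of this kind of parallelization, using Python’s standard multiprocessing pool; `episode_return` here is a trivial stand-in for the real, slow environment rollout:

```python
from multiprocessing import Pool

import numpy as np

def episode_return(weights):
    # Stand-in for the expensive, CPU-bound part: in practice this
    # would build a policy network from `weights`, play one or more
    # episodes in the environment, and return the total reward.
    return -float(np.sum(weights ** 2))

def evaluate_population(population, n_workers=8):
    # Evaluations are fully independent, so they map directly onto
    # separate CPU cores; only weight vectors and scalar returns are
    # exchanged between processes.
    with Pool(n_workers) as pool:
        return pool.map(episode_return, population)
```

(On platforms that spawn rather than fork worker processes, the call has to be guarded by `if __name__ == "__main__":`.)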

The second area where evolution strategies have the upper hand is environments with sparse rewards - imagine a game where you only get a score at the end, not at every step along the way. In such a game, common deep reinforcement learning algorithms run into the problem that their value estimates propagate very slowly (after the first playthrough of the game, the estimate is 0 for all but the last state). Evolution strategies do not mind this - they only use the fitness at the end anyway. Sometimes evolution strategies are even combined with deep reinforcement learning algorithms, so that the games played during evolution serve as training data for the networks in the deep reinforcement learning algorithm.

Evolution of weights and topology

A typical algorithm that evolves the weights of a network and its topology at the same time is NEAT (NeuroEvolution of Augmenting Topologies). The NEAT algorithm is somewhat similar to Cartesian genetic programming. Its individuals are lists in which each item describes one connection (weight) in the neural network - which neurons it connects and what its value is. In addition, each connection carries a flag indicating whether it is active in the network, and an “innovation number” - an identifier whose purpose is to tell which connections in two different individuals share the same history and should therefore have a similar function. Innovation numbers are used by the genetic operators.
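
One possible way to write this encoding down, with illustrative field names (a sketch of the idea, not code from any particular NEAT implementation):

```python
from dataclasses import dataclass

@dataclass
class ConnectionGene:
    in_neuron: int    # id of the source neuron
    out_neuron: int   # id of the target neuron
    weight: float     # value of the connection weight
    enabled: bool     # is this connection active in the network?
    innovation: int   # historical marker shared across individuals

# An individual (genome) is then simply a list of such genes:
Genome = list[ConnectionGene]
```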

NEAT uses two types of mutation - adding a neuron and adding an edge. In both cases, every newly created edge receives a fresh innovation number. Adding an edge appends a new edge to the end of the individual. Adding a neuron selects an edge in the individual, splits it in two by inserting a neuron, appends the two new edges to the end of the individual, and sets the original edge to inactive.
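
A sketch of both mutations, reusing the ConnectionGene class from above; the choice of weights for the two new edges (1.0 for the incoming one, the old weight for the outgoing one) follows the original NEAT paper and is not the only possibility:

```python
import random

innovation_counter = 0

def next_innovation():
    # Every newly created edge receives a fresh innovation number.
    global innovation_counter
    innovation_counter += 1
    return innovation_counter

def add_edge(genome, in_neuron, out_neuron, weight=None):
    # Append a new connection gene to the end of the individual.
    genome.append(ConnectionGene(
        in_neuron, out_neuron,
        weight if weight is not None else random.gauss(0.0, 1.0),
        enabled=True, innovation=next_innovation()))

def add_neuron(genome, new_neuron_id):
    # Pick an enabled edge, deactivate it, and bridge it with a new
    # neuron. The incoming edge gets weight 1.0 and the outgoing edge
    # keeps the old weight, so the network's behavior barely changes.
    edge = random.choice([g for g in genome if g.enabled])
    edge.enabled = False
    add_edge(genome, edge.in_neuron, new_neuron_id, weight=1.0)
    add_edge(genome, new_neuron_id, edge.out_neuron, weight=edge.weight)
```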

Crossover in NEAT relies on innovation numbers. The parents are first “aligned” according to innovation numbers. Edges with the same innovation number that are present in both individuals are inherited randomly from one parent or the other. Edges present in only one individual are inherited from the fitter of the two parents. If an edge is inactive in one parent and active in the other, it is active/inactive in the offspring with a given probability.
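
A hedged sketch of this crossover, again building on the ConnectionGene class above (the 0.75 disable probability is the value used in the original paper):

```python
import random
from dataclasses import replace

def crossover(fitter_parent, other_parent, p_disable=0.75):
    # Align genes by innovation number. Matching genes are inherited
    # from a random parent; genes present in only one parent survive
    # only if that parent is the fitter one.
    fitter = {g.innovation: g for g in fitter_parent}
    other = {g.innovation: g for g in other_parent}
    child = []
    for innov, gene in fitter.items():
        chosen = random.choice([gene, other[innov]]) if innov in other else gene
        new_gene = replace(chosen)  # dataclasses.replace() copies the gene
        # An edge disabled in either parent is disabled in the child
        # with probability p_disable.
        if innov in other and not (gene.enabled and other[innov].enabled):
            new_gene.enabled = random.random() >= p_disable
        child.append(new_gene)
    return child
```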

In addition to these operators, NEAT uses another interesting technique - it divides the population into several species. Individuals within each species then explicitly share fitness, i.e. the fitness of each individual is divided by the number of individuals of the same species; this gives newly discovered ideas (structures) time to be refined by the genetic operators before they have to compete with the rest of the population. Species are defined using a distance between individuals, computed from the number of matching and non-matching genes (according to innovation numbers) and from the similarity of the genes present in both individuals.
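
A simplified sketch of the distance and of fitness sharing; unlike the original paper, it lumps “disjoint” and “excess” genes into a single mismatch term, and the coefficients are illustrative:

```python
def distance(genome1, genome2, c_mismatch=1.0, c_weights=0.4):
    # Genes missing in one of the genomes count as mismatches, and
    # genes matched by innovation number contribute their average
    # weight difference.
    genes1 = {g.innovation: g for g in genome1}
    genes2 = {g.innovation: g for g in genome2}
    matching = genes1.keys() & genes2.keys()
    mismatching = len(genes1.keys() ^ genes2.keys())
    n = max(len(genes1), len(genes2))
    w_diff = (sum(abs(genes1[i].weight - genes2[i].weight) for i in matching)
              / max(len(matching), 1))
    return c_mismatch * mismatching / n + c_weights * w_diff

def shared_fitness(raw_fitness, species_size):
    # Explicit fitness sharing within a species.
    return raw_fitness / species_size
```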

At initialization, NEAT starts from minimal structures, i.e. neural networks that do not contain any hidden neurons - all connections are only between inputs and outputs.

HyperNEAT

The HyperNEAT algorithm is an extension of NEAT. In HyperNEAT, the topology of the network is fixed at the beginning - the neurons are placed at coordinates on a substrate, typically a grid - and the weights are represented using another neural network (a so-called CPPN). This network receives as input the coordinates of a pair of neurons in the resulting network and directly returns the weight of the connection between them. The weight-generating network itself is evolved using the NEAT algorithm.
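
A sketch of how the weights of the fixed network might be generated, assuming a 2-D substrate and that `cppn` is a callable wrapping the NEAT-evolved network:

```python
import numpy as np

def substrate_weights(cppn, coords):
    # Query the evolved network once per pair of neuron positions on
    # the substrate; its output is used directly as the weight of the
    # connection between the two neurons. (Implementations usually
    # also zero out weights whose magnitude falls below a threshold.)
    n = len(coords)
    weights = np.zeros((n, n))
    for i, (x1, y1) in enumerate(coords):
        for j, (x2, y2) in enumerate(coords):
            weights[i, j] = cppn(x1, y1, x2, y2)
    return weights
```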

The advantage of the HyperNEAT algorithm is that it creates much more regular networks than NEAT, which can be closer to biological neural networks, and it is also capable of evolving larger networks.

Evolution of architectures

Algorithms that search for the entire structure of a neural network, which is later trained using common gradient techniques, are currently very popular. From the point of view of evolutionary algorithms, it is then necessary to encode the structure of the network (at the level of layers or blocks of neurons) into an individual. If we only want to evolve layers, the situation is quite simple - an individual can be, for example, a sequence of these layers together with their parameters. The operators then modify the parameters of the layers, add new layers, or even add connections between layers that are not adjacent to each other (skip connections).
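
A hypothetical sketch of such a layer-level encoding and its mutation operators (the layer types, parameter names, and concrete choices are all illustrative):

```python
import random

# An individual: a sequence of layer descriptions plus a set of
# skip connections between non-adjacent layers.
individual = {
    "layers": [
        {"type": "conv", "channels": 32, "kernel": 3},
        {"type": "conv", "channels": 64, "kernel": 3},
        {"type": "dense", "units": 128},
    ],
    "skips": {(0, 2)},
}

def mutate_architecture(ind):
    # Example operators: change a layer parameter, insert a new layer,
    # or add a skip connection between two non-adjacent layers.
    op = random.choice(["param", "insert", "skip"])
    if op == "param":
        layer = random.choice(ind["layers"])
        if layer["type"] == "conv":
            layer["channels"] = random.choice([16, 32, 64, 128])
        else:
            layer["units"] = random.choice([64, 128, 256])
    elif op == "insert":
        pos = random.randrange(len(ind["layers"]) + 1)
        ind["layers"].insert(pos, {"type": "conv", "channels": 32, "kernel": 3})
    else:
        i, j = sorted(random.sample(range(len(ind["layers"])), 2))
        if j - i > 1:
            ind["skips"].add((i, j))
```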

There are also algorithms, such as CoDeepNEAT, that encode individuals similarly to NEAT. The difference is that the individual nodes do not contain single neurons, but whole modules of the neural network (blocks of neurons). These modules are themselves evolved using a NEAT-style algorithm. When fitness is evaluated, the final network is assembled by combining the modules as described by the individual.

Novelty search

The Novelty Search algorithm is not tied exclusively to the evolution of neural networks - it is more general - but it is very often used in connection with neuroevolution. The principle of novelty search is to drop the fitness function that would directly evaluate the quality of the solutions found. It is replaced by a function that compares the behavior of the created individuals (e.g., when searching for a path in a maze, one can measure the distance between the positions reached by an individual and the positions reached by the other individuals). It may seem that such an algorithm cannot work very well, but it turns out that the opposite is true: especially in environments where the classic fitness function contains deceptive local optima, novelty search gives good results even in terms of the quality of the solutions found.
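
A sketch of the novelty measure, assuming the behavior of an individual is summarized as a vector (e.g., its final position in the maze) and `others` contains the behaviors of the rest of the population together with an archive of previously seen behaviors; k = 15 is a typical value from the literature:

```python
import numpy as np

def novelty(behavior, others, k=15):
    # Novelty of an individual = mean distance to its k nearest
    # neighbors in behavior space; this replaces the usual fitness,
    # rewarding individuals that behave unlike anything seen so far.
    dists = sorted(float(np.linalg.norm(np.asarray(behavior) - np.asarray(b)))
                   for b in others)
    return float(np.mean(dists[:k]))
```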