While, at present,Â deep learningÂ (DL) is on top in terms of both application and employability, it has close competition with evolutionary algorithms. These algorithms are inspired by the natural process of evolution, the world’s best optimizers. In this article, we will explore what is a genetic algorithm, advantages of genetic algorithms, and various uses of genetic algorithm in optimizing your models.
This article is an excerpt taken from the bookÂ ‘HandsOn Artificial Intelligence for IoT’Â written byÂ Amita Kapoor.Â The book explores building smarter systems by combining artificial intelligence and the Internet of Thingsâ€”two of the most talked about topics today.
Let’s now learn how can we implement the genetic algorithm. Genetic AlgorithmÂ was developed by John Holland in 1975. It was shown that it can be used to solve an optimization problem by his student Goldberg, who usedÂ genetic algorithms to control gas pipeline transmission. Since then,Â genetic algorithms have remained popular, and have inspired various other evolutionary programs.
To apply genetic algorithms in solving optimization problems using the computer, as the first step we will need toÂ encode the problem variables into genes. The genes can be a string of real numbers or a binary bit string (series of 0s and 1’s). This represents a potential solution (individual) and many such solutions together form the population at timeÂ t. For instance, consider a problem where we need to find two variables,Â aÂ andÂ b, such that the two lie in the range (0, 255).Â For binary gene representation, these two variables can be represented by a 16bit chromosome, with theÂ higher 8 bits representing geneÂ aÂ and theÂ lower 8 bits forÂ b.Â The encoding will need to be later decoded to get the real values of the variablesÂ aÂ andÂ b.
The second important requirement for genetic algorithms is defining a properÂ fitness function, which calculates the fitness score of any potential solution (in the preceding example, it should calculate the fitness value of the encoded chromosome). This is the function that we want to optimize by finding the optimum set of parameters of the system or the problem at hand. The fitness function is problemdependent. For example, in the natural process of evolution, the fitness function represents the organism’s ability to operate and to survive in its environment.
Pros and cons of Genetic algorithm
Genetic algorithms sound cool, right! Now, before we try and build code around them, let’s point out certain advantages and disadvantages of genetic algorithms.
Advantages
Genetic algorithms offer some intriguing advantages and can produce results when the tradition gradientbased approaches fail:
 They can be used to optimize either continuous or discrete variables.
 Unlike gradient descent, we do not require derivative information, which also means that there is no need for the fitness function to be continuous and differentiable.
 It can simultaneously search from a wide sampling of the cost surface.
 We can deal with a large number of variables without a significant increase in computation time.
 The generation ofÂ the populationÂ and calculatingÂ theirÂ fitness values can be performed in parallel, and henceÂ genetic algorithms are well suited for parallel computers.
 They can work even when the topological surface is extremely complex because crossover and mutation operators help them in jumping out of a local minimum.
 They can provide more than one optimum solution.
 We can use them with numerically generated data, experimental data, or even analytical functions. They specifically work well for largescale optimization problems.
Disadvantages
Despite the previously mentioned advantages, we still do not findÂ genetic algorithms to be a ubiquitous solution to all optimization problems. This is for the following reasons:
 If the optimization function is a wellbehaved convex function, then gradientbased methods will give a faster convergence
 The large population of solutions that helpsÂ genetic algorithms cover the search space more extensively also results in slow convergence
 Designing a fitness function can be a daunting task
CodingÂ genetic algorithms using Distributed Evolutionary Algorithms in Python
Now that we understand howÂ genetic algorithms work, let’s try solving some problems with them. They have been used to solve NPhard problems such as the traveling salesman problem. To make the task of generating a population, performing the crossover, and performing mutation operations easy, we will make use ofÂ Distributed Evolutionary Algorithms in PythonÂ (DEAP). It supports multiprocessing and we can use it for other evolutionary algorithms as well. You can download DEAP directly fromÂ PyPiÂ using this:
pip install deap
It is compatible with Python 3.
To learn more about DEAP, you can refer to its GitHub repositoryÂ and its user’s guide.
Guess the word
In this program, we useÂ genetic algorithms to guess a word. The genetic algorithm will know the number of letters in the word and will guess those letters until it finds the right answer. We decide to represent the genes as a single alphanumeric character; strings of these characters thus constitute a chromosome. And our fitness function is the sum of the characters matching in the individual and the right word:
 As the first step, we import the modules we will need. We useÂ theÂ stringÂ module andÂ theÂ randomÂ module to generate random characters fromÂ (aâ€”z, Aâ€”Z, and 0â€”9). From the DEAP module, we useÂ creator,Â base, andÂ tools:
import string import random from deap import base, creator, tools
 In DEAP, we start with creating a class that inherits from theÂ deep.baseÂ module. We need to tell it whether we are going to have a minimization or maximization of the function; this is done using the weights parameter. A value ofÂ +1Â means we are maximizing (for minimizing, we give the valueÂ 1.0). The following code line will create a class,Â FitnessMax, that will maximize the function:
creator.create("FitnessMax", base.Fitness, weights=(1.0,))
 We also define anÂ IndividualÂ class, which will inherit the class list, and tell the DEAP creator module to assignÂ FitnessMaxÂ as itsÂ fitnessÂ attribute:
creator.create("Individual", list, fitness=creator.FitnessMax)
 Now, with theÂ IndividualÂ class defined, we use theÂ toolboxÂ of DEAP defined in the base module. We will use it to create a population and define our gene pool.Â All the objects that we will need from now onwardâ€”an individual, the population, the functions, the operators, and the argumentsâ€”are stored in a container calledÂ toolbox.Â We can add or remove content to/from theÂ toolboxÂ containerÂ using theÂ register()Â andÂ unregister()Â methods:
toolbox = base.Toolbox() # Gene Pool toolbox.register("attr_string", random.choice, \ string.ascii_letters + string.digits )
 Now that we have defined how the gene pool will be created, we create an individual and then a population by repeatedly using the Individual class. We will pass the class to the toolbox responsible for creating aÂ N parameterÂ , telling it how many genes to produce:
#Number of characters in word # The word to be guessed word = list('hello') N = len(word) # Initialize population toolbox.register("individual", tools.initRepeat, \ creator.Individual, toolbox.attr_string, N ) toolbox.register("population",tools.initRepeat, list,\ toolbox.individual)
 We define theÂ fitnessÂ function. Note the comma inÂ theÂ return statement. This is becauseÂ theÂ fitness function in DEAP is returned as a tuple to allow multiobjectiveÂ fitnessÂ functions:
def evalWord(individual, word): return sum(individual[i] == word[i] for i in\ range(len(individual))),
 Add the fitness function to the container. Also, add the crossover operator, mutation operator, and parent selector operator. You can see that, for this, we are using the register function. In the first statement, we register the fitness function that we have defined, along with the additional arguments it will take. The next statement registers the crossover operation; it specifies that here we are using a twopoint crossover (cxTwoPoint). Next, we register the mutation operator; we choose theÂ mutShuffleIndexesÂ option, which shuffles the attributes of the input individual with a probabilityÂ indpb=0.05. And finally, we define how the parents are selected; here, we have defined the method of selection as tournament selection with a tournament size ofÂ 3:
toolbox.register("evaluate", evalWord, word) toolbox.register("mate", tools.cxTwoPoint) toolbox.register("mutate", tools.mutShuffleIndexes, indpb=0.05) toolbox.register("select", tools.selTournament, tournsize=3)
 Now we have all the ingredients, so we willÂ write downÂ the code of the genetic algorithm, which will perform the steps we mentioned earlier in a repetitive manner:
def main():
random.seed(64)
# create an initial population of 300 individuals
pop = toolbox.population(n=300)
# CXPB is the crossover probability
# MUTPB is the probability for mutating an individual
CXPB, MUTPB = 0.5, 0.2
print("Start of evolution")
# Evaluate the entire population
fitnesses = list(map(toolbox.evaluate, pop))
for ind, fit in zip(pop, fitnesses):
ind.fitness.values = fit
print(" Evaluated %i individuals" % len(pop))
# Extracting all the fitnesses of individuals in a list
fits = [ind.fitness.values[0] for ind in pop]
# Variable keeping track of the number of generations
g = 0
# Begin the evolution
while max(fits) < 5 and g < 1000:
# A new generation
g += 1
print(" Generation %i " % g)
# Select the next generation individuals
offspring = toolbox.select(pop, len(pop))
# Clone the selected individuals
offspring = list(map(toolbox.clone, offspring))
# Apply crossover and mutation on the offspring
for child1, child2 in zip(offspring[::2], offspring[1::2]):
# cross two individuals with probability CXPB
if random.random() < CXPB:
toolbox.mate(child1, child2)
# fitness values of the children
# must be recalculated later
del child1.fitness.values
del child2.fitness.values
for mutant in offspring:
# mutate an individual with probability MUTPB
if random.random() < MUTPB:
toolbox.mutate(mutant)
del mutant.fitness.values
# Evaluate the individuals with an invalid fitness
invalid_ind = [ind for ind in offspring if not ind.fitness.valid]
fitnesses = map(toolbox.evaluate, invalid_ind)
for ind, fit in zip(invalid_ind, fitnesses):
ind.fitness.values = fit
print(" Evaluated %i individuals" % len(invalid_ind))
# The population is entirely replaced by the offspring
pop[:] = offspring
# Gather all the fitnesses in one list and print the stats
fits = [ind.fitness.values[0] for ind in pop]
length = len(pop)
mean = sum(fits) / length
sum2 = sum(x*x for x in fits)
std = abs(sum2 / length  mean**2)**0.5
print(" Min %s" % min(fits))
print(" Max %s" % max(fits))
print(" Avg %s" % mean)
print(" Std %s" % std)
print(" End of (successful) evolution ")
best_ind = tools.selBest(pop, 1)[0]
print("Best individual is %s, %s" % (''.join(best_ind),\
best_ind.fitness.values))
9. Here, you can see the result of this genetic algorithm. In seven generations, we reached the rightÂ word:
Genetic algorithm for LSTM optimization
In a genetic CNN, we useÂ genetic algorithms to estimate the optimum CNN architecture; in genetic RNN, we will now use a genetic algorithm to findÂ theÂ optimum hyperparameters ofÂ theÂ RNN, the window size, and the number of hidden units. We will find the parameters that reduce theÂ rootmeansquare errorÂ (RMSE) of the model. The hyperparameters window size and number of units are again encoded in a binary string with 6 bits for window size and 4 bits for the number of units. Thus, the complete encoded chromosome will be of 10 bits. The LSTM is implemented using Keras. The code we implement is taken fromÂ https://github.com/aqibsaeed/GeneticAlgorithmRNN:

 The necessary modules are imported. This time, we are using Keras to implementÂ theÂ LSTM model:
import numpy as np import pandas as pd from sklearn.metrics import mean_squared_error from sklearn.model_selection import train_test_split as split from keras.layers import LSTM, Input, Dense from keras.models import Model from deap import base, creator, tools, algorithms from scipy.stats import bernoulli from bitstring import BitArray np.random.seed(1120)
 The dataset we need for LSTM has to be time series data; we use the windpower forecasting data from Kaggle (https://www.kaggle.com/c/GEF2012windforecasting/data):
data = pd.read_csv('train.csv') data = np.reshape(np.array(data['wp1']),(len(data['wp1']),1)) train_data = data[0:17257] test_data = data[17257:]
 Define a function to prepare the dataset depending upon the chosenÂ window_size:
def prepare_dataset(data, window_size): X, Y = np.empty((0,window_size)), np.empty((0)) for i in range(len(data)window_size1): X = np.vstack([X,data[i:(i + window_size),0]]) Y = np.append(Y,data[i + window_size,0]) X = np.reshape(X,(len(X),window_size,1)) Y = np.reshape(Y,(len(Y),1)) return X, Y
 TheÂ train_evaluateÂ functionÂ creates the LSTM network for a given individual and returns its RMSE value (fitness function):
def train_evaluate(ga_individual_solution): # Decode genetic algorithm solution to integer for window_size and num_units window_size_bits = BitArray(ga_individual_solution[0:6]) num_units_bits = BitArray(ga_individual_solution[6:]) window_size = window_size_bits.uint num_units = num_units_bits.uint print('\nWindow Size: ', window_size, ', Num of Units: ', num_units) # Return fitness score of 100 if window_size or num_unit is zero if window_size == 0 or num_units == 0: return 100, # Segment the train_data based on new window_size; split into train and validation (80/20) X,Y = prepare_dataset(train_data,window_size) X_train, X_val, y_train, y_val = split(X, Y, test_size = 0.20, random_state = 1120) # Train LSTM model and predict on validation set inputs = Input(shape=(window_size,1)) x = LSTM(num_units, input_shape=(window_size,1))(inputs)
predictions = Dense(1, activation='linear')(x) model = Model(inputs=inputs, outputs=predictions) model.compile(optimizer='adam',loss='mean_squared_error') model.fit(X_train, y_train, epochs=5, batch_size=10,shuffle=True) y_pred = model.predict(X_val) # Calculate the RMSE score as fitness score for GA rmse = np.sqrt(mean_squared_error(y_val, y_pred)) print('Validation RMSE: ', rmse,'\n') return rmse,
 Next, we use DEAP tools to define Individual (again, since the chromosome is represented by a binary encoded string (10 bits), we use Bernoulli’s distribution), create the population, use ordered crossover, use mutShuffleIndexes mutation, and use the roulette wheel selection for selecting the parents:
population_size = 4 num_generations = 4 gene_length = 10 # As we are trying to minimize the RMSE score, that's why using 1.0. # In case, when you want to maximize accuracy for instance, use 1.0 creator.create('FitnessMax', base.Fitness, weights = (1.0,)) creator.create('Individual', list , fitness = creator.FitnessMax) toolbox = base.Toolbox() toolbox.register('binary', bernoulli.rvs, 0.5) toolbox.register('individual', tools.initRepeat, creator.Individual, toolbox.binary, n = gene_length) toolbox.register('population', tools.initRepeat, list , toolbox.individual) toolbox.register('mate', tools.cxOrdered) toolbox.register('mutate', tools.mutShuffleIndexes, indpb = 0.6) toolbox.register('select', tools.selRoulette) toolbox.register('evaluate', train_evaluate) population = toolbox.population(n = population_size) r = algorithms.eaSimple(population, toolbox, cxpb = 0.4, mutpb = 0.1, ngen = num_generations, verbose = False)
 We get the best solution, as follows:
best_individuals = tools.selBest(population,k = 1) best_window_size = None best_num_units = None for bi in best_individuals: window_size_bits = BitArray(bi[0:6]) num_units_bits = BitArray(bi[6:]) best_window_size = window_size_bits.uint best_num_units = num_units_bits.uint print('\nWindow Size: ', best_window_size, ', Num of Units: ', best_num_units)
 And finally, weÂ implement the best LSTM solution:
X_train,y_train = prepare_dataset(train_data,best_window_size) X_test, y_test = prepare_dataset(test_data,best_window_size) inputs = Input(shape=(best_window_size,1)) x = LSTM(best_num_units, input_shape=(best_window_size,1))(inputs) predictions = Dense(1, activation='linear')(x) model = Model(inputs = inputs, outputs = predictions) model.compile(optimizer='adam',loss='mean_squared_error') model.fit(X_train, y_train, epochs=5, batch_size=10,shuffle=True) y_pred = model.predict(X_test) rmse = np.sqrt(mean_squared_error(y_test, y_pred)) print('Test RMSE: ', rmse)
Yay! Now, you have the best LSTM network for predicting wind power.
In this article, we looked at an interesting natureinspired algorithm family: genetic algorithms. We learned how to convert our optimization problems into a form suitable for genetic algorithms. Crossover and mutation, two very crucial operations in genetic algorithms, were explained.Â We applied what we learned from two very different optimization problems. We used it to guess a word and to find the optimum hyperparameters for an LSTM network.Â Â If you want to explore more topics related to genetic algorithms, be sure to check out the bookÂ ‘HandsOn Artificial Intelligence for IoT’.
Read Next
How AI is transforming the Smart Cities IoT? [Tutorial]
How to stop hackers from messing with your home network (IoT)
Defending your business from the next wave of cyberwar: IoT Threats