Natural Computing Blog

GA Analysis

Christmas is approaching… Temperatures are dropping… and I keep busy experimenting with Swarm (Artificial) Intelligence.

So whilst on one side I’m still working on the “knapsack challenge” I embarked on at the end of my last video blog, this week I’m also going to break down how to adapt a mathematical task for Genetic Algorithms.

The problem

Assume that our GA has chromosomes of the following type:

ch = (x0, x1, x2, x3, x4, x5, x6, x7)

x0-7 can be any digit between zero to nine.
The fitness of each chromosome is going to be calculated using the following formula:

f(ch) = (x0 + x1) − (x2 + x3) + (x4 + x5 ) − (x6 + x7)

This problem is a maximisation problem.

In this scenario we initially have a population of four chromosomes as shown below:

  • ch1 = (6, 5, 4, 1, 3, 5, 3, 2)
  • ch2 = (8, 7, 1, 2, 6, 6, 0, 1)
  • ch3 = (2, 3, 9, 2, 1, 2, 8, 5)
  • ch4 = (4, 1, 8, 5, 2, 0, 9, 4)

SETUP:

For this problem we will use python. So let’s start and create a new .py script that we can edit and run with an IDE such as Spyder or with the combination of a text editor and a terminal command prompt.

We then want to create an agent class and a population class to store data in a nice way. Both will be implemented in a separate script, which will be sequentially called by the main script:

import numpy as np
import random
import numbers
class Agent :
    """A simple agent class for a genetic algorithm"""
 
    # declare the variables that we want for this class
    genotype = []
    length = 0
    fitness = 0
    def fitness_func():
    return 0
 
    def __init__(self, length_or_genotype, fitness_func) : # initialiser == constructor
        # initialise genotype array with an array of
        # randomly permutated numbers from 0 to length-1
        if isinstance(length_or_genotype, numbers.Number) :
            self.length = length_or_genotype
            self.genotype = np.random.permutation(length_or_genotype)
        # or directly with an externally passed genotype 
        else:
            self.genotype = length_or_genotype
            self.length = len(length_or_genotype)
        # then set the fitness function by passing it in
        self.fitness_func = fitness_func 

}

…then our population:

import agent as ag
import numpy as np
import random
import copy as cp

class Population :
     """A container of agents for a genetic algorithm"""
 
     #agent = ag.Agent(8) # initialise the agent object with a genotype lenght of 8
     length = 0
     members = []
     def fit_func(): return 0


     def __init__(self, pop_len, ag_len, fitness_func) :
          self.fit_func = fitness_func
          self.length = pop_len
          self.members = np.empty(pop_len, dtype=object)
          for i in range(0, self.length) :
          self.members[i] = ag.Agent(ag_len, fitness_func)
 
     @classmethod
     def from_pop_list(self, list_of_agents, fitness_func):
          self = self(len(list_of_agents), len(list_of_agents[0]), fitness_func)
          self.members = np.empty(self.length, dtype=object)
     for i in range(0, self.length):
          self.members[i] = ag.Agent(list_of_agents[i], fitness_func)
     return self
}

Whilst in our main script we will have:

from population import Population
import numpy as np
import pandas as pd

# This problem is a mathematical maximisation problem

# define custom fitness function:
# f(ch) = (x0 + x1) − (x2 + x3) + (x4 + x5 ) − (x6 + x7)
def fit_func(chrom):
     fitness = (chrom[0] + chrom[1]) - (chrom[2] + chrom[3])\
              + (chrom[4] + chrom[5]) - (chrom[6] + chrom[7])
     return fitness

# initialise a pupulation with 100 agents,
# each with a genotype: ch = (x0, x1, x2, x3, x4, x5, x6, x7)
# where x0-7 can be any digit between zero to nine. 
# then pass in the fitness function we've just defined
pop_list = [
 [6, 5, 4, 1, 3, 5, 3, 2],
 [8, 7, 1, 2, 6, 6, 0, 1],
 [2, 3, 9, 2, 1, 2, 8, 5],
 [4, 1, 8, 5, 2, 0, 9, 4]
 ]
p = Population.from_pop_list(pop_list, fit_func)

 

Now we have our basic program that will spawn a pre-defined population of agents, but could also create a random one of any give size if we wanted to

PART I:

Let’s stick to our ideal scenario with 4 agents and calculate the fitness for each individual. We’ll do this by adding a method for each agent to “assess itself” based on the fitness function that we pass in from the main script.

We will then create a method for the population class to evaluate all the agents in the swarm and sort them in ascending order.

Let’s start from the agent self-evaluation by adding the following class method:

 def evaluate(self):
     self.fitness = self.fitness_func(self.genotype)
     return self

Then le’ts go into our population script and add the following function:

 
 # evaluate all the members of the population
 # and sort them in order of fitness
 def evaluate(self):
     temp = cp.copy(self.members) #create a working copy of the members
     arr = np.zeros(self.length) #create an empty array of the same length
     for i in range(0,self.length): #evaluate the members and store their fitness
         self.members[i].evaluate()
         arr[i]=self.members[i].fitness 
     indx = np.argsort(arr) #use numpy's argsort to get a sorted array of indices
     for i in range(0,self.length):
          self.members[self.length-1-i] = temp[indx[i]]
     return self.members # return the population sorted in descending order

Now we can test this in our main script by trying out the following snippet:

for ind in p.members:
     print(ind.genotype, ind.fitness)

print('--------')
p.evaluate()
 
for ind in p.members:
     print(ind.genotype, ind.fitness)

If we run our main script we should now get this:

In [83]:
Reloaded modules: population, agent
[6, 5, 4, 1, 3, 5, 3, 2] 0
[8, 7, 1, 2, 6, 6, 0, 1] 0
[2, 3, 9, 2, 1, 2, 8, 5] 0
[4, 1, 8, 5, 2, 0, 9, 4] 0
--------
[8, 7, 1, 2, 6, 6, 0, 1] 23
[6, 5, 4, 1, 3, 5, 3, 2] 9
[2, 3, 9, 2, 1, 2, 8, 5] -16
[4, 1, 8, 5, 2, 0, 9, 4] -19

 

Brilliant! We have our population spawning and our fitness function works well when called for evaluation. The order of sorting is also correct.

 

PART II:

Now we want to apply the following crossover methods to the chromosomes:

  1. One-point crossover (at the middle) to cross the best two chromosomes. [This would lead in creating the first two children]
  2. Two-point crossover (after x1 and after x5) to cross the 2nd and 3rd best individual. [This would results in creating two extra children]
  3. Uniform crossover (the way you’d like to) to cross the 1st and 3rd best chromosomes. [Now we should have two more children; six in total]

So first of all we need to create the crossover methods as population methods. There will be three types of crossover so we will code 3 functions:

 def crossover_one(self, agentX, agentY) :
     chromo_len = agentX.length # length of a valid chromosome
     # create a 2D list to store children's chromosomes
     child1_gen = np.empty(chromo_len) 
     child2_gen = np.empty(chromo_len) 
 
     # one-point middle crosss
     mid_index = int(chromo_len/2)
 
     for i in range(0, mid_index):
         child1_gen[i]=agentX.genotype[i]
         child2_gen[i]=agentY.genotype[i]
     for i in range(mid_index, chromo_len):
         child2_gen[i]=agentX.genotype[i]
         child1_gen[i]=agentY.genotype[i]
 
     ch1 = ag.Agent(child1_gen, agentX.fitness_func)
     ch2 = ag.Agent(child2_gen, agentX.fitness_func)
 
     return ch1, ch2
 
 def crossover_two(self, agentX, agentY):
     chromo_len = agentX.length # length of a valid chromosome
     # create a 2D list to store children's chromosomes
     child1_gen = np.empty(chromo_len) 
     child2_gen = np.empty(chromo_len) 
 
     # two-point "fixed" crossings
     q1 = int(chromo_len/4)
     q2 = chromo_len - q1
 
     for i in range(0, q1):
         child1_gen[i]=agentX.genotype[i]
         child2_gen[i]=agentY.genotype[i]
     for i in range(q1, q2):
         child2_gen[i]=agentX.genotype[i]
         child1_gen[i]=agentY.genotype[i]
     for i in range(q2, chromo_len):
         child1_gen[i]=agentX.genotype[i]
         child2_gen[i]=agentY.genotype[i]
 
     ch1 = ag.Agent(child1_gen, agentX.fitness_func)
     ch2 = ag.Agent(child2_gen, agentX.fitness_func)
 
     return ch1, ch2
 
 
 def crossover_uni(self, agentX, agentY):
     chromo_len = agentX.length # length of a valid chromosome
     # create a 2D list to store children's chromosomes
     child1_gen = np.empty(chromo_len) 
     child2_gen = np.empty(chromo_len) 
    
     # Uniform crossing with 50% probability
     p = 0.5
 
     for i in range(0, chromo_len):
         u = np.random.random()
         if(u<p):
             child1_gen[i]=agentX.genotype[i] 
             child2_gen[i]=agentY.genotype[i] 
         else:
             child1_gen[i]=agentY.genotype[i] 
             child2_gen[i]=agentX.genotype[i] 
 
     ch1 = ag.Agent(child1_gen, agentX.fitness_func)
     ch2 = ag.Agent(child2_gen, agentX.fitness_func)
 
     return ch1, ch2

Then we need to write a method that we can call from the main script to advance by one generation (by calling the various crossing over reproductions)

 # go to the next generation:
 # Use one-point crossover to cross the best 2 chromosomes
 # Use two-point crossover to cross the 2nd and 3rd best individuals
 # Use uniform crossover to cross the 1st and 3rd best chromosomes
 def next_gen(self):
     self.evaluate()

     temp_pop = cp.copy(self.members)
     self.members = np.empty(6, dtype=object)
 
     self.members[0], self.members[1]=\
             self.crossover_one(temp_pop[0],temp_pop[1])
     self.members[2], self.members[3]=\
             self.crossover_one(temp_pop[1],temp_pop[2])
     self.members[4], self.members[5]=\
             self.crossover_one(temp_pop[0],temp_pop[2])

PART III:

After preparing the methods, we can test what we obtain with this “next generation” method by testing it in the main script. We will add the following lines to the previous test:

for ind in p.members:
    print(ind.genotype, ind.fitness)

print('--------')
    p.evaluate()
 
for ind in p.members:
    print(ind.genotype, ind.fitness)

print('--------') 
p.next_gen()
 
p.evaluate()

for ind in p.members:
    print(ind.genotype, ind.fitness)

Which will yield a result similar to this:

In [113]:
Reloaded modules: population, agent
[6, 5, 4, 1, 3, 5, 3, 2] 0
[8, 7, 1, 2, 6, 6, 0, 1] 0
[2, 3, 9, 2, 1, 2, 8, 5] 0
[4, 1, 8, 5, 2, 0, 9, 4] 0
--------
[8, 7, 1, 2, 6, 6, 0, 1] 23
[6, 5, 4, 1, 3, 5, 3, 2] 9
[2, 3, 9, 2, 1, 2, 8, 5] -16
[4, 1, 8, 5, 2, 0, 9, 4] -19
--------
[ 6. 5. 4. 1. 6. 6. 0. 1.] 17.0
[ 8. 7. 1. 2. 3. 5. 3. 2.] 15.0
[ 2. 3. 9. 2. 6. 6. 0. 1.] 5.0
[ 8. 7. 1. 2. 1. 2. 8. 5.] 2.0
[ 2. 3. 9. 2. 3. 5. 3. 2.] -3.0
[ 6. 5. 4. 1. 1. 2. 8. 5.] -4.0

From which we see that the “worst” part of the population is fitter, but that the best overall has actually a lower fitness

PART IV:

Now, since the mathematical formula is quite simple, can we think of which “genetic” (i.e. numerical) sequence would obtain the best possible fitness?

Let’s remind ourselves about the equation: f(ch) = (x0 + x1) − (x2 + x3) + (x4 + x5 ) − (x6 + x7)

Clearly, with genes being numbers between 0 and 9, we can see that the optimal solution would be:

(9+9) – (0+0) + (9+9) – (0+0), or a list of the form [9,9,0,0,9,9,0,0]

This, however should bring about a question for us…

PART V:

Looking at the initial population and based on the answer to part IV, is it possible to reach the optimal state without the mutation operator? In case the word mutation brought confusion, let’s state that the mutation operator is a function that  selects random genes in random individuals and change them into any other valid gene.

If we try and visualise what the crossover does by following the code, we realise that the crossover methods literally mix pieces of code from two parents so that the children have some genes from one and some from the second.

Of course, this means that the only possible children chromosomes in any generation are recombinations of the parents’ ones. Therefore, the answer is no, it is not possible to reach the best fitness just without mutating genes in the individuals.

PART VI:

Back to our code. Let’s now implement GA to solve this problem. We will keep the population size constant from one generation to the next, purely to stay loyal to the soul of the GA inventor!

We will first add mutation as an operator for our Agent class:

 def mutate(self, chance_as_n_between_0_and_1=0.05):
     for i in range(0, self.length):
          self.genotype[i] = int(random.uniform(0, 10)) if \
                   (random.uniform(0, 1) < chance_as_n_between_0_and_1) else self.genotype[i]

And then modify our next_gen() population method to incorporate the mutate operator:

 def next_gen(self):
     for i in range(0,self.length):
          self.members[i].mutate()
 
     self.evaluate() 

...

So that we can call our next_gen() method in the main script and advance by one generation:

p.evaluate()

for ind in p.members:
    print(ind.genotype, ind.fitness)

print('--------') 
p.next_gen()
 
p.evaluate()

for ind in p.members:
    print(ind.genotype, ind.fitness)

From now on the population will in of 6 individuals and will continue to change at each iteration. Our test n#1 was no different from our previous test (mutation did not take place by chance), whilst our test n#2 yielded this result:

In [181]: runfile('/Users/pesa/Documents/UNI/natComp/ga2/ga_algo.py', wdir='/Users/pesa/Documents/UNI/natComp/ga2')
Reloaded modules: population, agent
[8, 7, 1, 2, 6, 6, 0, 1] 23
[6, 5, 4, 1, 3, 5, 3, 2] 9
[2, 3, 9, 2, 1, 2, 8, 5] -16
[4, 1, 8, 5, 2, 0, 9, 4] -19
--------
[ 6. 5. 4. 1. 6. 6. 0. 1.] 17.0
[ 8. 3. 1. 2. 3. 5. 3. 2.] 11.0
[ 2. 3. 9. 2. 6. 6. 0. 1.] 5.0
[ 8. 3. 1. 2. 7. 2. 8. 5.] 4.0
[ 6. 5. 4. 1. 7. 2. 8. 5.] 2.0
[ 2. 3. 9. 2. 3. 5. 3. 2.] -3.0

By looping our “next_gen()” call in the main script we can simulate an evolutionary progression.

generations = 10
for i in range (0, generations): 
 p.next_gen()

In the above case we obtained this result at our first trial:

[ 8. 8. 1. 2. 6. 8. 0. 1.] 26.0
[ 8. 8. 1. 2. 6. 8. 0. 1.] 26.0
[ 8. 8. 1. 2. 6. 8. 0. 1.] 26.0
[ 8. 8. 1. 2. 6. 8. 0. 1.] 26.0
[ 8. 8. 1. 2. 6. 6. 0. 1.] 24.0
[ 8. 8. 1. 2. 6. 6. 0. 1.] 24.0

Whilst increasing the generations to 20 brought about this final population:

[ 9. 7. 1. 2. 9. 6. 0. 1.] 27.0
[ 9. 7. 1. 2. 9. 6. 0. 1.] 27.0
[ 9. 7. 1. 2. 9. 6. 0. 1.] 27.0
[ 9. 7. 1. 2. 9. 6. 0. 1.] 27.0
[ 9. 7. 1. 2. 9. 6. 0. 1.] 27.0
[ 9. 7. 1. 2. 9. 6. 0. 1.] 27.0

With 100, we obtained immediately our optimal solution:

[ 9. 9. 0. 0. 9. 9. 0. 0.] 36.0
[ 9. 9. 0. 0. 9. 9. 0. 0.] 36.0
[ 9. 9. 0. 0. 9. 9. 0. 0.] 36.0
[ 9. 9. 0. 0. 9. 9. 0. 0.] 36.0
[ 9. 9. 0. 0. 9. 9. 0. 0.] 36.0
[ 9. 9. 0. 0. 9. 9. 0. 0.] 36.0

CONCLUSION:

We have seen how a standard Genetic Algorithm works, step-by-step. We have also seen the importance of both reproduction with crossover and mutation, whilst indirectly observing how the best fitness can be “transmitted” across generations even without specifically keeping the fittest individuals.

This concludes our study and step-by-step coding of the algorithm. You can find the source code, as usual, on my gitHub.

 

Ciao!

 

Francesco

Leave a Reply

Your e-mail address will not be published. Required fields are marked *