遗传算法及Python实现

遗传算法虽然相比于深度学习，神经网络等新兴的人工智能算法过于原始，相比于有严格数学理论的优化算法又显得那么粗糙，但是作为一类思路独特的算法还是值得了解一下的。

介绍

遗传算法（Genetic Algorithm）是美国J.Holland教授于1975年首先提出的一种启发式优化方法，灵感来自于进化理论：它模拟了生物进化过程中的自然选择、遗传机制和变异规律，用于解决搜索和优化问题。遗传算法的基本思想是将问题的解表示为一组基因序列，然后使用遗传操作（如选择、交叉和变异）对这些基因序列进行进化（迭代），使优秀的解逐渐从种群中浮现出来。

遗传算法作为一种启发式优化方法，具有如下特点：

适用性：遗传算法的计算框架是普适性的，可以应用于求解各种优化问题，特别是对于搜索空间庞大、复杂的优化问题，如组合优化、参数优化等
并行性：可以并行处理多个解决方案，提高搜索效率
全局或局部最优：由于具有随机性和多样性，遗传算法有助于跳出局部最优解，更有可能找到全局最优解，但是这依赖于初始种群的生成和参数的设置
灵活性/复杂性：包含很多待定参数，需要根据问题的特点进行调整和改进，这既是算法的灵活性体现，也是算法自身的复杂性
维度问题：遗传算法的计算策略是基于群体的进化而不是单个解的搜索，对于高维问题的计算量不会随着维度而剧增，但是可能需要更多的迭代次数
收敛性：数学上无法证明任何收敛速度，这是遗传算法的巨大劣势，同时由于算法的随机性，求解结果可能不稳定，有时难以给出可靠的结果

基本概念

遗传算法中有很多模仿生物进化理论的基本概念：

个体（Individual）：可行解
种群（Population）：可行域
染色体（Chromosome）：可行解所对应的编码
基因（Gene）：可行解编码的组成部分（染色体由基因组成）
适应性评估（Fitness Assignment）：计算目标函数的函数值（希望目标函数最大化），得到适应度
选择（Selection）：保留适应度更高的可行解（与进化论的生存斗争和适者生存相对应）
遗传/繁殖：用于在现有可行解的基础上产生新的可行解，包括两类方法（与生物遗传相对应）
- 交叉（Crossover）：将两个可行解内的组分随机交换（染色体交换基因片段）
- 变异（Mutation）：随机挑选可行解中的某些组分进行变异（基因突变）
进化迭代: 重复进行选择和遗传过程，不断生成新的个体群体，直到达到终止条件

遗传算法可以求解如下的一般最优化问题 \[ x^* = \text{argmax}_{x \in \Omega} f(x) \] 其中目标函数\(f\)不需要任何假设，只需要保证全局最优解是存在的，并且可行解 \(x \in \Omega\) 可以通过一种合适的方法转换为基因编码即可。

遗传算法的流程如下，通常设置固定的迭代次数来终止

求解示例

这里参考网上的教程，选择一个比较直观的一维连续优化问题 \[ x^* = \text{argmax}_{x \in \Omega} f(x) = \text{argmax}_{x \in [0,9]} x+10\sin(5x)+7\cos(4x) \] \(f\)的函数曲线如下图，全局最优解在 \(x^*=7.85\) 附近，注意到有很多局部最优解。

编码转换

我们需要将可行解（0-9的浮点数）转换为代表染色体的编码，例如转化为固定长度的01字符串，指定字符串的长度为\(n\)，则一共可以表示\(2^n\)个值，我们选择\(n=17\)可以保证精度\(\varepsilon = (9-0)/2^{17} \approx 0.0000686646\)。

浮点数和染色体编码的转换公式如下

# 编码函数
# 将实数值映射为二进制编码（基因型）
def encode(value, lower, upper, chromosome_size):
    normalized_value = (value - lower) / (upper - lower)  # 归一化到 [0, 1] 区间
    decimal_value = int(normalized_value * (2**chromosome_size - 1))  # 将实数值转换为十进制
    binary_string = format(decimal_value, f"0{chromosome_size}b")  # 格式化为二进制字符串
    return binary_string


# 解码函数
# 将二进制编码（基因型）解码为实数值（表现型）
def decode(chromosome, lower, upper):
    decimal_value = int(chromosome, 2)  # 将二进制编码转换为十进制值
    decoded_value = lower + decimal_value * (upper - lower) / (2 ** len(chromosome) - 1)
    return decoded_value

示例如下

chromosome = "11010100110001"
print("编码:", chromosome)

result = decode(chromosome, 0, 9)
print("实数值:", result)

chromosome2 = encode(result, 0, 9, len(chromosome))
print("再次编码:", chromosome2)

其中编码函数通常是不需要的，随机初始生成的就是二进制编码。

初始化

我们可以随机生成初始的种群，需要设置染色体长度（即二进制编码长度）和种群数目这两个参数

# 初始化种群
def initialize_population(population_size, chromosome_size):
    population = []
    for _ in range(population_size):
        chromosome = "".join(
            [str(random.randint(0, 1)) for _ in range(chromosome_size)]
        )
        population.append(chromosome)
    return population

适应性评估

实现代码如下

# 自变量范围
x_upper = 9.0
x_lower = 0.0


# 目标函数，即适应度函数
# f(x) = x+10*sin(5*x)+7*cos(4*x)
def fitness_function(x):
    global f_lower
    return x + 10 * np.sin(5 * x) + 7 * np.cos(4 * x)


# 计算染色体的适应度值
def evaluate_fitness(population):
    global x_lower, x_upper
    fitness_values = []
    for chromosome in population:
        x = decode(chromosome, x_lower, x_upper)  # 解码为实数值 x
        fitness = fitness_function(x)  # 计算适应度
        fitness_values.append(fitness)
    return fitness_values

轮盘赌选择

实现代码如下

# 轮盘赌
# 随机生成一个与原始种群同规模的候选种群，用于后续的交叉变异
def select(population, fitness_values):
    selected = []
    min_fitness = min(fitness_values)
    if min_fitness < 0:
        # 调整适应度值，使其全部为非负数
        adjusted_fitness = [fitness - min_fitness for fitness in fitness_values]
    else:
        adjusted_fitness = fitness_values

    # 适应度决定了轮盘赌中指定位置抽到的概率
    total_fitness = sum(adjusted_fitness)
    probabilities = [fitness / total_fitness for fitness in adjusted_fitness]

    # 通过轮盘赌重复随机添加一个元素，直到生成与原种群规模相同的数据
    for _ in range(len(population)):
        selected.append(population[np.random.choice(len(population), p=probabilities)])
    return selected

交叉与变异

实现代码如下

# 交叉
def crossover(selected):
    new_population = []
    for i in range(0, len(selected), 2):  # 对基因组按照两两相邻元素配对，并随机交换它们的片段
        crossover_point = random.randint(1, len(selected[i]) - 1)  # 随机确定交换位置
        child1 = selected[i][:crossover_point] + selected[i + 1][crossover_point:]
        child2 = selected[i + 1][:crossover_point] + selected[i][crossover_point:]
        new_population.extend([child1, child2])
    return new_population


# 变异，包含一个变异率参数
def mutate(population, mutation_rate):
    for i in range(len(population)):
        if random.random() < mutation_rate:  # 随机判断是否变异和变异位置
            mutation_point = random.randint(0, len(population[i]) - 1)
            # 在变异位置将0换成1，将1换成0，生成新字符串
            population[i] = (
                population[i][:mutation_point]
                + str(1 - int(population[i][mutation_point]))
                + population[i][mutation_point + 1 :]
            )

    return population

主函数

遗传算法的主函数实现代码如下

# 遗传算法主函数
def genetic_algorithm(population_size, chromosome_size, generations, mutation_rate):
    global x_lower, x_upper

    # 随机初始化种群
    population = initialize_population(population_size, chromosome_size)

    # 迭代演化
    for _ in range(generations):
        fitness_values = evaluate_fitness(population)
        selected = select(population, fitness_values)
        population = mutate(crossover(selected), mutation_rate)

    # 最优解排序函数
    def sort_func(x):
        return fitness_function(decode(x, x_lower, x_upper))

    # 获取最优解并解码返回
    best_chromosome = max(population, key=sort_func)
    return decode(best_chromosome, x_lower, x_upper)

使用示例如下

population_size = 400  # 种群规模
chromosome_size = 17   # 编码长度
generations = 100      # 迭代次数
mutation_rate = 0.1    # 突变概率

best_solution = genetic_algorithm(
    population_size, chromosome_size, generations, mutation_rate
)

print("找到的最优解 x:", best_solution)
print("对应的适应度值 f(x):", fitness_function(best_solution))

运行结果如下

1 2	找到的最优解 x: 7.856657841932998 对应的适应度值 f(x): 24.85536152100544

求解封装

前面的示例是基于特定问题的，面向过程的程序，下面可以将遗传算法的主要过程封装成一个求解器类，使得求解过程更加简洁，并且可以适应一般性的问题。

import random
import numpy as np


# 遗传算法的类封装
class GASolver:
    # 初始化，需要传递适应度函数和解码函数，以及必要的参数
    def __init__(
        self,
        fitness_func,
        decode_func,
        population_size,
        chromosome_size,
        generations,
        mutation_rate,
    ):
        self.fitness_func = fitness_func
        self.decode_func = decode_func
        self.population_size = population_size
        self.chromosome_size = chromosome_size
        self.generations = generations
        self.mutation_rate = mutation_rate

    # 计算染色体的适应度值
    def _evaluate_fitness(self, population):
        fitness_values = []
        for chromosome in population:
            x = self.decode_func(chromosome)  # 解码为实数值 x
            fitness = self.fitness_func(x)  # 计算适应度
            fitness_values.append(fitness)
        return fitness_values

    # 初始化种群
    def _initialize_population(self):
        population = []
        for _ in range(self.population_size):
            chromosome = "".join(
                [str(random.randint(0, 1)) for _ in range(self.chromosome_size)]
            )
            population.append(chromosome)
        return population

    # 轮盘赌
    # 随机生成一个与原始种群同规模的候选种群，用于后续的交叉变异
    def _select(self, population, fitness_values):
        selected = []
        min_fitness = min(fitness_values)
        if min_fitness < 0:
            # 调整适应度值，使其全部为非负数
            adjusted_fitness = [fitness - min_fitness for fitness in fitness_values]
        else:
            adjusted_fitness = fitness_values

        # 适应度决定了轮盘赌中指定位置抽到的概率
        total_fitness = sum(adjusted_fitness)
        probabilities = [fitness / total_fitness for fitness in adjusted_fitness]

        # 通过轮盘赌重复随机添加一个元素，直到生成与原种群规模相同的数据
        for _ in range(len(population)):
            selected.append(
                population[np.random.choice(len(population), p=probabilities)]
            )
        return selected

    # 交叉
    def _crossover(self, selected):
        new_population = []
        for i in range(0, len(selected), 2):  # 对基因组按照两两相邻元素配对，并随机交换它们的片段
            crossover_point = random.randint(1, len(selected[i]) - 1)  # 随机确定交换位置
            child1 = selected[i][:crossover_point] + selected[i + 1][crossover_point:]
            child2 = selected[i + 1][:crossover_point] + selected[i][crossover_point:]
            new_population.extend([child1, child2])
        return new_population

    # 变异
    def _mutate(self, population):
        for i in range(len(population)):
            if random.random() < self.mutation_rate:  # 随机判断是否变异和变异位置
                mutation_point = random.randint(0, len(population[i]) - 1)
                # 在变异位置将0换成1，将1换成0，生成新字符串
                population[i] = (
                    population[i][:mutation_point]
                    + str(1 - int(population[i][mutation_point]))
                    + population[i][mutation_point + 1 :]
                )

        return population

    # 主程序
    def run(self):
        # 初始化种群
        population = self._initialize_population()

        # 迭代演化
        for _ in range(self.generations):
            fitness_values = self._evaluate_fitness(population)
            selected = self._select(population, fitness_values)
            population = self._mutate(self._crossover(selected))

        # 最优解排序函数
        def sort_func(x):
            return self.fitness_func(self.decode_func(x))

        # 获取最优解并解码返回
        best_chromosome = max(population, key=sort_func)
        best_x = self.decode_func(best_chromosome)
        best_value = self.fitness_func(best_x)
        return (best_x, best_value)

使用时需要提供适应度函数和解码函数，还有其它几个必要参数

# 目标函数，即适应度函数
# f(x) = x+10*sin(5*x)+7*cos(4*x)
def fitness_function(x):
    return x + 10 * np.sin(5 * x) + 7 * np.cos(4 * x)


# 解码函数
# 将二进制编码（基因型）解码为实数值（表现型）
def decode_function(chromosome):
    lower = 0.0
    upper = 9.0
    decimal_value = int(chromosome, 2)  # 将二进制编码转换为十进制值
    decoded_value = lower + decimal_value * (upper - lower) / (2 ** len(chromosome) - 1)
    return decoded_value


# 使用示例

solver = GASolver(
    fitness_function,
    decode_function,
    population_size=400,
    chromosome_size=17,
    generations=100,
    mutation_rate=0.1,
)

print("最优解:", solver.run())

运行结果如下

1	最优解: (7.856657841932998, 24.85536152100544)