Autoscale population and number of generations

I often find it challenging to determine the optimal population size and number of generations for genetic algorithms. Previously, I relied on a fixed number of generations (typically 64) and scaled the population size linearly with the number of variables (e.g., 64 per variable). However, this approach sometimes resulted in insufficient sampling in systems with many objectives (>5). Once niches in the objective space become underpopulated early in the process, later generations tend to refine existing niches rather than explore new regions with potentially sharp (local) optima, leading to weak exploration of the objective space. Hence, given a fixed computational budget, it is often preferable to allocate resources toward a larger population size rather than more generations, as this promotes better exploration.

To address this, I’ve developed a heuristic that introduces a complexity term, defined as \sqrt{n_{\text{var}} \times n_{\text{obj}}}, to scale both the population size and the number of generations. This approach aims to balance exploration and computational efficiency more effectively.
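For intuition, the complexity term grows sub-linearly in each dimension, so doubling both the variable and objective counts only doubles the scale factor. A quick check using only the standard library:

```python
import math

# Complexity term: sqrt(n_var * n_obj), sub-linear in each dimension.
for n_var, n_obj in [(2, 1), (2, 6), (6, 6)]:
    print(n_var, n_obj, round(math.sqrt(n_var * n_obj), 2))
# -> 2 1 1.41
#    2 6 3.46
#    6 6 6.0
```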

Optionally, an eval_budget can be specified to limit the total number of evaluations. In this case, the population size is still scaled according to the heuristic, but the number of generations is determined by the budget and the population size.
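As a minimal sketch of the budget-limited path, inlining the base constants from the function below (the 8192 budget is just an example value):

```python
import math

eval_budget = 8192                         # example budget, chosen arbitrarily
n_var, n_obj = 6, 6
complexity = math.sqrt(n_var * n_obj)      # 6.0
pop_size = max(64, int(32 * complexity))   # heuristic -> 192
n_gen = max(10, eval_budget // pop_size)   # budget-driven -> 42
print(pop_size, n_gen, pop_size * n_gen)   # 192 42 8064 (stays <= budget)
```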

Note that, in addition to the population size, this also sets the number of reference directions required by NSGA-like algorithms (which we mostly use).
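The reference-direction count computed in the function below is the Das-Dennis simplex-lattice size, C(M + p - 1, p) for M objectives and p divisions per axis. With p = 4 (the default base_ref_div), that gives:

```python
import math

p = 4  # divisions per objective axis (base_ref_div)
for m in [1, 6, 9]:
    # Das-Dennis reference points on the M-simplex with p divisions
    print(m, math.comb(m + p - 1, p))
# -> 1 1
#    6 126
#    9 495
```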

import math


def population_and_generations(
    n_variables: int,
    n_objectives: int,
    base_ref_div: int = 4,
    ref_cap: int = 512,
    base_pop: int = 32,
    base_gen: int = 16,
    eval_budget: int | None = None,
    min_pop_size: int = 64,
) -> tuple[int, int, int]:
    """
    Determine population size and number of generations for NSGA-III / U-NSGA-III.

    Scales both population size and generations with problem complexity
    sqrt(M × D), using controlled growth factors to avoid rapid increases.

    Parameters
    ----------
    n_variables : int
        Number of decision variables.
    n_objectives : int
        Number of objectives.
    base_ref_div : int, optional
        Reference direction divisions for small M (default: 4).
    ref_cap : int, optional
        Maximum number of reference directions (default: 512).
    base_pop : int, optional
        Base population size (default: 32).
    base_gen : int, optional
        Base number of generations (default: 16).
    eval_budget : int or None, optional
        Total function evaluations. If None, use heuristic budget.
    min_pop_size : int, optional
        Minimum population size (default: 64).

    Returns
    -------
    pop_size : int
        Recommended population size.
    n_gen : int
        Recommended number of generations.
    n_ref : int
        Effective number of reference directions.
    """
    # --- Reference directions (Das-Dennis count, capped) ---
    # math.comb returns an exact int and does not raise OverflowError,
    # so a simple cap suffices.
    n_ref = min(math.comb(n_objectives + base_ref_div - 1, base_ref_div), ref_cap)

    # --- Problem complexity ---
    complexity = (n_objectives * n_variables)**0.5

    # --- Population size (scaled by complexity) ---
    pop_size = max(min_pop_size, int(base_pop * complexity))

    # --- Generations (scaled by complexity) ---
    if eval_budget is None:
        eval_budget = pop_size * int(base_gen * complexity)
    n_gen = max(10, eval_budget // pop_size)

    return pop_size, n_gen, n_ref

For example:

cases = [(2, 1), (2, 6), (2, 9), (4, 6), (6, 6)]
for (n_var, n_obj) in cases:
    print(f"{n_var=}, {n_obj=}")
    pop_size, n_max_gen, n_ref = population_and_generations(
        n_var,
        n_obj,
        # eval_budget=8192
    )
    print(
        f"{pop_size=}, "
        f"{n_max_gen=}, "
        f"{n_ref=}, "
        f"n_total_evaluations={pop_size*n_max_gen}"
    )

Leads to:

n_var=2, n_obj=1
pop_size=64, n_max_gen=22, n_ref=1, n_total_evaluations=1408
n_var=2, n_obj=6
pop_size=110, n_max_gen=55, n_ref=126, n_total_evaluations=6050
n_var=2, n_obj=9
pop_size=135, n_max_gen=67, n_ref=495, n_total_evaluations=9045
n_var=4, n_obj=6
pop_size=156, n_max_gen=78, n_ref=126, n_total_evaluations=12168
n_var=6, n_obj=6
pop_size=192, n_max_gen=96, n_ref=126, n_total_evaluations=18432

Let me know if you have any thoughts on this. :nerd_face:


Neat, I think it’s really clever to do it that way!

In terms of runtime, do you feel that broader exploration through a larger population size and fewer generations in these complex cases can actually save compute time, or is the goal more to find a robust solution here?