I often find it challenging to determine the optimal population size and number of generations for genetic algorithms. Previously, I relied on a fixed number of generations (typically 64) and scaled the population size linearly with the number of variables (e.g., 64 per variable). However, this approach sometimes resulted in insufficient sampling for problems with many objectives (>5). Once niches in the objective space become underpopulated early in the run, later generations tend to refine existing niches rather than explore new regions with potentially sharp (local) optima, leading to weak exploration of the objective space. Hence, given a fixed computational budget, it is often preferable to allocate resources toward a larger population rather than more generations, as this promotes better exploration.
To address this, I’ve developed a heuristic that introduces a complexity term, defined as \sqrt{n_{\text{var}} \times n_{\text{obj}}}, to scale both the population size and the number of generations. This approach aims to balance exploration and computational efficiency more effectively.
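To give a feel for the scaling, here is a quick illustration (not the full heuristic, just the complexity term applied to the base population of 32 and base generation count of 16 used below):

```python
# Illustration of the complexity term sqrt(n_var * n_obj) and how it scales
# a base population of 32 and a base generation count of 16 (the defaults
# used in the heuristic below).
for n_var, n_obj in [(2, 1), (2, 6), (6, 6)]:
    c = (n_var * n_obj) ** 0.5
    print(n_var, n_obj, round(c, 2), int(32 * c), int(16 * c))
```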
Optionally, an `eval_budget` can be specified to cap the total number of evaluations. In that case, the population size is still scaled according to the heuristic, but the number of generations is derived from the budget divided by the population size.
Note that, in addition to the population size, this also scales the number of reference directions required by NSGA-like algorithms (which we mostly use).
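For context, the Das–Dennis construction (the standard way reference directions are generated, e.g., in pymoo) yields C(M + p − 1, p) directions for M objectives and p divisions, which grows combinatorially in M. A short sketch of that growth for p = 4:

```python
import math

# Das-Dennis reference-direction count C(M + p - 1, p) for p = 4 divisions.
# It grows combinatorially with the number of objectives M, which is why the
# heuristic below caps it at ref_cap.
for m in (2, 4, 6, 9, 12):
    print(m, math.comb(m + 4 - 1, 4))
```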
```python
import math


def population_and_generations(
    n_variables: int,
    n_objectives: int,
    base_ref_div: int = 4,
    ref_cap: int = 512,
    base_pop: int = 32,
    base_gen: int = 16,
    eval_budget: int | None = None,
    min_pop_size: int = 64,
) -> tuple[int, int, int]:
    """
    Determine population size and number of generations for NSGA-III / U-NSGA-III.

    Scales both population size and generations with problem complexity
    (sqrt(M × D)), using controlled growth factors to avoid rapid increases.

    Parameters
    ----------
    n_variables : int
        Number of decision variables.
    n_objectives : int
        Number of objectives.
    base_ref_div : int, optional
        Reference direction divisions for small M (default: 4).
    ref_cap : int, optional
        Maximum number of reference directions (default: 512).
    base_pop : int, optional
        Base population size (default: 32).
    base_gen : int, optional
        Base number of generations (default: 16).
    eval_budget : int or None, optional
        Total function evaluations. If None, use the heuristic budget.
    min_pop_size : int, optional
        Minimum population size (default: 64).

    Returns
    -------
    pop_size : int
        Recommended population size.
    n_gen : int
        Recommended number of generations.
    n_ref : int
        Effective number of reference directions.
    """
    # --- Reference directions: Das-Dennis count C(M + p - 1, p), capped ---
    n_ref = min(math.comb(n_objectives + base_ref_div - 1, base_ref_div), ref_cap)

    # --- Problem complexity ---
    complexity = (n_objectives * n_variables) ** 0.5

    # --- Population size (scaled by complexity) ---
    pop_size = max(min_pop_size, int(base_pop * complexity))

    # --- Generations (scaled by complexity, or derived from the budget) ---
    if eval_budget is None:
        eval_budget = pop_size * int(base_gen * complexity)
    n_gen = max(10, eval_budget // pop_size)

    return pop_size, n_gen, n_ref
```
For example:

```python
cases = [(2, 1), (2, 6), (2, 9), (4, 6), (6, 6)]
for n_var, n_obj in cases:
    print(f"{n_var=}, {n_obj=}")
    pop_size, n_max_gen, n_ref = population_and_generations(
        n_var,
        n_obj,
        # eval_budget=8192,
    )
    print(
        f"{pop_size=}, "
        f"{n_max_gen=}, "
        f"{n_ref=}, "
        f"n_total_evaluations={pop_size * n_max_gen}"
    )
```
Leads to:

```
n_var=2, n_obj=1
pop_size=64, n_max_gen=22, n_ref=1, n_total_evaluations=1408
n_var=2, n_obj=6
pop_size=110, n_max_gen=55, n_ref=126, n_total_evaluations=6050
n_var=2, n_obj=9
pop_size=135, n_max_gen=67, n_ref=495, n_total_evaluations=9045
n_var=4, n_obj=6
pop_size=156, n_max_gen=78, n_ref=126, n_total_evaluations=12168
n_var=6, n_obj=6
pop_size=192, n_max_gen=96, n_ref=126, n_total_evaluations=18432
```
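With an `eval_budget` supplied, the generation count is derived from the budget instead. A minimal sketch of the same arithmetic, inlined for the `(n_var=4, n_obj=6)` case with the commented-out `eval_budget=8192` (assuming the default base values above):

```python
import math

# Same arithmetic as population_and_generations, inlined for one case,
# with eval_budget capping the total number of evaluations.
n_var, n_obj, eval_budget = 4, 6, 8192
complexity = (n_obj * n_var) ** 0.5
pop_size = max(64, int(32 * complexity))         # unchanged by the budget: 156
n_gen = max(10, eval_budget // pop_size)         # 8192 // 156 = 52 (was 78)
n_ref = min(math.comb(n_obj + 4 - 1, 4), 512)    # 126
print(pop_size, n_gen, n_ref, pop_size * n_gen)  # 156 52 126 8112
```

So the budget caps the run at 8112 evaluations (down from 12168) without shrinking the population, preserving the exploration argument made above.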
Let me know if you have any thoughts on this.