Configuration

ESO is driven by a single JSON file. The path must end in .json. The schema is enforced by typed dataclasses in eso.utils.settings. Unknown fields raise a ValueError at load time.

Entry point

from eso import ESO
ESO(settings_path="settings/my_experiment.json").run()

Editable templates live in settings/ at the repository root. The default values shown below come from eso.utils.settings. The recommended values, where they differ, are the values used in the published experiments.

File structure

{
  "algorithm":          { ... },
  "population":         { ... },
  "selection_operator": { ... },
  "genetic_operator":   { ... },
  "gene":               { ... },
  "chromosome":         { ... },
  "model":              { ... },
  "cnn_architecture":   { ... },
  "data":               { ... },
  "preprocessing":      { ... }
}

Each section maps to one dataclass. Missing fields take their defaults.
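
The behaviour described above (defaults for missing fields, an error for unknown ones) can be sketched as follows. This is a minimal illustration, not the actual code in eso.utils.settings: the class names `PopulationSettings` and `SelectionSettings`, the `SECTIONS` registry, and `build_section` are hypothetical stand-ins.

```python
from dataclasses import dataclass, fields


@dataclass
class PopulationSettings:  # hypothetical stand-in for the real dataclass
    pop_size: int = 10


@dataclass
class SelectionSettings:  # hypothetical stand-in for the real dataclass
    tournament_size: int = 10


# Hypothetical registry: JSON section name -> dataclass.
SECTIONS = {
    "population": PopulationSettings,
    "selection_operator": SelectionSettings,
}


def build_section(name, values):
    """Build one settings section, rejecting fields the dataclass does not declare."""
    cls = SECTIONS[name]
    known = {f.name for f in fields(cls)}
    unknown = set(values) - known
    if unknown:
        raise ValueError(f"Unknown field(s) in '{name}': {sorted(unknown)}")
    # Fields absent from `values` fall back to the dataclass defaults.
    return cls(**values)


print(build_section("population", {}))
print(build_section("selection_operator", {"tournament_size": 5}))
```

Passing a misspelled key such as `"popsize"` to `build_section` raises a `ValueError` instead of being silently ignored.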

algorithm

Top-level GA control.

Field	Type	Default	Description
`max_generations`	`int`	`100`	Number of generations to evolve the population for. The paper uses `20`.

"algorithm": { "max_generations": 20 }

population

Field	Type	Default	Description
`pop_size`	`int`	`10`	Number of chromosomes per generation. Held constant across generations. The paper uses `300`.

"population": { "pop_size": 300 }

selection_operator

Field	Type	Default	Description
`tournament_size`	`int`	`10`	Number of chromosomes sampled for each tournament. The best of the sample becomes a parent.

"selection_operator": { "tournament_size": 5 }

See Parent selection.
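
To make the role of `tournament_size` concrete, here is a minimal sketch of tournament selection. It assumes sampling without replacement and a caller-supplied fitness function. The function name `tournament_select` and its signature are illustrative and are not taken from ESO.

```python
import random


def tournament_select(population, fitness, tournament_size, rng=random):
    """Draw `tournament_size` chromosomes and return the fittest one.

    Assumption: contenders are sampled without replacement.
    """
    contenders = rng.sample(population, tournament_size)
    return max(contenders, key=fitness)


# Toy usage: chromosomes are integers and the fitness is the value itself.
population = list(range(20))
parent = tournament_select(population, fitness=lambda c: c, tournament_size=5)
print(parent)
```

A larger `tournament_size` raises selection pressure: when it equals the population size, every tournament returns the single best chromosome.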

genetic_operator

Operator rates and mutation step sizes. See Operators.

Field	Type	Default	Description
`mutation_rate`	`float`	`0.1`	Probability of generating an offspring through mutation.
`crossover_rate`	`float`	`0.8`	Probability of generating offspring through crossover.
`reproduction_rate`	`float`	`0.1`	Probability of copying a parent unchanged into the next generation.
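`mutation_height_range`	`int`	`5`	Maximum change in gene height $h_k$ applied by a single mutation.
`mutation_position_range`	`int`	`20`	Maximum change in gene position $P_k$ applied by a single mutation.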

The three rates must sum to 1. The paper's reported configuration is 0.3 / 0.6 / 0.1 for mutation, crossover, and reproduction, respectively.

"genetic_operator": {
  "mutation_rate": 0.3,
  "crossover_rate": 0.6,
  "reproduction_rate": 0.1,
  "mutation_height_range": 5,
  "mutation_position_range": 20
}
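
The sum-to-1 constraint and the use of the rates as per-offspring probabilities can be sketched as below. The helper names `validate_rates` and `pick_operator` are hypothetical, and the check uses a tolerance because floating-point sums of decimal rates may not equal 1.0 exactly.

```python
import math
import random

OPERATORS = ["mutation", "crossover", "reproduction"]


def validate_rates(cfg):
    """Raise ValueError unless the three operator rates sum to 1 (within tolerance)."""
    total = sum(cfg[f"{name}_rate"] for name in OPERATORS)
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"Operator rates must sum to 1, got {total}")


def pick_operator(cfg, rng=random):
    """Choose the operator that produces the next offspring, weighted by its rate."""
    validate_rates(cfg)
    weights = [cfg[f"{name}_rate"] for name in OPERATORS]
    return rng.choices(OPERATORS, weights=weights, k=1)[0]


cfg = {"mutation_rate": 0.3, "crossover_rate": 0.6, "reproduction_rate": 0.1}
print(pick_operator(cfg))
```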

gene

Constraints on the position $P_k$ and height $h_k$ of every gene. See Gene.

Field	Type	Default	Description
`min_position`	`int`	`0`	Lower bound on $P_k$.
`max_position`	`int`	`-1`	Upper bound on $P_k$. `-1` defaults to the spectrogram height $S_h$.
`min_height`	`int`	`4`	Lower bound on $h_k$.
`max_height`	`int`	`16`	Upper bound on $h_k$.
`band_position`	`int | null`	`null`	Fix $P_k$ to a single value for every gene. Use `null` or `-1` to disable.
`band_height`	`int | null`	`null`	Fix $h_k$ to a single value for every gene. Use `null` or `-1` to disable.
`spec_height`	`int | null`	`null`	Spectrogram height $S_h$. Filled in automatically from preprocessing.
`minimum_gene_height`	`int | null`	`null`	Minimum legal height given the convolution stack. Computed automatically.

"gene": {
  "min_position": 0,
  "max_position": -1,
  "min_height": 1,
  "max_height": 16
}
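
The interaction between the bounds, the `-1` sentinel, and the fixed-band overrides can be sketched as follows. This is an illustration under stated assumptions, not ESO's sampling code: it assumes a gene occupies rows $P_k$ to $P_k + h_k - 1$ and must fit below the resolved upper bound, and the function names are hypothetical.

```python
import random


def resolve_max_position(max_position, spec_height):
    """Treat -1 as 'use the spectrogram height S_h'."""
    return spec_height if max_position == -1 else max_position


def sample_gene(cfg, spec_height, rng=random):
    """Draw one (position, height) pair that respects the gene settings."""
    band_h = cfg.get("band_height")
    band_p = cfg.get("band_position")
    # A fixed band height/position overrides the sampled value; null or -1 disables it.
    if band_h not in (None, -1):
        height = band_h
    else:
        height = rng.randint(cfg.get("min_height", 4), cfg.get("max_height", 16))
    if band_p not in (None, -1):
        return band_p, height
    low = cfg.get("min_position", 0)
    high = resolve_max_position(cfg.get("max_position", -1), spec_height)
    # Assumption: the whole band [position, position + height) stays below `high`.
    return rng.randint(low, high - height), height


gene_cfg = {"min_position": 0, "max_position": -1, "min_height": 1, "max_height": 16}
print(sample_gene(gene_cfg, spec_height=128))
```

If the bounds leave no room for a band of the drawn height, `random.randint` raises a `ValueError`, which surfaces an inconsistent configuration early.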

chromosome

Field	Type	Default	Description
`num_genes`	`int | null`	`null`	Fix the number of genes per chromosome. Use `null` or `-1` to draw from `[min_num_genes, max_num_genes]`.
`min_num_genes`	`int`	`3`	Lower bound on the number of genes when `num_genes` is disabled.
`max_num_genes`	`int`	`10`	Upper bound on the number of genes when `num_genes` is disabled.
`lambda_1`	`float`	`0.5`	Weight of the F1 term in the fitness function.
`lambda_2`	`float`	`0.5`	Weight of the parameter term in the fitness function.
`stack`	`bool`	`false`	If `true`, extracted bands are stacked along a depth axis and all heights must be equal. If `false`, bands are concatenated along the frequency axis.
`baseline_parameters`	`float | null`	`null`	Filled in automatically from the trained baseline.
`baseline_metric`	`int | null`	`null`	Filled in automatically from the trained baseline.

See Fitness for the equation. Paper values: $\lambda_1 = 0.95, \lambda_2 = 0.05$, except for Hainan gibbon ($\lambda_1 = 0.99, \lambda_2 = 0.01$).

"chromosome": {
  "min_num_genes": 1,
  "max_num_genes": 10,
  "lambda_1": 0.95,
  "lambda_2": 0.05,
  "stack": false
}
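
The difference between the two `stack` modes can be illustrated with plain Python lists. This sketch assumes the spectrogram is a list of frequency rows and that a gene $(P_k, h_k)$ selects rows $P_k$ to $P_k + h_k - 1$. The function `extract_bands` is hypothetical and is not ESO's implementation.

```python
def extract_bands(spectrogram, genes, stack):
    """Cut one band per gene and combine them.

    spectrogram: list of rows (frequency bins), each a list of time frames.
    genes: list of (position, height) pairs.
    """
    bands = [spectrogram[pos:pos + height] for pos, height in genes]
    if stack:
        # Stacking along a depth axis requires every band to have the same height.
        if len({len(band) for band in bands}) != 1:
            raise ValueError("stack=True requires all gene heights to be equal")
        return bands  # shape: (num_genes, height, time)
    # Concatenate along the frequency axis: shape (sum of heights, time).
    return [row for band in bands for row in band]


# Toy spectrogram: 10 frequency rows, 3 time frames, each row filled with its index.
spec = [[row] * 3 for row in range(10)]
print(len(extract_bands(spec, [(0, 2), (5, 3)], stack=False)))
print(len(extract_bands(spec, [(0, 2), (5, 2)], stack=True)))
```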

model

Training hyperparameters used by both the baseline and per-chromosome CNNs.

Field	Type	Default	Description
`optimizer_name`	`str`	`"adam"`	Optimiser. Currently only `adam` is supported.
`loss_function_name`	`str`	`"cross_entropy"`	Loss function.
`num_epochs`	`int`	`1`	Training epochs per CNN. The paper uses `30`.
`batch_size`	`int`	`128`	Mini-batch size. The paper uses `64`.
`learning_rate`	`float`	`0.001`	Learning rate for the optimiser.
`shuffle`	`bool`	`true`	Whether to shuffle batches during training.
`metric`	`str`	`"f1"`	Validation metric used as the F1 term in fitness. Supported values: `f1`, `accuracy`.

"model": {
  "optimizer_name": "adam",
  "loss_function_name": "cross_entropy",
  "num_epochs": 30,
  "batch_size": 64,
  "learning_rate": 0.001,
  "metric": "f1"
}

cnn_architecture

CNN topology shared by baseline and per-chromosome models.

Field	Type	Default	Description
`conv_layers`	`int`	`1`	Number of `Conv2d` blocks.
`conv_filters`	`int`	`8`	Filters per convolutional layer.
`conv_kernel`	`int`	`8`	Kernel size of each convolutional filter.
`conv_padding`	`str | null`	`null`	Padding strategy. `null` defaults to no padding.
`max_pooling_size`	`int`	`4`	Window size of `MaxPool2d`.
`stride_maxpool`	`int | null`	`null`	Stride for `MaxPool2d`. `null` matches `max_pooling_size`.
`fc_layers`	`int`	`2`	Number of fully-connected layers before the output.
`fc_units`	`int`	`32`	Units per fully-connected layer.
`dropout_rate`	`float`	`0.5`	Dropout applied after each fully-connected layer.

The paper's baseline corresponds to conv_layers = 1, conv_filters = 8, conv_kernel = 8, max_pooling_size = 4, fc_layers = 2, fc_units = 32.

"cnn_architecture": {
  "conv_layers": 1,
  "conv_filters": 8,
  "conv_kernel": 8,
  "max_pooling_size": 4,
  "fc_layers": 2,
  "fc_units": 32,
  "dropout_rate": 0.5
}
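
These settings also bound how short an input band can be, which is what `minimum_gene_height` in the gene section captures. The sketch below shows one plausible way to reason about it, using the standard output-size arithmetic for an unpadded, stride-1 convolution followed by max pooling. It assumes each block is `Conv2d` then `MaxPool2d` with dilation 1. ESO's own computation may differ, and both function names are hypothetical.

```python
def output_height(height, conv_layers=1, conv_kernel=8,
                  max_pooling_size=4, stride_maxpool=None):
    """Feature-map height after the conv/pool blocks, or 0 if the input is too short."""
    stride = max_pooling_size if stride_maxpool is None else stride_maxpool
    for _ in range(conv_layers):
        # Unpadded, stride-1 convolution shrinks the height by (kernel - 1).
        height = height - conv_kernel + 1
        if height < max_pooling_size:
            return 0  # the pooling window no longer fits
        height = (height - max_pooling_size) // stride + 1
    return height


def min_input_height(**arch):
    """Smallest input height that still yields at least one output row."""
    height = 1
    while output_height(height, **arch) < 1:
        height += 1
    return height


print(output_height(128))                 # full 128-band spectrogram
print(min_input_height())                 # one conv block
print(min_input_height(conv_layers=2))    # two conv blocks
```

Under these assumptions and the default architecture, a 128-row input yields 30 rows and the smallest usable input height is 11. Adding a second block raises that minimum to 51.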

data

Field	Type	Default	Description
`species_folder`	`str`	`""`	Absolute path to the species' dataset directory.
`positive_class`	`str`	`""`	Folder or label of the positive class.
`negative_class`	`str`	`""`	Folder or label of the negative class.
`train_size`	`float`	`0.8`	Fraction of files in the training split.
`test_size`	`float`	`0.2`	Fraction of files in the test split. Validation gets the remainder.
`reshuffle`	`bool`	`false`	If `true`, re-randomises file assignment on every run.
`keep_in_memory`	`bool`	`false`	If `true`, holds spectrograms in RAM. Faster but memory-bound.
`force_recreate_dataset`	`bool`	`false`	If `true`, regenerates the cached dataset from audio.

"data": {
  "species_folder": "/data/gibbons",
  "positive_class": "gibbon",
  "negative_class": "no-gibbon",
  "train_size": 0.6,
  "test_size": 0.2
}
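
Because validation receives whatever `train_size` and `test_size` leave over, it helps to check the implied fraction before a run. The helper below is a hypothetical sketch, not part of ESO. It rounds the remainder to absorb floating-point error.

```python
def split_fractions(train_size=0.8, test_size=0.2):
    """Return the train/validation/test fractions implied by the data settings."""
    # Rounding absorbs tiny floating-point residue such as 1 - 0.8 - 0.2.
    validation = round(1.0 - train_size - test_size, 10)
    if train_size <= 0 or test_size < 0 or validation < 0:
        raise ValueError("train_size and test_size must be non-negative and sum to at most 1")
    return {"train": train_size, "validation": validation + 0.0, "test": test_size}


print(split_fractions(0.6, 0.2))  # the example above leaves 0.2 for validation
print(split_fractions())          # the defaults leave nothing for validation
```

With the defaults (`0.8` and `0.2`) the validation fraction is zero, so a configuration that relies on a validation metric should lower `train_size`, as the example above does.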

preprocessing

Mel-spectrogram generation. See Spectrogram preprocessing.

Field	Type	Default	Description
`sample_rate`	`int`	`32000`	Original recording sample rate.
`lowpass_cutoff`	`int`	`2000`	Cut-off frequency of the low-pass filter applied to the baseline dataset.
`downsample_rate`	`int`	`4800`	Downsample target for the baseline dataset. Set to twice `nyquist_rate`.
`nyquist_rate`	`int`	`2400`	Maximum frequency in the target species' calls.
`segment_duration`	`int`	`4`	Fixed window length in seconds for segmentation.
`nb_negative_class`	`int`	`20`	Number of negative segments to extract per audio file.
`file_type`	`str`	`"svl"`	Annotation format. `svl` or compatible XML.
`audio_extension`	`str`	`".wav"`	File extension of audio recordings.
`n_fft`	`int`	`1024`	Hann window size in samples for the STFT.
`hop_length`	`int`	`256`	Stride between consecutive STFT frames in samples.
`n_mels`	`int`	`128`	Number of mel bands. Sets the spectrogram height $S_h$.
`f_min`	`int`	`4000`	Minimum frequency for the mel filter bank.
`f_max`	`int`	`9000`	Maximum frequency for the mel filter bank.

Per-species values from the paper are listed in Spectrogram preprocessing.

"preprocessing": {
  "sample_rate": 9600,
  "lowpass_cutoff": 2000,
  "downsample_rate": 4800,
  "nyquist_rate": 2400,
  "segment_duration": 4,
  "n_fft": 1024,
  "hop_length": 256,
  "n_mels": 128,
  "f_min": 0,
  "f_max": 2000,
  "file_type": "svl",
  "audio_extension": ".wav"
}
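
As a rough check on what these values imply, the sketch below estimates the spectrogram shape for one segment and verifies the relation between `downsample_rate` and `nyquist_rate`. It assumes the STFT is computed on audio at `downsample_rate` and is centred (the default in common audio libraries), so the frame count is `1 + n_samples // hop_length`. Both function names are hypothetical.

```python
def check_downsample_rate(cfg):
    """downsample_rate is documented as twice nyquist_rate; fail loudly otherwise."""
    if cfg["downsample_rate"] != 2 * cfg["nyquist_rate"]:
        raise ValueError("downsample_rate should be twice nyquist_rate")


def spectrogram_shape(cfg):
    """Estimate (height, width) of one segment's mel-spectrogram.

    Assumptions: audio at downsample_rate, centred STFT.
    """
    n_samples = cfg["segment_duration"] * cfg["downsample_rate"]
    n_frames = 1 + n_samples // cfg["hop_length"]
    return cfg["n_mels"], n_frames


prep = {
    "downsample_rate": 4800,
    "nyquist_rate": 2400,
    "segment_duration": 4,
    "hop_length": 256,
    "n_mels": 128,
}
check_downsample_rate(prep)
print(spectrogram_shape(prep))  # (128, 76) under the assumptions above
```

The first element of the shape is the spectrogram height $S_h$ used to resolve `max_position = -1` in the gene section.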

Full example: Hainan gibbon

Below is a runnable configuration that mirrors the published experiment for the Hainan gibbon dataset.

{
  "algorithm": { "max_generations": 20 },
  "population": { "pop_size": 300 },
  "selection_operator": { "tournament_size": 5 },
  "genetic_operator": {
    "mutation_rate": 0.3,
    "crossover_rate": 0.6,
    "reproduction_rate": 0.1
  },
  "gene": {
    "min_position": 0,
    "max_position": -1,
    "min_height": 1,
    "max_height": 16
  },
  "chromosome": {
    "min_num_genes": 1,
    "max_num_genes": 10,
    "lambda_1": 0.99,
    "lambda_2": 0.01,
    "stack": false
  },
  "model": {
    "optimizer_name": "adam",
    "loss_function_name": "cross_entropy",
    "num_epochs": 30,
    "batch_size": 64,
    "learning_rate": 0.001,
    "metric": "f1"
  },
  "cnn_architecture": {
    "conv_layers": 1,
    "conv_filters": 8,
    "conv_kernel": 8,
    "max_pooling_size": 4,
    "fc_layers": 2,
    "fc_units": 32,
    "dropout_rate": 0.5
  },
  "data": {
    "species_folder": "/data/Hainan_gibbon",
    "positive_class": "gibbon",
    "negative_class": "no-gibbon",
    "train_size": 0.6,
    "test_size": 0.2
  },
  "preprocessing": {
    "sample_rate": 9600,
    "lowpass_cutoff": 2000,
    "downsample_rate": 4800,
    "nyquist_rate": 2400,
    "segment_duration": 4,
    "n_fft": 1024,
    "hop_length": 256,
    "n_mels": 128,
    "f_min": 0,
    "f_max": 5000,
    "file_type": "svl",
    "audio_extension": ".wav"
  }
}

Auto-generated reference

For the raw dataclass definitions, see eso.utils.settings.
