Utilities¶

Preprocessing, annotation parsing, settings schema, evaluation, logging, and CPU-safe unpickling.

Symbol	File	Role
`Preprocessing`	`eso/utils/preprocessing.py`	Audio loading, optional filtering, mel-spectrogram generation.
`AnnotationReader`	`eso/utils/AnnotationReader.py`	Parse SVL or compatible XML annotation files.
`Config` and friends	`eso/utils/settings.py`	Typed configuration schema. One dataclass per section of the JSON.
`Evaluation`	`eso/utils/Evaluation.py`	Sliding-window inference, bout reconstruction, comparison metrics.
`plot_chromosome` · `setup_logger` · `log_tensorboard`	`eso/utils/logger.py`	Visualisation and logging helpers.
`CPU_Unpickler`	`eso/utils/unpickler.py`	Unpickle GPU-trained tensors onto CPU.

`eso.utils.preprocessing`¶

The audio-to-spectrogram pipeline. The class produces two datasets per species: a preprocessed one (low-pass filtered and downsampled, used to train the baseline) and an unprocessed one (used by ESO). Audio is segmented into fixed-length windows with a one-second overlap. Each segment is converted to a mel-spectrogram with a Hann window and a configurable hop length. Class balancing through time shifting, blending, and additive noise is also handled here.
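The windowing step can be sketched in a few lines. This is only an illustration, not the class's own code: the helper name `segment_bounds` is hypothetical, and it assumes that "one-second overlap" means consecutive windows share their last and first second, so the hop is `segment_duration - 1` seconds.

```python
def segment_bounds(n_samples, sample_rate, segment_duration, overlap_seconds=1):
    """Return (start, end) sample indices of fixed-length, overlapping windows.

    Hypothetical helper: assumes the hop between windows is
    (segment_duration - overlap_seconds) seconds.
    """
    window = segment_duration * sample_rate
    hop = (segment_duration - overlap_seconds) * sample_rate
    if hop <= 0:
        raise ValueError("overlap must be shorter than the segment")

    bounds = []
    start = 0
    # Keep only complete windows; a trailing partial window is dropped
    while start + window <= n_samples:
        bounds.append((start, start + window))
        start += hop
    return bounds


# 10 s of audio at 1 kHz with 4 s windows: windows start at 0 s, 3 s and 6 s
print(segment_bounds(10_000, 1_000, 4))
```

Dropping the trailing partial window is a choice made for this sketch; the actual class may pad or handle the remainder differently.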

AnnotationReader ¶

AnnotationReader(
    path: str,
    annotation_file_name: str,
    file_type: str,
    audio_extension: str,
    positive_class: str,
)

Source code in eso/utils/AnnotationReader.py

def __init__(
    self,
    path: str,
    annotation_file_name: str,
    file_type: str,
    audio_extension: str,
    positive_class: str,
):
    """
    Initializes the AnnotationReader class.

    Parameters
    ----------
    path : str
        The path to the directory containing the annotation and audio files.
    annotation_file_name : str
        The name of the annotation file (without extension) to be read.
    file_type : str
        The type of annotation file (e.g., "svl", "xml").
    audio_extension : str
        The file extension for the associated audio files (e.g., ".wav", ".mp3").
    positive_class : str
        The label representing the positive class in classification tasks.

    Returns
    -------
    None
    """
    self.path = path
    self.annotation_file_name = annotation_file_name
    self.file_type = file_type
    self.audio_extension = audio_extension
    self.positive_class = positive_class

path `instance-attribute` ¶

path = path

annotation_file_name `instance-attribute` ¶

annotation_file_name = annotation_file_name

file_type `instance-attribute` ¶

file_type = file_type

audio_extension `instance-attribute` ¶

audio_extension = audio_extension

positive_class `instance-attribute` ¶

positive_class = positive_class

Initializes the AnnotationReader class.

Parameters:

Name	Type	Description	Default
`path`	`str`	The path to the directory containing the annotation and audio files.	required
`annotation_file_name`	`str`	The name of the annotation file (without extension) to be read.	required
`file_type`	`str`	The type of annotation file (e.g., "svl", "xml").	required
`audio_extension`	`str`	The file extension for the associated audio files (e.g., ".wav", ".mp3").	required
`positive_class`	`str`	The label representing the positive class in classification tasks.	required

Returns:

Type	Description
`None`

get_annotation_information ¶

get_annotation_information(annotation_folder, sufix_file)

Extract annotation information from an .svl XML file and return a DataFrame with start times, end times, and labels for the annotations.

This method parses an XML annotation file (.svl format) to extract annotation details including the start time, end time, and label for each annotation. It processes the XML file, handles any confidence values, and adjusts labels accordingly (e.g., using the positive class label for predicted annotations).

Parameters:

Name	Type	Description	Default
`annotation_folder`	`str`	The folder where the annotation file is located.	required
`sufix_file`	`str`	The suffix to append to the base annotation file name to get the full file name.	required

Returns:

Type	Description
`tuple`	A tuple `(DataFrame, str)`. The `pd.DataFrame` has three columns: 'Start' (the start time of the annotation in seconds), 'End' (the end time of the annotation in seconds), and 'Label' (the label associated with the annotation). The `str` is the name of the corresponding audio file (with ".wav" extension).

Raises:

Type	Description
`Exception`	If the annotation file does not contain valid annotation information.
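The parsing logic described above can be tried on an in-memory document. The following is a minimal sketch rather than the method itself: the XML snippet, the label `gibbon`, and the helpers `split_label` and `read_points` are made up for illustration, and the sketch skips low-confidence points instead of reproducing every branch of the real code.

```python
from xml.dom import minidom

# Made-up annotation content, shaped like the <model>/<point> elements of an .svl file
SVL = """<sv><data>
  <model id="10" sampleRate="4800" start="0" end="48000" />
  <dataset id="9" dimensions="2">
    <point frame="9600" value="700" duration="4800" extent="300" label="gibbon" />
    <point frame="24000" value="700" duration="2400" extent="300" label="gibbon,5" />
    <point frame="36000" value="700" duration="4800" extent="300" label="predicted" />
  </dataset>
</data></sv>"""


def split_label(raw):
    """Split 'LABEL,CONFIDENCE'; labels without a comma get confidence 10."""
    if "," in raw:
        name, confidence = raw.split(",", 1)
        return name, int(confidence)
    return raw, 10


def read_points(xml_text, positive_class):
    """Return (start_s, end_s, label) for every point with confidence 10."""
    doc = minidom.parseString(xml_text)
    model = doc.getElementsByTagName("model")[0]
    rate = int(model.attributes["sampleRate"].value)

    rows = []
    for point in doc.getElementsByTagName("point"):
        label, confidence = split_label(point.attributes["label"].value)
        if label == "predicted":
            # Model predictions are mapped onto the positive class
            label = positive_class
        if confidence != 10:
            continue
        # Frames and durations are stored in samples; divide by the rate for seconds
        start = float(point.attributes["frame"].value) / rate
        duration = float(point.attributes["duration"].value) / rate
        rows.append((start, start + duration, label))
    return rows


rows = read_points(SVL, positive_class="gibbon")
print(rows)
```

The second point carries confidence 5 and is therefore left out, while the `predicted` point is relabelled with the positive class.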

Source code in eso/utils/AnnotationReader.py

def get_annotation_information(self, annotation_folder, sufix_file ):
    """
    Extract annotation information from an `.svl` XML file and return a DataFrame
    with start times, end times, and labels for the annotations.

    This method parses an XML annotation file (`.svl` format) to extract annotation
    details including the start time, end time, and label for each annotation.
    It processes the XML file, handles any confidence values, and adjusts labels
    accordingly (e.g., using the positive class label for predicted annotations).

    Parameters
    ----------
    annotation_folder : str
        The folder where the annotation file is located.
    sufix_file : str
        The suffix to append to the base annotation file name to get the full file name.

    Returns
    -------
    tuple
        A tuple containing:
        - pd.DataFrame: A DataFrame with three columns:
            - 'Start': The start time of the annotation in seconds.
            - 'End': The end time of the annotation in seconds.
            - 'Label': The label associated with the annotation.
        - str: The name of the corresponding audio file (with ".wav" extension).

    Raises
    ------
    Exception
        If the annotation file does not contain valid annotation information.
    """

    path = str(Path(
            self.path, annotation_folder, self.annotation_file_name + sufix_file
        ))


    xmldoc = minidom.parse(path)
    itemlist = xmldoc.getElementsByTagName("point")
    idlist = xmldoc.getElementsByTagName("model")

    start_time = []
    end_time = []
    labels = []
    audio_file_name = ""

    if len(idlist) > 0:
        for s in idlist: 
            original_sample_rate = int(s.attributes["sampleRate"].value)


    if len(itemlist) > 0:

        # Iterate over each annotation in the .svl file (annotation file)
        for s in itemlist:
            # Convert the starting frame from the annotation file to seconds
            # using the sample rate read from the <model> element
            start_seconds = (
                    float(s.attributes["frame"].value) / original_sample_rate
                )

            # Get the label from the annotation file
            label = str(s.attributes["label"].value)

            # Set the default confidence to 10 (i.e. high confidence that
            # the label is correct). Annotations that carry no confidence
            # value are treated like normal annotations and it is
            # assumed that the annotation is correct (by the annotator).
            label_confidence = 10

            # Check if a confidence has been assigned
            if "," in label:
                # Extract the raw label
                label_string = label[: label.find(",")]

                # Extract confidence value
                label_confidence = int(label[label.find(",") + 1 :])

                # Set the label to the raw label
                label = label_string

            # A blank label stops processing to avoid mislabelling data.
            # Note that `break` exits the loop, so any annotations that
            # follow the blank one are ignored as well.
            if label == "":
                break


            #to include predictions obtained from a model
            if label == "predicted" :
                label=self.positive_class

            # Only keep annotations whose labels are very confident
            # (10 = very confident, 5 = medium, 1 = unsure). The confidence
            # follows the label after a comma, e.g. "SPECIES,10" or "SPECIES,5".
            if label_confidence == 10:
                # Get the duration from the annotation file
                annotation_duration_seconds = (
                        float(s.attributes["duration"].value) / original_sample_rate
                    )
                start_time.append(start_seconds)
                end_time.append(start_seconds + annotation_duration_seconds)
                labels.append(label)

    df_svl_gibbons = pd.DataFrame(
            {"Start": start_time, "End": end_time, "Label": labels}
        )
    return df_svl_gibbons, self.annotation_file_name + ".wav"

get_annotation_information_testing ¶

get_annotation_information_testing()

Extract annotation information from a .svl XML file and return a DataFrame with frame, value, duration, extent, and label for each annotation.

This method parses an XML annotation file (.svl format) to extract detailed annotation information such as frame number, value, duration, extent, and label. It also extracts the sample rate, start time, and end time from the file's metadata.

Parameters:

None. The file location is built from `self.path`, the "Annotations" folder, and `self.annotation_file_name` with the ".svl" extension.

Returns:

Type	Description
`tuple`	A tuple `(DataFrame, sample rate, start, end)`. The `pd.DataFrame` has the columns 'frame' (the frame number from the annotation), 'value' (the value associated with the annotation), 'duration' (the duration of the annotation), 'extent' (the extent of the annotation), and 'label' (the label associated with the annotation). The remaining three items are the `sampleRate`, `start`, and `end` attributes of the `<model>` element in the .svl file; the source listing returns them as the raw attribute strings without converting them to numbers.

Raises:

Type	Description
`Exception`	If the annotation file is not found or if it does not contain valid annotation information.
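Because the metadata comes straight from XML attributes, it may help to see what type those values have. The snippet below is a small sketch on a made-up `<model>` element; it only shows that `minidom` hands back strings, so a caller that needs a numeric sample rate has to convert it.

```python
from xml.dom import minidom

# Made-up <model> element with the three attributes read by the method
doc = minidom.parseString(
    '<sv><data><model sampleRate="4800" start="0" end="48000" /></data></sv>'
)
model = doc.getElementsByTagName("model").item(0)

sample_rate = model.attributes["sampleRate"].value
start_m = model.attributes["start"].value
end_m = model.attributes["end"].value

# Attribute values are strings; convert explicitly before doing arithmetic
print(type(sample_rate).__name__, int(sample_rate))
```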

Source code in eso/utils/AnnotationReader.py

def get_annotation_information_testing(self):
    """
    Extract annotation information from a `.svl` XML file and return a DataFrame
    with frame, value, duration, extent, and label for each annotation.

    This method parses an XML annotation file (`.svl` format) to extract detailed
    annotation information such as frame number, value, duration, extent, and label.
    It also extracts the sample rate, start time, and end time from the file's metadata.

    Parameters
    ----------
    None

    Returns
    -------
    tuple
        A tuple containing:
        - pd.DataFrame: A DataFrame with columns:
            - 'frame': The frame number from the annotation.
            - 'value': The value associated with the annotation.
            - 'duration': The duration of the annotation.
            - 'extent': The extent of the annotation.
            - 'label': The label associated with the annotation.
        - str: The sample rate extracted from the `.svl` file (raw attribute value).
        - str: The start time of the annotation in the `.svl` file.
        - str: The end time of the annotation in the `.svl` file.

    Raises
    ------
    Exception
        If the annotation file is not found or if it does not contain valid annotation information.
    """

    path = os.path.join(
            self.path, "Annotations", self.annotation_file_name + ".svl"
        )

    # Process the .svl xml file
    xmldoc = minidom.parse(path)
    itemlist = xmldoc.getElementsByTagName('point')
    idlist = xmldoc.getElementsByTagName('model')

    sampleRate = idlist.item(0).attributes['sampleRate'].value 
    start_m = idlist.item(0).attributes['start'].value
    end_m = idlist.item(0).attributes['end'].value


    values = []
    frames = []
    durations=[]
    extents=[]
    labels = []
    audio_file_name = ''

    if len(idlist) > 0:
        for s in idlist: 
            original_sample_rate = int(s.attributes["sampleRate"].value)

    if (len(itemlist) > 0):

        # Iterate over each annotation in the .svl file (annotation file)
        for s in itemlist:

            # Read the raw attributes of the annotation point
            frame = float(s.attributes['frame'].value)
            value = float(s.attributes['value'].value)
            duration = float(s.attributes['duration'].value)
            extent = float(s.attributes['extent'].value)
            label = str(s.attributes['label'].value)

            # Set the default confidence to 10 (i.e. high confidence that
            # the label is correct). Annotations that carry no confidence
            # value are treated like normal annotations and it is
            # assumed that the annotation is correct (by the annotator).
            label_confidence = 10

            # Check if a confidence has been assigned
            if ',' in label:

                # Extract the raw label
                label_string = label[:label.find(',')]

                # Extract confidence value
                label_confidence = int(label[label.find(',')+1:])

                # Set the label to the raw label
                label = label_string


            # A blank label stops processing to avoid mislabelling data.
            # Note that `break` exits the loop, so any annotations that
            # follow the blank one are ignored as well.
            if label == '':
                break

            # Only keep annotations whose labels are very confident
            # (10 = very confident, 5 = medium, 1 = unsure). The confidence
            # follows the label after a comma, e.g. "SPECIES,10" or "SPECIES,5".
            if label_confidence == 10:

                frames.append(frame)
                values.append(value)
                durations.append(duration)
                extents.append(extent)
                labels.append(label)

    df_svl_gibbons = pd.DataFrame({'frame': frames, 'value':values ,'duration': durations,
                              'extent':extents,'label':labels})
    return df_svl_gibbons, sampleRate, start_m, end_m

dataframe_to_svl ¶

dataframe_to_svl(dataframe, sample_rate, start_m, end_m)

Convert a DataFrame of annotations to a .svl format XML string.

This method generates a .svl format XML string containing the annotations from a DataFrame. The generated XML includes metadata such as the sample rate, start time, end time, and annotation points (frame, value, duration, extent, and label).

Parameters:

Name	Type	Description	Default
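`dataframe`	`DataFrame`	A DataFrame containing the annotation information. The DataFrame should have the following columns: 'frame', 'value', 'duration', 'extent', and 'label'. Note that the source listing writes a fixed extent of 1500 for every point and does not read the 'extent' column.	required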
`sample_rate`	`int`	The sample rate of the audio associated with the annotations.	required
`start_m`	`str`	The start time (in seconds) of the annotation period.	required
`end_m`	`str`	The end time (in seconds) of the annotation period.	required

Returns:

Type	Description
`str`	A string containing the XML in `.svl` format, representing the annotations along with metadata.

Notes

The function generates an XML document that includes:

- `<model>`: metadata about the annotation model, including sample rate, start time, and end time.
- `<dataset>`: contains `<point>` elements that represent individual annotations.
- `<display>`: defines the display settings for the annotation in the software.
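For readers who want to experiment with this document shape without the method's templating dependency, here is a minimal sketch built on the standard library's `xml.etree.ElementTree` instead. It is a swapped-in technique, not the method's implementation: `rows_to_svl` is a hypothetical name, it takes plain dictionaries rather than a DataFrame, it writes only a subset of the attributes, and it places `<dataset>` next to `<data>` to mirror the nesting in the source listing.

```python
import xml.etree.ElementTree as ET


def rows_to_svl(rows, sample_rate, start_m, end_m):
    """Build a small sv/model/dataset/display document from row dictionaries."""
    sv = ET.Element("sv")

    data = ET.SubElement(sv, "data")
    ET.SubElement(
        data, "model",
        id="10", sampleRate=str(sample_rate),
        start=str(start_m), end=str(end_m), dataset="9",
    )

    dataset = ET.SubElement(sv, "dataset", id="9", dimensions="2")
    for row in rows:
        ET.SubElement(
            dataset, "point",
            frame=str(int(row["frame"])),
            value=str(row["value"]),
            duration=str(int(row["duration"])),
            extent=str(row["extent"]),
            label=str(row["label"]),
        )

    display = ET.SubElement(sv, "display")
    ET.SubElement(display, "layer", id="2", type="boxes", model="10")

    return ET.tostring(sv, encoding="unicode")


rows = [
    {"frame": 15360.0, "value": 3136.87, "duration": 1724416.0,
     "extent": 2139.22, "label": "Cape Robin"},
    {"frame": 48000.0, "value": 700.0, "duration": 4800.0,
     "extent": 300.0, "label": "A & B"},
]
xml_text = rows_to_svl(rows, 4800, 0, 48000)
print(xml_text)
```

One reason to build the tree with an XML library is that attribute values such as `A & B` are escaped automatically, whereas plain string formatting leaves escaping to the caller.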

Source code in eso/utils/AnnotationReader.py

def dataframe_to_svl(self, dataframe, sample_rate, start_m, end_m):
    """
    Convert a DataFrame of annotations to a `.svl` format XML string.

    This method generates a `.svl` format XML string containing the annotations
    from a DataFrame. The generated XML includes metadata such as the sample rate,
    start time, end time, and annotation points (frame, value, duration, extent, and label).

    Parameters
    ----------
    dataframe : pd.DataFrame
        A DataFrame containing the annotation information. The DataFrame should have 
        the following columns: 'frame', 'value', 'duration', 'extent', and 'label'.
    sample_rate : int
        The sample rate of the audio associated with the annotations.
    start_m : str
        The start time (in seconds) of the annotation period.
    end_m : str
        The end time (in seconds) of the annotation period.

    Returns
    -------
    str
        A string containing the XML in `.svl` format, representing the annotations
        along with metadata.

    Notes
    -----
    The function generates an XML document that includes:
    - `<model>`: metadata about the annotation model, including sample rate, start time, and end time.
    - `<dataset>`: contains `<point>` elements that represent individual annotations.
    - `<display>`: defines the display settings for the annotation in the software.
    """
    doc, tag, text = Doc().tagtext()
    doc.asis('<?xml version="1.0" encoding="UTF-8"?>')
    doc.asis('<!DOCTYPE sonic-visualiser>')

    with tag('sv'):
        with tag('data'):

            model_string = '<model id="10" name="" sampleRate="{}" start="{}" end="{}" type="sparse" dimensions="2" resolution="1" notifyOnAdd="true" dataset="9" subtype="box" minimum="600" maximum="{}" units="Hz" />'.format(sample_rate, 
                                                                    start_m,
                                                                    end_m,
                                                                    1000)
            doc.asis(model_string)

        with tag('dataset', id='9', dimensions='2'):

            # Read dataframe or other data structure and add the values here
            # These are added as "point" elements, for example:
            # '<point frame="15360" value="3136.87" duration="1724416" extent="2139.22" label="Cape Robin" />'
            for index, row in dataframe.iterrows():

                point  = '<point frame="{}" value="{}" duration="{}" extent="{}" label="{}" />'.format(
                    int(row['frame']), 
                    row['value'],
                    int(row['duration']),
                    1500,
                    row['label'])

                # add the point
                doc.asis(point)
        with tag('display'):

            display_string = '<layer id="2" type="boxes" name="Boxes" model="10"  verticalScale="0"  colourName="White" colour="#ffffff" darkBackground="true" />'
            doc.asis(display_string)

    result = indent(
        doc.getvalue(),
        indentation = ' '*2,
        newline = '\r\n'
    )

    return result

Preprocessing ¶

Preprocessing(
    species_folder: str,
    sample_rate: int,
    lowpass_cutoff: int,
    downsample_rate: int,
    nyquist_rate: int,
    segment_duration: int,
    positive_class: str,
    negative_class: str,
    nb_negative_class: int,
    n_fft: int,
    hop_length: int,
    n_mels: int,
    f_min: int,
    f_max: int,
    file_type: str,
    audio_extension: str,
    apply_preprocessing: bool = True,
)

Initialize the Preprocessing object.

Parameters:

Name	Type	Description	Default
`species_folder`	`str`	Path to the species folder containing audio and annotation data.	required
`sample_rate`	`int`	The sample rate for unprocessed audio files.	required
`lowpass_cutoff`	`int`	The cutoff frequency for the low-pass filter.	required
`downsample_rate`	`int`	The rate at which to downsample the audio.	required
`nyquist_rate`	`int`	The Nyquist rate, half of the sampling rate.	required
`segment_duration`	`int`	Duration of each audio segment in seconds.	required
`positive_class`	`str`	Label representing the positive class in the dataset.	required
`negative_class`	`str`	Label representing the negative class in the dataset.	required
`nb_negative_class`	`int`	Number of negative class samples.	required
`n_fft`	`int`	The length of the FFT window for spectrograms.	required
`hop_length`	`int`	The hop length for generating spectrograms.	required
`n_mels`	`int`	The number of mel bands to use in the spectrogram.	required
`f_min`	`int`	The minimum frequency for the mel filter bank.	required
`f_max`	`int`	The maximum frequency for the mel filter bank.	required
`file_type`	`str`	The type of annotation files to process (e.g., '.svl').	required
`audio_extension`	`str`	The file extension for the audio files (e.g., '.wav').	required
`apply_preprocessing`	`bool`	Whether to apply preprocessing steps like filtering and downsampling. Default is True.	`True`

Returns:

Type	Description
`None`

Source code in eso/utils/preprocessing.py

def __init__(
    self,
    species_folder : str,
    sample_rate: int,
    lowpass_cutoff : int,
    downsample_rate : int,
    nyquist_rate : int,
    segment_duration : int,
    positive_class : str,
    negative_class : str,
    nb_negative_class : int,
    n_fft : int,
    hop_length : int,
    n_mels : int,
    f_min : int,
    f_max : int,
    file_type : str,
    audio_extension : str,
    apply_preprocessing: bool=True,

) -> None:
    """
    Initialize the Preprocessing object.

    Parameters
    ----------
    species_folder : str
        Path to the species folder containing audio and annotation data.
    sample_rate : int
        The sample rate for unprocessed audio files.
    lowpass_cutoff : int
        The cutoff frequency for the low-pass filter.
    downsample_rate : int
        The rate at which to downsample the audio.
    nyquist_rate : int
        The Nyquist rate, half of the sampling rate.
    segment_duration : int
        Duration of each audio segment in seconds.
    positive_class : str
        Label representing the positive class in the dataset.
    negative_class : str
        Label representing the negative class in the dataset.
    nb_negative_class : int
        Number of negative class samples.
    n_fft : int
        The length of the FFT window for spectrograms.
    hop_length : int
        The hop length for generating spectrograms.
    n_mels : int
        The number of mel bands to use in the spectrogram.
    f_min : int
        The minimum frequency for the mel filter bank.
    f_max : int
        The maximum frequency for the mel filter bank.
    file_type : str
        The type of annotation files to process (e.g., '.svl').
    audio_extension : str
        The file extension for the audio files (e.g., '.wav').
    apply_preprocessing : bool, optional
        Whether to apply preprocessing steps like filtering and downsampling. Default is True.

    Returns
    -------
    None
    """
    self.sample_rate_unpreprocessed=sample_rate
    self.species_folder = species_folder
    self.lowpass_cutoff = lowpass_cutoff
    self.downsample_rate = downsample_rate
    self.nyquist_rate = nyquist_rate
    self.segment_duration = segment_duration
    self.positive_class = positive_class
    self.negative_class = negative_class
    self.nb_negative_class = nb_negative_class
    self.audio_path = Path(self.species_folder, "Audio")
    self.annotations_path = Path(self.species_folder, "Annotations")
    self.saved_data_path = Path(self.species_folder, "SavedData")
    self.training_files = Path(self.species_folder, "DataFiles", "TrainingFiles.txt")      
    self.n_mels = n_mels
    self.f_min = f_min
    self.f_max = f_max
    self.file_type = file_type
    self.audio_extension = audio_extension
    self.apply_preprocessing = apply_preprocessing
    self.n_fft = n_fft
    self.hop_length = hop_length

sample_rate_unpreprocessed `instance-attribute` ¶

sample_rate_unpreprocessed = sample_rate

species_folder `instance-attribute` ¶

species_folder = species_folder

lowpass_cutoff `instance-attribute` ¶

lowpass_cutoff = lowpass_cutoff

downsample_rate `instance-attribute` ¶

downsample_rate = downsample_rate

nyquist_rate `instance-attribute` ¶

nyquist_rate = nyquist_rate

segment_duration `instance-attribute` ¶

segment_duration = segment_duration

positive_class `instance-attribute` ¶

positive_class = positive_class

negative_class `instance-attribute` ¶

negative_class = negative_class

nb_negative_class `instance-attribute` ¶

nb_negative_class = nb_negative_class

audio_path `instance-attribute` ¶

audio_path = Path(species_folder, 'Audio')

annotations_path `instance-attribute` ¶

annotations_path = Path(species_folder, 'Annotations')

saved_data_path `instance-attribute` ¶

saved_data_path = Path(species_folder, 'SavedData')

training_files `instance-attribute` ¶

training_files = Path(species_folder, 'DataFiles', 'TrainingFiles.txt')
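The four path attributes above imply a folder layout under `species_folder`. A small check like the following can confirm that layout before constructing the object. It is a sketch only: `species_layout` and `missing_entries` are hypothetical helpers and not part of the package.

```python
from pathlib import Path


def species_layout(species_folder):
    """Return the four locations derived from the species folder."""
    root = Path(species_folder)
    return {
        "audio_path": root / "Audio",
        "annotations_path": root / "Annotations",
        "saved_data_path": root / "SavedData",
        "training_files": root / "DataFiles" / "TrainingFiles.txt",
    }


def missing_entries(species_folder):
    """List the attribute names whose path does not exist on disk."""
    layout = species_layout(species_folder)
    return [name for name, path in layout.items() if not path.exists()]


# An empty or non-existent folder reports all four entries as missing
print(missing_entries("some_species_folder_that_does_not_exist"))
```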

n_mels `instance-attribute` ¶

n_mels = n_mels

f_min `instance-attribute` ¶

f_min = f_min

f_max `instance-attribute` ¶

f_max = f_max

file_type `instance-attribute` ¶

file_type = file_type

audio_extension `instance-attribute` ¶

audio_extension = audio_extension

apply_preprocessing `instance-attribute` ¶

apply_preprocessing = apply_preprocessing

n_fft `instance-attribute` ¶

n_fft = n_fft

hop_length `instance-attribute` ¶

hop_length = hop_length

read_audio_file ¶

read_audio_file(file_name)

Load an audio file and return its waveform and sample rate.

Parameters:

Name	Type	Description	Default
`file_name`	`str`	Path to the audio file, including the extension (e.g., "audio1.wav"). The value is passed to the loader as given; it is not joined to `audio_path`.	required

Returns:

Type	Description
`tuple`	A tuple containing: - np.ndarray: The audio waveform (amplitude values). - int: The sampling rate of the audio file.

Source code in eso/utils/preprocessing.py

def read_audio_file(self, file_name):
    """
    Load an audio file and return its waveform and sample rate.

    Parameters
    ----------
    file_name : str
        Name of the audio file including the extension (e.g., "audio1.wav").

    Returns
    -------
    tuple
        A tuple containing:
        - np.ndarray: The audio waveform (amplitude values).
        - int: The sampling rate of the audio file.
    """
    # Get the path to the file
    audio_folder = Path(file_name)

    # Read the amplitudes and sample rate
    audio_amps, audio_sample_rate = librosa.load(audio_folder, sr=None)

    return audio_amps, audio_sample_rate

butter_lowpass_filter ¶

butter_lowpass_filter(data, cutoff_freq, nyq_freq, order=4)

Apply a Butterworth low-pass filter to the input signal.

This method filters the input signal using a zero-phase Butterworth low-pass filter designed with the specified cutoff and Nyquist frequencies.

Parameters:

Name	Type	Description	Default
`data`	`ndarray`	The input signal (1D array) to be filtered.	required
`cutoff_freq`	`float`	The cutoff frequency of the low-pass filter (in Hz).	required
`nyq_freq`	`float`	The Nyquist frequency (typically half the sampling rate).	required
`order`	`int`	The order of the Butterworth filter. Default is 4.	`4`

Returns:

Type	Description
`ndarray`	The filtered signal with the same shape as the input.
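The source listing delegates the filter design to a private helper, `_butter_lowpass`, whose body is not shown on this page. The sketch below assumes a common design, in which the cutoff is expressed as a fraction of the Nyquist frequency and passed to `scipy.signal.butter`; the function names `butter_lowpass` and `lowpass` are stand-ins, and the real helper may differ.

```python
import numpy as np
from scipy import signal


def butter_lowpass(cutoff_freq, nyq_freq, order=4):
    """Design a digital Butterworth low-pass filter (assumed helper)."""
    # scipy expects the cutoff as a fraction of the Nyquist frequency
    normal_cutoff = cutoff_freq / nyq_freq
    b, a = signal.butter(order, normal_cutoff, btype="low")
    return b, a


def lowpass(data, cutoff_freq, nyq_freq, order=4):
    """Apply the filter forwards and backwards for zero phase shift."""
    b, a = butter_lowpass(cutoff_freq, nyq_freq, order)
    return signal.filtfilt(b, a, data)


# One second at 8 kHz: a 100 Hz tone we want to keep plus a 3 kHz tone to remove
sr = 8000
t = np.arange(sr) / sr
low = np.sin(2 * np.pi * 100 * t)
high = np.sin(2 * np.pi * 3000 * t)

filtered = lowpass(low + high, cutoff_freq=1000, nyq_freq=sr / 2)
print(filtered.shape)
```

Because `filtfilt` runs the filter in both directions, the output is not delayed relative to the input, which keeps annotation times aligned with the filtered audio.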

Source code in eso/utils/preprocessing.py

def butter_lowpass_filter(self, data, cutoff_freq, nyq_freq, order=4):
    """
    Apply a Butterworth low-pass filter to the input signal.

    This method filters the input signal using a zero-phase Butterworth low-pass
    filter designed with the specified cutoff and Nyquist frequencies.

    Parameters
    ----------
    data : np.ndarray
        The input signal (1D array) to be filtered.
    cutoff_freq : float
        The cutoff frequency of the low-pass filter (in Hz).
    nyq_freq : float
        The Nyquist frequency (typically half the sampling rate).
    order : int, optional
        The order of the Butterworth filter. Default is 4.

    Returns
    -------
    np.ndarray
        The filtered signal with the same shape as the input.
    """ 
    # Source: https://github.com/guillaume-chevalier/filtering-stft-and-laplace-transform
    b, a = self._butter_lowpass(cutoff_freq, nyq_freq, order=order)
    y = signal.filtfilt(b, a, data)
    return y

downsample_file ¶

downsample_file(amplitudes, original_sr, new_sample_rate)

Downsample an audio waveform to a specified sample rate.

This function resamples the input audio from the original sample rate to a new, lower sample rate using the 'kaiser_fast' resampling method.

Parameters:

Name	Type	Description	Default
`amplitudes`	`ndarray`	The raw audio waveform (1D NumPy array of amplitude values).	required
`original_sr`	`int`	The original sampling rate of the audio signal (in Hz).	required
`new_sample_rate`	`int`	The desired sampling rate to downsample the audio to (in Hz).	required

Returns:

Type	Description
`tuple`	A tuple containing: - np.ndarray: The downsampled audio waveform. - int: The new sampling rate (same as `new_sample_rate`).

Source code in eso/utils/preprocessing.py

def downsample_file(self, amplitudes, original_sr, new_sample_rate):
    """
    Downsample an audio waveform to a specified sample rate.

    This function resamples the input audio from the original sample rate
    to a new, lower sample rate using the 'kaiser_fast' resampling method.

    Parameters
    ----------
    amplitudes : np.ndarray
        The raw audio waveform (1D NumPy array of amplitude values).
    original_sr : int
        The original sampling rate of the audio signal (in Hz).
    new_sample_rate : int
        The desired sampling rate to downsample the audio to (in Hz).

    Returns
    -------
    tuple
        A tuple containing:
        - np.ndarray: The downsampled audio waveform.
        - int: The new sampling rate (same as `new_sample_rate`).
    """
    return (
        librosa.resample(
            amplitudes,
            orig_sr=original_sr,
            target_sr=new_sample_rate,
            res_type="kaiser_fast",
        ),
        new_sample_rate,
    )

convert_single_to_image ¶

convert_single_to_image(audio, sample_rate)

Convert an audio waveform into a normalized mel-spectrogram image.

This function computes the mel-spectrogram from a raw audio signal, converts it to decibels, and normalizes it so that the values lie between 0 and 1. If `apply_preprocessing` is true, the configured `f_min` and `f_max` are used as frequency limits; otherwise the limits default to 0 Hz and 5000 Hz.

Parameters:

Name	Type	Description	Default
`audio`	`ndarray`	The raw audio waveform (1D NumPy array of amplitude values).	required
`sample_rate`	`int`	The sampling rate of the audio signal (in Hz).	required

Returns:

Type	Description
`ndarray`	A 2D NumPy array representing the normalized mel-spectrogram image.
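The normalization step can be looked at in isolation with NumPy alone. This sketch uses a random array as a stand-in for a decibel-scaled spectrogram; `normalise_spectrogram` is a hypothetical name, and the guard for a constant input is an addition that the source listing does not have.

```python
import numpy as np


def normalise_spectrogram(spec_db, eps=1e-8):
    """Standardise a spectrogram, then rescale it to the range [0, 1]."""
    # Zero mean, unit variance (eps avoids division by zero)
    spec_norm = (spec_db - spec_db.mean()) / (spec_db.std() + eps)

    # Min-max scaling to [0, 1]
    spec_min, spec_max = spec_norm.min(), spec_norm.max()
    value_range = spec_max - spec_min
    if value_range == 0:
        # Added guard: a constant input has no range to scale
        return np.zeros_like(spec_norm)
    return (spec_norm - spec_min) / value_range


rng = np.random.default_rng(0)
spec_db = rng.normal(loc=-40.0, scale=10.0, size=(128, 64))  # stand-in data

scaled = normalise_spectrogram(spec_db)
print(scaled.shape, scaled.min(), scaled.max())
```

Min-max scaling is unaffected by shifting the input or multiplying it by a positive constant, so the standardisation step should not change the final values beyond floating-point rounding; plain min-max scaling of the decibel values would give the same image.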

Source code in eso/utils/preprocessing.py

def convert_single_to_image(self, audio, sample_rate):
    """
    Convert an audio waveform into a normalized mel-spectrogram image.

    This function computes the mel-spectrogram from a raw audio signal and 
    applies normalization to scale the spectrogram values between 0 and 1.
    If preprocessing is enabled, user-defined frequency limits are used;
    otherwise, default frequency bounds are applied.

    Parameters
    ----------
    audio : np.ndarray
        The raw audio waveform (1D NumPy array of amplitude values).
    sample_rate : int
        The sampling rate of the audio signal (in Hz).

    Returns
    -------
    np.ndarray
        A 2D NumPy array representing the normalized mel-spectrogram image.
    """
    if not self.apply_preprocessing:
        f_min = 0
        f_max = 5000
    else:
        f_min = self.f_min
        f_max = self.f_max

    S = librosa.feature.melspectrogram(
        y=audio,
        sr=sample_rate,
        n_fft=self.n_fft,
        hop_length=self.hop_length,
        n_mels=self.n_mels,
        fmin=f_min,
        fmax=f_max,
    )


    # Convert the power spectrogram to decibels
    image = librosa.core.power_to_db(S)

    # Standardise (zero mean, unit variance), then rescale to [0, 1]
    mean = image.flatten().mean()
    std = image.flatten().std()
    eps = 1e-8
    spec_norm = (image - mean) / (std + eps)
    spec_min, spec_max = spec_norm.min(), spec_norm.max()
    spec_scaled = (spec_norm - spec_min) / (spec_max - spec_min)

    return spec_scaled

save_data_to_pickle ¶

save_data_to_pickle(X, Y)

Save the input data and labels to pickle files.

This function saves the spectrogram data (X) and their corresponding labels (Y) into separate pickle files (X.pkl and Y.pkl) in the directory specified by self.saved_data_path.

Parameters:

Name	Type	Description	Default
`X`	`any`	The data to be saved (e.g., spectrograms). Must be pickle-serializable.	required
`Y`	`any`	The corresponding labels for `X`. Must also be pickle-serializable.	required

Returns:

Type	Description
`None`

Source code in eso/utils/preprocessing.py

def save_data_to_pickle(self, X, Y):
    """
    Save the input data and labels to pickle files.

    This function saves the spectrogram data (`X`) and their corresponding
    labels (`Y`) into separate pickle files (`X.pkl` and `Y.pkl`) in the directory 
    specified by `self.saved_data_path`.

    Parameters
    ----------
    X : any
        The data to be saved (e.g., spectrograms). Must be pickle-serializable.
    Y : any
        The corresponding labels for `X`. Must also be pickle-serializable.

    Returns
    -------
    None
    """
    # Context managers close each file even if pickling raises
    with open(Path(self.saved_data_path, "X.pkl"), "wb") as outfile:
        pickle.dump(X, outfile, protocol=4)

    with open(Path(self.saved_data_path, "Y.pkl"), "wb") as outfile:
        pickle.dump(Y, outfile, protocol=4)

load_data_from_pickle ¶

load_data_from_pickle()

Load the data and labels from pickle files.

This function loads spectrogram data (X) and their corresponding labels (Y) from pickle files (X.pkl and Y.pkl) located in the directory specified by self.saved_data_path.

Returns:

Name	Type	Description
`X`	`any`	The loaded data (e.g., spectrograms), as previously saved using `save_data_to_pickle`.
`Y`	`any`	The corresponding labels for `X`.

Source code in eso/utils/preprocessing.py

def load_data_from_pickle(self):
    """
    Load the data and labels from pickle files.

    This function loads spectrogram data (`X`) and their corresponding
    labels (`Y`) from pickle files (`X.pkl` and `Y.pkl`) located in the directory 
    specified by `self.saved_data_path`.

    Returns
    -------
    X : any
        The loaded data (e.g., spectrograms), as previously saved using `save_data_to_pickle`.
    Y : any
        The corresponding labels for `X`.
    """
    # Context managers close each file even if unpickling raises
    with open(Path(self.saved_data_path, "X.pkl"), "rb") as infile:
        X = pickle.load(infile)

    with open(Path(self.saved_data_path, "Y.pkl"), "rb") as infile:
        Y = pickle.load(infile)

    return X, Y
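
The save and load methods form a round trip through `X.pkl` and `Y.pkl`. A minimal sketch of that round trip, using only the standard library, might look as follows. `save_pair` and `load_pair` are hypothetical helpers, a temporary directory stands in for `self.saved_data_path`, and the data are made up.

```python
import pickle
import tempfile
from pathlib import Path


def save_pair(folder: Path, X, Y) -> None:
    """Write X and Y to X.pkl and Y.pkl inside folder (hypothetical helper)."""
    for name, obj in (("X.pkl", X), ("Y.pkl", Y)):
        # The context manager closes the file even if pickling fails.
        with open(folder / name, "wb") as outfile:
            pickle.dump(obj, outfile, protocol=4)


def load_pair(folder: Path):
    """Read X.pkl and Y.pkl back from folder (hypothetical helper)."""
    loaded = []
    for name in ("X.pkl", "Y.pkl"):
        with open(folder / name, "rb") as infile:
            loaded.append(pickle.load(infile))
    return tuple(loaded)


# A temporary directory stands in for self.saved_data_path.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    save_pair(folder, [[0.1, 0.2], [0.3, 0.4]], ["call", "noise"])
    X_loaded, Y_loaded = load_pair(folder)
```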

create_dataset ¶

create_dataset(annotation_folder, sufix_file, file_names=None, augmentation=False)

Create the dataset of audio segments and labels for machine learning.

This function reads audio files and their corresponding annotation files, applies preprocessing (optional low-pass filtering and downsampling), extracts labeled audio segments, and optionally augments the data to balance class distributions.

Parameters:

Name	Type	Description	Default
`annotation_folder`	`str or Path`	Path to the folder containing the `.svl` annotation files.	required
`sufix_file`	`str`	Suffix to append to the annotation filenames for retrieval.	required
`file_names`	`str or Path`	Path to a CSV file containing a list of filenames to process (without extensions). If None, uses `self.training_files`.	`None`
`augmentation`	`bool`	Whether to perform data augmentation to balance the dataset.	`False`

Returns:

Type	Description
`tuple of np.ndarray`	A pair `(X_calls, Y_calls)`. `X_calls`: ndarray of shape (n_samples, ...) holding the extracted and optionally augmented audio segments, converted to mel-spectrogram images. `Y_calls`: ndarray of shape (n_samples,) holding the class label of each segment (binary or multi-class).

Raises:

Type	Description
`ValueError`	If the `file_names` CSV is missing or empty.

Notes

Annotations are expected in .svl format, created with Sonic Visualiser, using the "boxes area" annotation layer.
Each annotation provides a labeled time segment which is then transformed into a training example.
Augmentation methods include time shifting, noise addition, and mixing with negative samples to improve dataset balance.

Source code in eso/utils/preprocessing.py

def create_dataset(self, annotation_folder, sufix_file, file_names=None, augmentation=False):
    """
    Create the dataset of audio segments and labels for machine learning.

    This function reads audio files and their corresponding annotation files,
    applies preprocessing (optional low-pass filtering and downsampling),
    extracts labeled audio segments, and optionally augments the data to
    balance class distributions.

    Parameters
    ----------
    annotation_folder : str or Path
        Path to the folder containing the `.svl` annotation files.
    sufix_file : str
        Suffix to append to the annotation filenames for retrieval.
    file_names : str or Path, optional
        Path to a CSV file containing a list of filenames to process (without extensions).
        If None, uses `self.training_files`.
    augmentation : bool, optional
        Whether to perform data augmentation to balance the dataset.

    Returns
    -------
    tuple of np.ndarray
        - `X_calls` : ndarray of shape (n_samples, ...)
            Array of preprocessed and optionally augmented audio segments,
            typically converted into spectrogram images.
        - `Y_calls` : ndarray of shape (n_samples,)
            Corresponding class labels for each segment (binary or multi-class).

    Raises
    ------
    ValueError
        If the `file_names` CSV is missing or empty.

    Notes
    -----
    - Annotations are expected in `.svl` format, created with Sonic Visualiser,
    using the "boxes area" annotation layer.
    - Each annotation provides a labeled time segment which is then transformed
    into a training example.
    - Augmentation methods include time shifting, noise addition, and mixing
    with negative samples to improve dataset balance.
    """

    if file_names is None:
        file_names = self.training_files
    # Keep track of how many calls were found in the annotation files
    total_calls = 0

    # Initialise lists to store the X and Y values
    X_calls = []
    Y_calls = []

    # Read all names of the files
    try:
        files = pd.read_csv(file_names, header=None)
    except Exception:
        raise ValueError(
            f"Error loading filenames from {file_names}. Check if File is not empty."
        )
    # Iterate over each annotation file
    for file in files.values:
        file = file[0]

        file_name_no_extension = file

        reader = AnnotationReader(
            self.species_folder,
            file,
            self.file_type,
            self.audio_extension,
            self.positive_class,
        )
        # Check if the audio file exists before processing
        if str(
            Path(self.audio_path, file_name_no_extension + self.audio_extension)
        ) in glob(str(self.audio_path / f"*{self.audio_extension}")):

            # Read audio file
            audio_amps, original_sample_rate = self.read_audio_file(
                str(
                    Path(
                        self.audio_path,
                        file_name_no_extension + self.audio_extension,
                    )
                )
            )

            if self.apply_preprocessing:
                # Low pass filter
                filtered = self.butter_lowpass_filter(
                    audio_amps, self.lowpass_cutoff, self.nyquist_rate
                )
                # Downsample
                amplitudes, sample_rate = self.downsample_file(
                    filtered, original_sample_rate, self.downsample_rate
                )
                del filtered

            else:
                # Resample only when the file does not already use the target rate
                if original_sample_rate != self.sample_rate_unpreprocessed:
                    amplitudes, sample_rate = self.downsample_file(
                        audio_amps, original_sample_rate, self.sample_rate_unpreprocessed
                    )
                else:
                    amplitudes, sample_rate = audio_amps, original_sample_rate

            del audio_amps
            df, audio_file_name = reader.get_annotation_information(annotation_folder, sufix_file)


            for index, row in df.iterrows():
                start_seconds = int(round(row["Start"]))
                end_seconds = int(round(row["End"]))
                label = row["Label"]
                annotation_duration_seconds = end_seconds - start_seconds

                # Extract augmented audio segments and corresponding binary labels
                X_data, y_data = self._getXY(
                    amplitudes,
                    sample_rate,
                    start_seconds,
                    annotation_duration_seconds,
                    label
                )

                # Append the segments and labels
                X_calls.extend(X_data)
                Y_calls.extend(y_data)



    if augmentation:
        # Augment the dataset to balance the classes
        X_calls, Y_calls = self._augment_dataset(X_calls, Y_calls)


    X_calls = self._convert_all_to_image(X_calls, sample_rate)

    # Convert to numpy arrays
    X_calls, Y_calls = np.asarray(X_calls), np.asarray(Y_calls)

    return X_calls, Y_calls

shuffle_files_names ¶

shuffle_files_names(train_size=0.8, test_size=0.1, validation_size=0.1)

Shuffle audio file names and split them into training, testing, and validation sets.

This method scans the Audio folder inside the species directory for all files with the specified audio extension. It then randomly shuffles and splits the file names into training, testing, and validation sets according to the specified proportions. The resulting file names (without extensions) are saved as text files (train.txt, test.txt, validation.txt) inside the DataFiles subdirectory of the species folder.

Parameters:

Name	Type	Description	Default
`train_size`	`float`	Proportion of files to use for training. Default is 0.8.	`0.8`
`test_size`	`float`	Proportion of files to use for testing. Default is 0.1.	`0.1`
`validation_size`	`float`	Proportion of files to use for validation. Default is 0.1.	`0.1`

Raises:

Type	Description
`Exception`	If no audio files are found in the specified audio directory.

Notes

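The sum of train_size, test_size, and validation_size should be 1.0. The implementation does not read validation_size: the validation set receives whatever files remain after the training and test splits.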
Output files are saved as plain text, with one file name (without extension) per line.
The audio extension is read from self.audio_extension, and the species folder from self.species_folder.

Source code in eso/utils/preprocessing.py

def shuffle_files_names(self, train_size=0.8, test_size=0.1, validation_size=0.1):
    """
    Shuffle audio file names and split them into training, testing, and validation sets.

    This method scans the `Audio` folder inside the species directory for all
    files with the specified audio extension. It then randomly shuffles and splits
    the file names into training, testing, and validation sets according to the 
    specified proportions. The resulting file names (without extensions) are saved
    as text files (`train.txt`, `test.txt`, `validation.txt`) inside the `DataFiles`
    subdirectory of the species folder.

    Parameters
    ----------
    train_size : float, optional
        Proportion of files to use for training. Default is 0.8.
    test_size : float, optional
        Proportion of files to use for testing. Default is 0.1.
    validation_size : float, optional
        Proportion of files to use for validation. Default is 0.1.

    Raises
    ------
    Exception
        If no audio files are found in the specified audio directory.

    Notes
    -----
    - The sum of `train_size`, `test_size`, and `validation_size` should be 1.0.
    - Output files are saved as plain text, with one file name (without extension) per line.
    - The audio extension is read from `self.audio_extension`, and the species folder
    from `self.species_folder`.
    """        
    # Get all file names in Audio folder
    path = Path(self.species_folder, "Audio", f"*{self.audio_extension}")
    files = glob(str(path))

    if len(files) == 0:
        raise Exception(
            f"No audio files found in {self.species_folder}/Audio.\
            Please check the audio_extension setting in the settings file."
        )
    # Shuffle the files
    np.random.shuffle(files)

    train_samples = int(np.floor(len(files) * train_size))
    test_samples = int(np.floor(len(files) * test_size))

    # Split the files into train, test, validation
    train_split = train_samples
    test_split = test_samples

    train_files = files[:train_split]
    test_files = files[train_split : train_split + test_split]
    # Use the rest for validation
    validation_files = files[train_split + test_split :]

    # Only get the file names
    train_files = [os.path.basename(file) for file in train_files]
    test_files = [os.path.basename(file) for file in test_files]
    validation_files = [os.path.basename(file) for file in validation_files]

    # Remove the file extension
    train_files = [os.path.splitext(file)[0] for file in train_files]
    test_files = [os.path.splitext(file)[0] for file in test_files]
    validation_files = [os.path.splitext(file)[0] for file in validation_files]

    # Create the folders
    os.makedirs(Path(self.species_folder, "DataFiles"), exist_ok=True)

    # Save the files as .txt
    with open(Path(self.species_folder, "DataFiles", "train.txt"), "w") as f:
        f.write("\n".join(train_files))
    with open(os.path.join(self.species_folder, "DataFiles", "test.txt"), "w") as f:
        f.write("\n".join(test_files))

    with open(Path(self.species_folder, "DataFiles", "validation.txt"), "w") as f:
        f.write("\n".join(validation_files))
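
The split arithmetic can be sketched without touching the file system. The example below is an illustration under stated assumptions: `split_names` is a hypothetical helper, it uses the standard library's `random` module instead of `np.random.shuffle`, and the `seed` argument is an addition for reproducibility. As in the method, the validation list is whatever remains after the training and test slices.

```python
import math
import random


def split_names(names, train_size=0.8, test_size=0.1, seed=None):
    """Shuffle names and split them into train, test, and validation lists.

    Hypothetical helper; validation receives whatever is left over.
    """
    shuffled = list(names)
    random.Random(seed).shuffle(shuffled)

    # Floor both counts so the slices never overrun the list.
    n_train = math.floor(len(shuffled) * train_size)
    n_test = math.floor(len(shuffled) * test_size)

    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]
    return train, test, validation


# Made-up file names (without extension), as stored in the .txt files.
names = [f"recording_{i:02d}" for i in range(10)]
train, test, validation = split_names(names, seed=42)
```

Because both counts are rounded down, any rounding remainder ends up in the validation list, which can therefore be larger than its nominal proportion on small datasets.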

check_distribution ¶

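check_distribution(Y)

Count how many times each class label occurs in Y and return the counts as a dictionary mapping label to count.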

Source code in eso/utils/preprocessing.py

def check_distribution(self, Y):
    unique, counts = np.unique(Y, return_counts=True)
    original_distribution = dict(zip(unique, counts))
    return original_distribution
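
A short illustration of the same counting step, assuming NumPy and a made-up label array. The conversion with `tolist()` is an addition that turns NumPy scalars into plain Python integers.

```python
import numpy as np

# Made-up labels: three negatives (0) and two positives (1).
Y = np.array([0, 1, 0, 0, 1])

unique, counts = np.unique(Y, return_counts=True)
# tolist() converts NumPy scalars to plain Python ints for readable output.
distribution = dict(zip(unique.tolist(), counts.tolist()))
```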

`eso.utils.AnnotationReader`¶

Parses Sonic Visualiser SVL files and equivalent XML annotation formats into a DataFrame with `Start`, `End`, and `Label` columns (times in seconds), returned together with the name of the corresponding audio file. The output is consumed by Preprocessing to mark presence and absence segments for training.

AnnotationReader ¶

AnnotationReader(
    path: str,
    annotation_file_name: str,
    file_type: str,
    audio_extension: str,
    positive_class: str,
)

Source code in eso/utils/AnnotationReader.py

def __init__(
    self,
    path: str,
    annotation_file_name: str,
    file_type: str,
    audio_extension: str,
    positive_class: str,
):
    """
    Initializes the AnnotationReader class.

    Parameters
    ----------
    path : str
        The path to the directory containing the annotation and audio files.
    annotation_file_name : str
        The name of the annotation file (without extension) to be read.
    file_type : str
        The type of annotation file (e.g., "svl", "xml").
    audio_extension : str
        The file extension for the associated audio files (e.g., ".wav", ".mp3").
    positive_class : str
        The label representing the positive class in classification tasks.

    Returns
    -------
    None
    """
    self.path = path
    self.annotation_file_name = annotation_file_name
    self.file_type = file_type
    self.audio_extension = audio_extension
    self.positive_class = positive_class

path `instance-attribute` ¶

path = path

annotation_file_name `instance-attribute` ¶

annotation_file_name = annotation_file_name

file_type `instance-attribute` ¶

file_type = file_type

audio_extension `instance-attribute` ¶

audio_extension = audio_extension

positive_class `instance-attribute` ¶

positive_class = positive_class

Initializes the AnnotationReader class.

Parameters:

Name	Type	Description	Default
`path`	`str`	The path to the directory containing the annotation and audio files.	required
`annotation_file_name`	`str`	The name of the annotation file (without extension) to be read.	required
`file_type`	`str`	The type of annotation file (e.g., "svl", "xml").	required
`audio_extension`	`str`	The file extension for the associated audio files (e.g., ".wav", ".mp3").	required
`positive_class`	`str`	The label representing the positive class in classification tasks.	required

Returns:

Type	Description
`None`

get_annotation_information ¶

get_annotation_information(annotation_folder, sufix_file)

Extract annotation information from an .svl XML file and return a DataFrame with start times, end times, and labels for the annotations.

This method parses an XML annotation file (.svl format) to extract annotation details including the start time, end time, and label for each annotation. It processes the XML file, handles any confidence values, and adjusts labels accordingly (e.g., using the positive class label for predicted annotations).

Parameters:

Name	Type	Description	Default
`annotation_folder`	`str`	The folder where the annotation file is located.	required
`sufix_file`	`str`	The suffix to append to the base annotation file name to get the full file name.	required

Returns:

Type	Description
`tuple`	A tuple `(df, audio_file_name)`. `df` is a pd.DataFrame with three columns: 'Start' (start time of the annotation in seconds), 'End' (end time of the annotation in seconds), and 'Label' (label associated with the annotation). `audio_file_name` is a str: the name of the corresponding audio file (with ".wav" extension).

Raises:

Type	Description
`Exception`	If the annotation file does not contain valid annotation information.

Source code in eso/utils/AnnotationReader.py

def get_annotation_information(self, annotation_folder, sufix_file ):
    """
    Extract annotation information from an `.svl` XML file and return a DataFrame
    with start times, end times, and labels for the annotations.

    This method parses an XML annotation file (`.svl` format) to extract annotation
    details including the start time, end time, and label for each annotation.
    It processes the XML file, handles any confidence values, and adjusts labels
    accordingly (e.g., using the positive class label for predicted annotations).

    Parameters
    ----------
    annotation_folder : str
        The folder where the annotation file is located.
    sufix_file : str
        The suffix to append to the base annotation file name to get the full file name.

    Returns
    -------
    tuple
        A tuple containing:
        - pd.DataFrame: A DataFrame with three columns:
            - 'Start': The start time of the annotation in seconds.
            - 'End': The end time of the annotation in seconds.
            - 'Label': The label associated with the annotation.
        - str: The name of the corresponding audio file (with ".wav" extension).

    Raises
    ------
    Exception
        If the annotation file does not contain valid annotation information.
    """

    path = str(Path(
            self.path, annotation_folder, self.annotation_file_name + sufix_file
        ))


    xmldoc = minidom.parse(path)
    itemlist = xmldoc.getElementsByTagName("point")
    idlist = xmldoc.getElementsByTagName("model")

    start_time = []
    end_time = []
    labels = []
    audio_file_name = ""

    if len(idlist) > 0:
        for s in idlist: 
            original_sample_rate = int(s.attributes["sampleRate"].value)


    if len(itemlist) > 0:

        # Iterate over each annotation in the .svl file (annotation file)
        for s in itemlist:
            # Convert the starting frame to seconds using the sample rate
            # stored in the annotation file
            start_seconds = (
                    float(s.attributes["frame"].value) / original_sample_rate
                )

            # Get the label from the annotation file
            label = str(s.attributes["label"].value)

            # Set the default confidence to 10 (i.e. high confidence that
            # the label is correct). Annotations that carry no confidence
            # value are treated like normal annotations and it is
            # assumed that the annotation is correct (by the annotator).
            label_confidence = 10

            # Check if a confidence has been assigned
            if "," in label:
                # Extract the raw label
                label_string = label[: label.find(",")]

                # Extract confidence value
                label_confidence = int(label[label.find(",") + 1 :])

                # Set the label to the raw label
                label = label_string

            # If an annotation has a blank label then skip it
            # to avoid mislabelling data
            if label == "":
                continue


            #to include predictions obtained from a model
            if label == "predicted" :
                label=self.positive_class

            # Only consider annotations labelled with full confidence.
            # 10 = very confident, 5 = medium, 1 = unsure. The confidence is
            # appended to the label after a comma, e.g. "SPECIES,10" or "SPECIES,5".
            if label_confidence == 10:
                # Get the duration from the annotation file
                annotation_duration_seconds = (
                        float(s.attributes["duration"].value) / original_sample_rate
                    )
                start_time.append(start_seconds)
                end_time.append(start_seconds + annotation_duration_seconds)
                labels.append(label)

    df_svl_gibbons = pd.DataFrame(
            {"Start": start_time, "End": end_time, "Label": labels}
        )
    return df_svl_gibbons, self.annotation_file_name + ".wav"
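
The core of the parsing logic is the conversion from frames to seconds and the filter on the optional confidence suffix. The sketch below reproduces those two steps with the standard library only. The SVL fragment is hand-written for illustration, `split_label` is a hypothetical helper, and details such as the `predicted` label substitution are left out.

```python
from xml.dom import minidom

# Hand-written fragment in the shape the parser expects; not a real recording.
SVL_TEXT = """<sv><data>
<model id="10" sampleRate="1000" start="0" end="10000" />
<dataset id="9" dimensions="2">
<point frame="2000" value="700" duration="3000" extent="500" label="gibbon,10" />
<point frame="6000" value="700" duration="1000" extent="500" label="gibbon,5" />
<point frame="8000" value="700" duration="500" extent="500" label="noise" />
</dataset>
</data></sv>"""


def split_label(raw_label: str, default_confidence: int = 10):
    """Split 'LABEL,CONFIDENCE' into its parts; a bare label gets the default."""
    if "," in raw_label:
        name, confidence = raw_label.split(",", 1)
        return name, int(confidence)
    return raw_label, default_confidence


doc = minidom.parseString(SVL_TEXT)
model = doc.getElementsByTagName("model")[0]
sample_rate = int(model.attributes["sampleRate"].value)

rows = []
for point in doc.getElementsByTagName("point"):
    label, confidence = split_label(point.attributes["label"].value)
    if label == "" or confidence != 10:
        continue  # keep only labelled, high-confidence annotations
    # Frame and duration are stored in samples; divide to get seconds.
    start = float(point.attributes["frame"].value) / sample_rate
    duration = float(point.attributes["duration"].value) / sample_rate
    rows.append((start, start + duration, label))
```

In this fragment the second point is dropped because its confidence is 5, while the third is kept because a label without a suffix defaults to confidence 10.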

get_annotation_information_testing ¶

get_annotation_information_testing()

Extract annotation information from a .svl XML file and return a DataFrame with frame, value, duration, extent, and label for each annotation.

This method parses an XML annotation file (.svl format) to extract detailed annotation information such as frame number, value, duration, extent, and label. It also extracts the sample rate, start time, and end time from the file's metadata.

Parameters:

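This method takes no parameters. The annotation file is read from the `Annotations` folder under `self.path`, using `self.annotation_file_name` with an `.svl` extension.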

Returns:

Type	Description
`tuple`	A tuple containing: a pd.DataFrame with columns 'frame' (the frame number from the annotation), 'value' (the value associated with the annotation), 'duration' (the duration of the annotation), 'extent' (the extent of the annotation), and 'label' (the label associated with the annotation); an int, the sample rate extracted from the .svl file; a str, the start time of the annotation in the .svl file; a str, the end time of the annotation in the .svl file.

Raises:

Type	Description
`Exception`	If the annotation file is not found or if it does not contain valid annotation information.

Source code in eso/utils/AnnotationReader.py

def get_annotation_information_testing(self):
    """
    Extract annotation information from a `.svl` XML file and return a DataFrame
    with frame, value, duration, extent, and label for each annotation.

    This method parses an XML annotation file (`.svl` format) to extract detailed
    annotation information such as frame number, value, duration, extent, and label.
    It also extracts the sample rate, start time, and end time from the file's metadata.

    Parameters
    ----------
    None

    Returns
    -------
    tuple
        A tuple containing:
        - pd.DataFrame: A DataFrame with columns:
            - 'frame': The frame number from the annotation.
            - 'value': The value associated with the annotation.
            - 'duration': The duration of the annotation.
            - 'extent': The extent of the annotation.
            - 'label': The label associated with the annotation.
        - int: The sample rate extracted from the `.svl` file.
        - str: The start time of the annotation in the `.svl` file.
        - str: The end time of the annotation in the `.svl` file.

    Raises
    ------
    Exception
        If the annotation file is not found or if it does not contain valid annotation information.
    """

    path = os.path.join(
            self.path, "Annotations", self.annotation_file_name + ".svl"
        )

    # Process the .svl xml file
    xmldoc = minidom.parse(path)
    itemlist = xmldoc.getElementsByTagName('point')
    idlist = xmldoc.getElementsByTagName('model')

    sampleRate = idlist.item(0).attributes['sampleRate'].value 
    start_m = idlist.item(0).attributes['start'].value
    end_m = idlist.item(0).attributes['end'].value


    values = []
    frames = []
    durations=[]
    extents=[]
    labels = []
    audio_file_name = ''

    if len(idlist) > 0:
        for s in idlist: 
            original_sample_rate = int(s.attributes["sampleRate"].value)

    if (len(itemlist) > 0):

        # Iterate over each annotation in the .svl file (annotation file)
        for s in itemlist:

            # Read the raw attributes of the annotation; frame and duration
            # are expressed in samples, not seconds
            frame = float(s.attributes['frame'].value)
            value = float(s.attributes['value'].value)
            duration = float(s.attributes['duration'].value)
            extent = float(s.attributes['extent'].value)
            label = str(s.attributes['label'].value)

            # Set the default confidence to 10 (i.e. high confidence that
            # the label is correct). Annotations that carry no confidence
            # value are treated like normal annotations and it is
            # assumed that the annotation is correct (by the annotator).
            label_confidence = 10

            # Check if a confidence has been assigned
            if ',' in label:

                # Extract the raw label
                label_string = label[:label.find(',')]

                # Extract confidence value
                label_confidence = int(label[label.find(',')+1:])

                # Set the label to the raw label
                label = label_string


            # If an annotation has a blank label then skip it
            # to avoid mislabelling data
            if label == '':
                continue

            # Only consider annotations labelled with full confidence.
            # 10 = very confident, 5 = medium, 1 = unsure. The confidence is
            # appended to the label after a comma, e.g. "SPECIES,10" or "SPECIES,5".
            if label_confidence == 10:

                frames.append(frame)
                values.append(value)
                durations.append(duration)
                extents.append(extent)
                labels.append(label)

    df_svl_gibbons = pd.DataFrame({'frame': frames, 'value':values ,'duration': durations,
                              'extent':extents,'label':labels})
    return df_svl_gibbons, sampleRate, start_m, end_m

dataframe_to_svl ¶

dataframe_to_svl(dataframe, sample_rate, start_m, end_m)

Convert a DataFrame of annotations to a .svl format XML string.

This method generates a .svl format XML string containing the annotations from a DataFrame. The generated XML includes metadata such as the sample rate, start time, end time, and annotation points (frame, value, duration, extent, and label).

Parameters:

Name	Type	Description	Default
`dataframe`	`DataFrame`	A DataFrame containing the annotation information. It must have the columns 'frame', 'value', 'duration', and 'label'. An 'extent' column may be present but is not read: every point is written with a fixed extent of 1500.	required
`sample_rate`	`int`	The sample rate of the audio associated with the annotations.	required
`start_m`	`str`	The start time (in seconds) of the annotation period.	required
`end_m`	`str`	The end time (in seconds) of the annotation period.	required

Returns:

Type	Description
`str`	A string containing the XML in `.svl` format, representing the annotations along with metadata.

Notes

The function generates an XML document that includes:
<model>: metadata about the annotation model, including sample rate, start time, and end time.
<dataset>: contains <point> elements that represent individual annotations.
<display>: defines the display settings for the annotation in the software.

Source code in eso/utils/AnnotationReader.py

def dataframe_to_svl(self, dataframe, sample_rate, start_m, end_m):
    """
    Convert a DataFrame of annotations to a `.svl` format XML string.

    This method generates a `.svl` format XML string containing the annotations
    from a DataFrame. The generated XML includes metadata such as the sample rate,
    start time, end time, and annotation points (frame, value, duration, extent, and label).

    Parameters
    ----------
    dataframe : pd.DataFrame
        A DataFrame containing the annotation information. The DataFrame should have 
        the following columns: 'frame', 'value', 'duration', 'extent', and 'label'.
    sample_rate : int
        The sample rate of the audio associated with the annotations.
    start_m : str
        The start time (in seconds) of the annotation period.
    end_m : str
        The end time (in seconds) of the annotation period.

    Returns
    -------
    str
        A string containing the XML in `.svl` format, representing the annotations
        along with metadata.

    Notes
    -----
    The function generates an XML document that includes:
    - `<model>`: metadata about the annotation model, including sample rate, start time, and end time.
    - `<dataset>`: contains `<point>` elements that represent individual annotations.
    - `<display>`: defines the display settings for the annotation in the software.
    """
    doc, tag, text = Doc().tagtext()
    doc.asis('<?xml version="1.0" encoding="UTF-8"?>')
    doc.asis('<!DOCTYPE sonic-visualiser>')

    with tag('sv'):
        with tag('data'):

            model_string = '<model id="10" name="" sampleRate="{}" start="{}" end="{}" type="sparse" dimensions="2" resolution="1" notifyOnAdd="true" dataset="9" subtype="box" minimum="600" maximum="{}" units="Hz" />'.format(sample_rate, 
                                                                    start_m,
                                                                    end_m,
                                                                    1000)
            doc.asis(model_string)

        with tag('dataset', id='9', dimensions='2'):

            # Read dataframe or other data structure and add the values here
            # These are added as "point" elements, for example:
            # '<point frame="15360" value="3136.87" duration="1724416" extent="2139.22" label="Cape Robin" />'
            for index, row in dataframe.iterrows():

                point  = '<point frame="{}" value="{}" duration="{}" extent="{}" label="{}" />'.format(
                    int(row['frame']), 
                    row['value'],
                    int(row['duration']),
                    1500,
                    row['label'])

                # add the point
                doc.asis(point)
        with tag('display'):

            display_string = '<layer id="2" type="boxes" name="Boxes" model="10"  verticalScale="0"  colourName="White" colour="#ffffff" darkBackground="true" />'
            doc.asis(display_string)

    result = indent(
        doc.getvalue(),
        indentation = ' '*2,
        newline = '\r\n'
    )

    return result
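
For comparison, the same element layout can be produced with the standard library. This is a sketch under stated assumptions, not the method's implementation: it uses `xml.etree.ElementTree` rather than yattag, the rows and the sample rate are made up, and the `<display>` section and XML declaration are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Made-up annotation rows using the column names the method reads.
rows = [
    {"frame": 15360, "value": 700.0, "duration": 48000, "label": "gibbon"},
    {"frame": 96000, "value": 650.0, "duration": 24000, "label": "gibbon"},
]

root = ET.Element("sv")
data = ET.SubElement(root, "data")
ET.SubElement(
    data, "model",
    id="10", sampleRate="4800", start="0", end="480000",
    type="sparse", dimensions="2", subtype="box", dataset="9",
)

# The dataset sits next to <data>, following the nesting in the listing above.
dataset = ET.SubElement(root, "dataset", id="9", dimensions="2")
for row in rows:
    ET.SubElement(
        dataset, "point",
        frame=str(int(row["frame"])),
        value=str(row["value"]),
        duration=str(int(row["duration"])),
        extent="1500",  # fixed extent, as in the method above
        label=row["label"],
    )

svl_body = ET.tostring(root, encoding="unicode")

# Read the string back to confirm it is well-formed XML.
points = ET.fromstring(svl_body).findall("./dataset/point")
```

One practical difference is that `ElementTree` escapes characters such as `&` and `"` in attribute values, whereas plain string formatting writes labels verbatim.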

`eso.utils.settings`¶

The typed configuration schema. The JSON passed to ESO(settings_path=...) is validated against these dataclasses. Each top-level section of the file maps to one class. Unknown fields raise a ValueError at load time.

For a narrative walk-through of every field with recommended values from the paper, see Configuration.
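
The loader itself is not listed on this page, so the following is only a sketch of how a section of the JSON could be mapped onto a dataclass while rejecting unknown fields. `AlgorithmSection` and `load_section` are hypothetical names, and the real validation in `eso/utils/settings.py` may work differently.

```python
import json
from dataclasses import asdict, dataclass, fields


@dataclass
class AlgorithmSection:
    """Hypothetical stand-in for one configuration section."""
    max_generations: int = 100


def load_section(cls, raw: dict):
    """Build a dataclass from a dict, rejecting keys it does not declare."""
    known = {f.name for f in fields(cls)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(f"Unknown fields for {cls.__name__}: {sorted(unknown)}")
    return cls(**raw)


settings = json.loads('{"algorithm": {"max_generations": 50}}')
algorithm = load_section(AlgorithmSection, settings["algorithm"])

try:
    load_section(AlgorithmSection, {"max_generation": 50})  # misspelt key
    rejected = False
except ValueError:
    rejected = True
```

Checking the keys before calling the constructor gives a `ValueError` with the offending names, instead of the `TypeError` that an unexpected keyword argument would otherwise produce.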

BaseConfig `dataclass` ¶

BaseConfig()

dict ¶

dict()

Source code in eso/utils/settings.py

def dict(self):
    return asdict(self)

AlgorithmConfig `dataclass` ¶

AlgorithmConfig(max_generations: int = 100)

Bases: BaseConfig

max_generations `class-attribute` `instance-attribute` ¶

max_generations: int = 100

GeneticOperatorConfig `dataclass` ¶

GeneticOperatorConfig(
    mutation_rate: float = 0.1,
    crossover_rate: float = 0.8,
    reproduction_rate: float = 0.1,
    mutation_height_range: int = 5,
    mutation_position_range: int = 20,
)

Bases: BaseConfig

mutation_rate `class-attribute` `instance-attribute` ¶

mutation_rate: float = 0.1

crossover_rate `class-attribute` `instance-attribute` ¶

crossover_rate: float = 0.8

reproduction_rate `class-attribute` `instance-attribute` ¶

reproduction_rate: float = 0.1

mutation_height_range `class-attribute` `instance-attribute` ¶

mutation_height_range: int = 5

mutation_position_range `class-attribute` `instance-attribute` ¶

mutation_position_range: int = 20
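
The three default rates sum to 1.0. One possible reading, which is an assumption and not confirmed by this page, is that they act as mutually exclusive probabilities for choosing the operator that produces the next child. A minimal sketch under that assumption, with `pick_operator` as a hypothetical helper:

```python
import random


def pick_operator(rng, mutation_rate=0.1, crossover_rate=0.8, reproduction_rate=0.1):
    """Draw one operator name with probability equal to its rate.

    Assumes the rates are mutually exclusive probabilities (not confirmed by eso).
    """
    total = mutation_rate + crossover_rate + reproduction_rate
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"rates must sum to 1.0, got {total}")
    names = ["mutation", "crossover", "reproduction"]
    weights = [mutation_rate, crossover_rate, reproduction_rate]
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded so the draws are repeatable
draws = [pick_operator(rng) for _ in range(1000)]
```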

SelectionOperatorConfig `dataclass` ¶

SelectionOperatorConfig(tournament_size: int = 10)

Bases: BaseConfig

tournament_size `class-attribute` `instance-attribute` ¶

tournament_size: int = 10
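
For orientation, a generic tournament selection can be sketched as below. The selection code in eso is not shown on this page, so `tournament_select` is a hypothetical helper, and sampling entrants without replacement is an assumption.

```python
import random


def tournament_select(fitnesses, tournament_size, rng):
    """Return the index of the fittest among `tournament_size` random entrants."""
    if not 1 <= tournament_size <= len(fitnesses):
        raise ValueError("tournament_size must be between 1 and the population size")
    # Entrants are drawn without replacement (an assumption of this sketch).
    entrants = rng.sample(range(len(fitnesses)), tournament_size)
    return max(entrants, key=lambda i: fitnesses[i])


fitnesses = [0.1, 0.9, 0.4, 0.3]
winner = tournament_select(fitnesses, tournament_size=4, rng=random.Random(1))
```

Under this sketch, a tournament as large as the population always returns the fittest individual. The defaults `tournament_size = 10` and `pop_size = 10` would be that case.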

DataConfig `dataclass` ¶

DataConfig(
    force_recreate_dataset: bool = False,
    keep_in_memory: bool = False,
    species_folder: str = "",
    train_size: float = 0.8,
    test_size: float = 0.2,
    reshuffle: bool = False,
    positive_class: str = "",
    negative_class: str = "",
)

Bases: BaseConfig

force_recreate_dataset `class-attribute` `instance-attribute` ¶

force_recreate_dataset: bool = False

keep_in_memory `class-attribute` `instance-attribute` ¶

keep_in_memory: bool = False

species_folder `class-attribute` `instance-attribute` ¶

species_folder: str = ''

train_size `class-attribute` `instance-attribute` ¶

train_size: float = 0.8

test_size `class-attribute` `instance-attribute` ¶

test_size: float = 0.2

reshuffle `class-attribute` `instance-attribute` ¶

reshuffle: bool = False

positive_class `class-attribute` `instance-attribute` ¶

positive_class: str = ''

negative_class `class-attribute` `instance-attribute` ¶

negative_class: str = ''

PreprocessingConfig `dataclass` ¶

PreprocessingConfig(
    sample_rate: int = 32000,
    lowpass_cutoff: int = 2000,
    downsample_rate: int = 4800,
    nyquist_rate: int = 2400,
    segment_duration: int = 4,
    nb_negative_class: int = 20,
    file_type: str = "svl",
    audio_extension: str = ".wav",
    n_fft: int = 1024,
    hop_length: int = 256,
    n_mels: int = 128,
    f_min: int = 4000,
    f_max: int = 9000,
)

Bases: BaseConfig

sample_rate `class-attribute` `instance-attribute` ¶

sample_rate: int = 32000

lowpass_cutoff `class-attribute` `instance-attribute` ¶

lowpass_cutoff: int = 2000

downsample_rate `class-attribute` `instance-attribute` ¶

downsample_rate: int = 4800

nyquist_rate `class-attribute` `instance-attribute` ¶

nyquist_rate: int = 2400

segment_duration `class-attribute` `instance-attribute` ¶

segment_duration: int = 4

nb_negative_class `class-attribute` `instance-attribute` ¶

nb_negative_class: int = 20

file_type `class-attribute` `instance-attribute` ¶

file_type: str = 'svl'

audio_extension `class-attribute` `instance-attribute` ¶

audio_extension: str = '.wav'

n_fft `class-attribute` `instance-attribute` ¶

n_fft: int = 1024

hop_length `class-attribute` `instance-attribute` ¶

hop_length: int = 256

n_mels `class-attribute` `instance-attribute` ¶

n_mels: int = 128

f_min `class-attribute` `instance-attribute` ¶

f_min: int = 4000

f_max `class-attribute` `instance-attribute` ¶

f_max: int = 9000
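
The defaults above determine the size of each spectrogram. The sketch below works through the arithmetic, assuming centred framing (the librosa default), for which the number of frames is `1 + n_samples // hop_length`. `spectrogram_shape` is a hypothetical helper, not part of eso.

```python
def spectrogram_shape(sample_rate, segment_duration, hop_length, n_mels):
    """Return (n_mels, n_frames) for one fixed-length segment.

    Assumes centred framing, where n_frames = 1 + n_samples // hop_length.
    """
    n_samples = sample_rate * segment_duration
    n_frames = 1 + n_samples // hop_length
    return n_mels, n_frames


# Unprocessed audio at the default sample_rate of 32000 Hz.
raw_shape = spectrogram_shape(32000, 4, 256, 128)   # (128, 501)

# Downsampled audio at the default downsample_rate of 4800 Hz.
down_shape = spectrogram_shape(4800, 4, 256, 128)   # (128, 76)

# The default nyquist_rate is half of the default downsample_rate.
nyquist_matches = 4800 / 2 == 2400
```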

PopulationConfig `dataclass` ¶

PopulationConfig(pop_size: int = 10)

Bases: BaseConfig

pop_size `class-attribute` `instance-attribute` ¶

pop_size: int = 10

GeneConfig `dataclass` ¶

GeneConfig(
    min_position: int = 0,
    max_position: int = -1,
    min_height: int = 4,
    max_height: int = 16,
    band_position: int = None,
    band_height: int = None,
    spec_height: int = None,
    minimum_gene_height: int = None,
)

Bases: BaseConfig

min_position `class-attribute` `instance-attribute` ¶

min_position: int = 0

max_position `class-attribute` `instance-attribute` ¶

max_position: int = -1

min_height `class-attribute` `instance-attribute` ¶

min_height: int = 4

max_height `class-attribute` `instance-attribute` ¶

max_height: int = 16

band_position `class-attribute` `instance-attribute` ¶

band_position: int = None

band_height `class-attribute` `instance-attribute` ¶

band_height: int = None

spec_height `class-attribute` `instance-attribute` ¶

spec_height: int = None

minimum_gene_height `class-attribute` `instance-attribute` ¶

minimum_gene_height: int = None

ChromosomeConfig `dataclass` ¶

ChromosomeConfig(
    num_genes: int = None,
    min_num_genes: int = 3,
    max_num_genes: int = 10,
    lambda_1: float = 0.5,
    lambda_2: float = 0.5,
    stack: bool = False,
    baseline_parameters: float = None,
    baseline_metric: int = None,
)

Bases: BaseConfig

num_genes `class-attribute` `instance-attribute` ¶

num_genes: int = None

min_num_genes `class-attribute` `instance-attribute` ¶

min_num_genes: int = 3

max_num_genes `class-attribute` `instance-attribute` ¶

max_num_genes: int = 10

lambda_1 `class-attribute` `instance-attribute` ¶

lambda_1: float = 0.5

lambda_2 `class-attribute` `instance-attribute` ¶

lambda_2: float = 0.5

stack `class-attribute` `instance-attribute` ¶

stack: bool = False

baseline_parameters `class-attribute` `instance-attribute` ¶

baseline_parameters: float = None

baseline_metric `class-attribute` `instance-attribute` ¶

baseline_metric: int = None

ModelConfig `dataclass` ¶

ModelConfig(
    optimizer_name: str = "adam",
    loss_function_name: str = "cross_entropy",
    num_epochs: int = 1,
    batch_size: int = 128,
    learning_rate: float = 0.001,
    shuffle: bool = True,
    metric: str = "f1",
)

Bases: BaseConfig

optimizer_name `class-attribute` `instance-attribute` ¶

optimizer_name: str = 'adam'

loss_function_name `class-attribute` `instance-attribute` ¶

loss_function_name: str = 'cross_entropy'

num_epochs `class-attribute` `instance-attribute` ¶

num_epochs: int = 1

batch_size `class-attribute` `instance-attribute` ¶

batch_size: int = 128

learning_rate `class-attribute` `instance-attribute` ¶

learning_rate: float = 0.001

shuffle `class-attribute` `instance-attribute` ¶

shuffle: bool = True

metric `class-attribute` `instance-attribute` ¶

metric: str = 'f1'

ArchitectureConfig `dataclass` ¶

ArchitectureConfig(
    conv_layers: int = 1,
    conv_filters: int = 8,
    dropout_rate: float = 0.5,
    conv_kernel: int = 8,
    max_pooling_size: int = 4,
    fc_units: int = 32,
    fc_layers: int = 2,
    conv_padding: str = None,
    stride_maxpool: int = None,
)

Bases: BaseConfig

conv_layers `class-attribute` `instance-attribute` ¶

conv_layers: int = 1

conv_filters `class-attribute` `instance-attribute` ¶

conv_filters: int = 8

dropout_rate `class-attribute` `instance-attribute` ¶

dropout_rate: float = 0.5

conv_kernel `class-attribute` `instance-attribute` ¶

conv_kernel: int = 8

max_pooling_size `class-attribute` `instance-attribute` ¶

max_pooling_size: int = 4

fc_units `class-attribute` `instance-attribute` ¶

fc_units: int = 32

fc_layers `class-attribute` `instance-attribute` ¶

fc_layers: int = 2

conv_padding `class-attribute` `instance-attribute` ¶

conv_padding: str = None

stride_maxpool `class-attribute` `instance-attribute` ¶

stride_maxpool: int = None

Config `dataclass` ¶

Config(
    _input: str = None,
    algorithm: AlgorithmConfig = AlgorithmConfig(),
    genetic_operator: GeneticOperatorConfig = GeneticOperatorConfig(),
    selection_operator: SelectionOperatorConfig = SelectionOperatorConfig(),
    data: DataConfig = DataConfig(),
    preprocessing: PreprocessingConfig = PreprocessingConfig(),
    population: PopulationConfig = PopulationConfig(),
    gene: GeneConfig = GeneConfig(),
    chromosome: ChromosomeConfig = ChromosomeConfig(),
    model: ModelConfig = ModelConfig(),
    cnn_architecture: ArchitectureConfig = ArchitectureConfig(),
)

Bases: BaseConfig

algorithm `class-attribute` `instance-attribute` ¶

algorithm: AlgorithmConfig = field(default_factory=AlgorithmConfig)

genetic_operator `class-attribute` `instance-attribute` ¶

genetic_operator: GeneticOperatorConfig = field(default_factory=GeneticOperatorConfig)

selection_operator `class-attribute` `instance-attribute` ¶

selection_operator: SelectionOperatorConfig = field(
    default_factory=SelectionOperatorConfig
)

data `class-attribute` `instance-attribute` ¶

data: DataConfig = field(default_factory=DataConfig)

preprocessing `class-attribute` `instance-attribute` ¶

preprocessing: PreprocessingConfig = field(default_factory=PreprocessingConfig)

population `class-attribute` `instance-attribute` ¶

population: PopulationConfig = field(default_factory=PopulationConfig)

gene `class-attribute` `instance-attribute` ¶

gene: GeneConfig = field(default_factory=GeneConfig)

chromosome `class-attribute` `instance-attribute` ¶

chromosome: ChromosomeConfig = field(default_factory=ChromosomeConfig)

model `class-attribute` `instance-attribute` ¶

model: ModelConfig = field(default_factory=ModelConfig)

cnn_architecture `class-attribute` `instance-attribute` ¶

cnn_architecture: ArchitectureConfig = field(default_factory=ArchitectureConfig)

get_params ¶

get_params()

Source code in eso/utils/settings.py

def get_params(self):
    params = {}
    for key, value in asdict(self).items():
        if key == "_input":
            # params["settings"] = value
            continue
        for sub_key, sub_value in value.items():
            params[f"{key}_{sub_key}"] = sub_value
    return params
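
The flattening performed by `get_params` can be reproduced on a reduced schema. `ModelSection`, `PopulationSection`, and `SettingsSketch` below are hypothetical stand-ins for the real dataclasses; the loop body is the same as in the listing above.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class ModelSection:  # stand-in for ModelConfig
    num_epochs: int = 1
    metric: str = "f1"


@dataclass
class PopulationSection:  # stand-in for PopulationConfig
    pop_size: int = 10


@dataclass
class SettingsSketch:  # stand-in for Config
    _input: str = None
    model: ModelSection = field(default_factory=ModelSection)
    population: PopulationSection = field(default_factory=PopulationSection)


def flatten(settings):
    """Join section and field names with an underscore, skipping `_input`."""
    params = {}
    for key, value in asdict(settings).items():
        if key == "_input":
            continue
        for sub_key, sub_value in value.items():
            params[f"{key}_{sub_key}"] = sub_value
    return params


params = flatten(SettingsSketch(_input="settings.json"))
# {'model_num_epochs': 1, 'model_metric': 'f1', 'population_pop_size': 10}
```

A flat dictionary of this kind is convenient for logging hyperparameters, since each key names both the section and the field.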

`eso.utils.Evaluation`¶

Reproduces the evaluation protocol described in the paper. The class slides a window over each test audio file, applies the model (baseline or ESO chromosome) per window, groups consecutive positive predictions into calling bouts, and computes true positives, false positives, false negatives, and true negatives using a 25 percent overlap rule (10 percent for the Thyolo Alethe dataset). It also measures FLOPs via fvcore, RAM usage via psutil, and energy via CodeCarbon.
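
The grouping and overlap steps can be sketched as follows. This is not the code in `Evaluation.py`: `group_bouts` and `overlaps_enough` are hypothetical helpers, the windows are assumed to advance in non-overlapping steps of `window_seconds`, and the overlap is assumed to be measured as a fraction of the annotated bout.

```python
def group_bouts(predictions, window_seconds=1.0):
    """Merge runs of consecutive positive windows into (start, end) bouts in seconds."""
    bouts = []
    start = None
    for i, positive in enumerate(predictions):
        if positive and start is None:
            start = i  # a new bout begins
        elif not positive and start is not None:
            bouts.append((start * window_seconds, i * window_seconds))
            start = None
    if start is not None:  # bout still open at the end of the file
        bouts.append((start * window_seconds, len(predictions) * window_seconds))
    return bouts


def overlaps_enough(predicted, annotated, min_fraction=0.25):
    """True when the shared time covers at least `min_fraction` of the annotated bout."""
    shared = min(predicted[1], annotated[1]) - max(predicted[0], annotated[0])
    length = annotated[1] - annotated[0]
    return length > 0 and shared / length >= min_fraction


bouts = group_bouts([0, 1, 1, 0, 0, 1])             # [(1.0, 3.0), (5.0, 6.0)]
is_hit = overlaps_enough(bouts[0], (2.0, 6.0))      # 1 s shared out of 4 s
is_miss = overlaps_enough(bouts[0], (2.5, 6.0))     # 0.5 s shared out of 3.5 s
```

Passing `min_fraction=0.10` would correspond to the looser rule quoted for the Thyolo Alethe dataset.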

Preprocessing ¶

Preprocessing(
    species_folder: str,
    sample_rate: int,
    lowpass_cutoff: int,
    downsample_rate: int,
    nyquist_rate: int,
    segment_duration: int,
    positive_class: str,
    negative_class: str,
    nb_negative_class: int,
    n_fft: int,
    hop_length: int,
    n_mels: int,
    f_min: int,
    f_max: int,
    file_type: str,
    audio_extension: str,
    apply_preprocessing: bool = True,
)

Initialize the Preprocessing object.

Parameters:

Name	Type	Description	Default
`species_folder`	`str`	Path to the species folder containing audio and annotation data.	required
`sample_rate`	`int`	The sample rate for unprocessed audio files.	required
`lowpass_cutoff`	`int`	The cutoff frequency for the low-pass filter.	required
`downsample_rate`	`int`	The rate at which to downsample the audio.	required
`nyquist_rate`	`int`	The Nyquist rate, half of the sampling rate.	required
`segment_duration`	`int`	Duration of each audio segment in seconds.	required
`positive_class`	`str`	Label representing the positive class in the dataset.	required
`negative_class`	`str`	Label representing the negative class in the dataset.	required
`nb_negative_class`	`int`	Number of negative class samples.	required
`n_fft`	`int`	The length of the FFT window for spectrograms.	required
`hop_length`	`int`	The hop length for generating spectrograms.	required
`n_mels`	`int`	The number of mel bands to use in the spectrogram.	required
`f_min`	`int`	The minimum frequency for the mel filter bank.	required
`f_max`	`int`	The maximum frequency for the mel filter bank.	required
`file_type`	`str`	The type of annotation files to process (e.g., 'svl').	required
`audio_extension`	`str`	The file extension for the audio files (e.g., '.wav').	required
`apply_preprocessing`	`bool`	Whether to apply preprocessing steps like filtering and downsampling. Default is True.	`True`

Returns:

Type	Description
`None`

Source code in eso/utils/preprocessing.py

def __init__(
    self,
    species_folder : str,
    sample_rate: int,
    lowpass_cutoff : int,
    downsample_rate : int,
    nyquist_rate : int,
    segment_duration : int,
    positive_class : str,
    negative_class : str,
    nb_negative_class : int,
    n_fft : int,
    hop_length : int,
    n_mels : int,
    f_min : int,
    f_max : int,
    file_type : str,
    audio_extension : str,
    apply_preprocessing: bool=True,

) -> None:
    """
    Initialize the Preprocessing object.

    Parameters
    ----------
    species_folder : str
        Path to the species folder containing audio and annotation data.
    sample_rate : int
        The sample rate for unprocessed audio files.
    lowpass_cutoff : int
        The cutoff frequency for the low-pass filter.
    downsample_rate : int
        The rate at which to downsample the audio.
    nyquist_rate : int
        The Nyquist rate, half of the sampling rate.
    segment_duration : int
        Duration of each audio segment in seconds.
    positive_class : str
        Label representing the positive class in the dataset.
    negative_class : str
        Label representing the negative class in the dataset.
    nb_negative_class : int
        Number of negative class samples.
    n_fft : int
        The length of the FFT window for spectrograms.
    hop_length : int
        The hop length for generating spectrograms.
    n_mels : int
        The number of mel bands to use in the spectrogram.
    f_min : int
        The minimum frequency for the mel filter bank.
    f_max : int
        The maximum frequency for the mel filter bank.
    file_type : str
        The type of annotation files to process (e.g., 'svl').
    audio_extension : str
        The file extension for the audio files (e.g., '.wav').
    apply_preprocessing : bool, optional
        Whether to apply preprocessing steps like filtering and downsampling. Default is True.

    Returns
    -------
    None
    """
    self.sample_rate_unpreprocessed=sample_rate
    self.species_folder = species_folder
    self.lowpass_cutoff = lowpass_cutoff
    self.downsample_rate = downsample_rate
    self.nyquist_rate = nyquist_rate
    self.segment_duration = segment_duration
    self.positive_class = positive_class
    self.negative_class = negative_class
    self.nb_negative_class = nb_negative_class
    self.audio_path = Path(self.species_folder, "Audio")
    self.annotations_path = Path(self.species_folder, "Annotations")
    self.saved_data_path = Path(self.species_folder, "SavedData")
    self.training_files = Path(self.species_folder, "DataFiles", "TrainingFiles.txt")      
    self.n_mels = n_mels
    self.f_min = f_min
    self.f_max = f_max
    self.file_type = file_type
    self.audio_extension = audio_extension
    self.apply_preprocessing = apply_preprocessing
    self.n_fft = n_fft
    self.hop_length = hop_length

sample_rate_unpreprocessed `instance-attribute` ¶

sample_rate_unpreprocessed = sample_rate

species_folder `instance-attribute` ¶

species_folder = species_folder

lowpass_cutoff `instance-attribute` ¶

lowpass_cutoff = lowpass_cutoff

downsample_rate `instance-attribute` ¶

downsample_rate = downsample_rate

nyquist_rate `instance-attribute` ¶

nyquist_rate = nyquist_rate

segment_duration `instance-attribute` ¶

segment_duration = segment_duration

positive_class `instance-attribute` ¶

positive_class = positive_class

negative_class `instance-attribute` ¶

negative_class = negative_class

nb_negative_class `instance-attribute` ¶

nb_negative_class = nb_negative_class

audio_path `instance-attribute` ¶

audio_path = Path(species_folder, 'Audio')

annotations_path `instance-attribute` ¶

annotations_path = Path(species_folder, 'Annotations')

saved_data_path `instance-attribute` ¶

saved_data_path = Path(species_folder, 'SavedData')

training_files `instance-attribute` ¶

training_files = Path(species_folder, 'DataFiles', 'TrainingFiles.txt')

n_mels `instance-attribute` ¶

n_mels = n_mels

f_min `instance-attribute` ¶

f_min = f_min

f_max `instance-attribute` ¶

f_max = f_max

file_type `instance-attribute` ¶

file_type = file_type

audio_extension `instance-attribute` ¶

audio_extension = audio_extension

apply_preprocessing `instance-attribute` ¶

apply_preprocessing = apply_preprocessing

n_fft `instance-attribute` ¶

n_fft = n_fft

hop_length `instance-attribute` ¶

hop_length = hop_length
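
The path attributes above imply a fixed layout under `species_folder`. A small sketch for checking that layout before running the pipeline follows. `species_layout` and `missing_entries` are hypothetical helpers, and whether eso creates any of these folders itself is not stated on this page.

```python
import tempfile
from pathlib import Path


def species_layout(species_folder):
    """Paths derived from the species folder, mirroring the attributes above."""
    root = Path(species_folder)
    return {
        "audio_path": root / "Audio",
        "annotations_path": root / "Annotations",
        "saved_data_path": root / "SavedData",
        "training_files": root / "DataFiles" / "TrainingFiles.txt",
    }


def missing_entries(species_folder):
    """Names of the layout entries that do not exist on disk."""
    layout = species_layout(species_folder)
    return [name for name, path in layout.items() if not path.exists()]


# Demonstration on a temporary folder that only contains Audio/.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "Audio").mkdir()
    missing = missing_entries(tmp)
```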

read_audio_file ¶

read_audio_file(file_name)

Load an audio file and return its waveform and sample rate.

Parameters:

Name	Type	Description	Default
`file_name`	`str`	Path to the audio file, including the extension (e.g., "Audio/audio1.wav"). The path is used as given and is not joined with `self.audio_path`.	required

Returns:

Type	Description
`tuple`	A tuple `(waveform, sample_rate)`: the audio waveform as an `np.ndarray` of amplitude values, and the sampling rate of the audio file as an `int`.

Source code in eso/utils/preprocessing.py

def read_audio_file(self, file_name):
    """
    Load an audio file and return its waveform and sample rate.

    Parameters
    ----------
    file_name : str
        Path to the audio file, including the extension (e.g., "Audio/audio1.wav").

    Returns
    -------
    tuple
        A tuple containing:
        - np.ndarray: The audio waveform (amplitude values).
        - int: The sampling rate of the audio file.
    """
    # Get the path to the file
    audio_folder = Path(file_name)

    # Read the amplitudes and sample rate
    audio_amps, audio_sample_rate = librosa.load(audio_folder, sr=None)

    return audio_amps, audio_sample_rate

butter_lowpass_filter ¶

butter_lowpass_filter(data, cutoff_freq, nyq_freq, order=4)

Apply a Butterworth low-pass filter to the input signal.

This method filters the input signal using a zero-phase Butterworth low-pass filter designed with the specified cutoff and Nyquist frequencies.

Parameters:

Name	Type	Description	Default
`data`	`ndarray`	The input signal (1D array) to be filtered.	required
`cutoff_freq`	`float`	The cutoff frequency of the low-pass filter (in Hz).	required
`nyq_freq`	`float`	The Nyquist frequency (typically half the sampling rate).	required
`order`	`int`	The order of the Butterworth filter. Default is 4.	`4`

Returns:

Type	Description
`ndarray`	The filtered signal with the same shape as the input.

Source code in eso/utils/preprocessing.py

def butter_lowpass_filter(self, data, cutoff_freq, nyq_freq, order=4):
    """
    Apply a Butterworth low-pass filter to the input signal.

    This method filters the input signal using a zero-phase Butterworth low-pass
    filter designed with the specified cutoff and Nyquist frequencies.

    Parameters
    ----------
    data : np.ndarray
        The input signal (1D array) to be filtered.
    cutoff_freq : float
        The cutoff frequency of the low-pass filter (in Hz).
    nyq_freq : float
        The Nyquist frequency (typically half the sampling rate).
    order : int, optional
        The order of the Butterworth filter. Default is 4.

    Returns
    -------
    np.ndarray
        The filtered signal with the same shape as the input.
    """ 
    # Source: https://github.com/guillaume-chevalier/filtering-stft-and-laplace-transform
    b, a = self._butter_lowpass(cutoff_freq, nyq_freq, order=order)
    y = signal.filtfilt(b, a, data)
    return y
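
The private `_butter_lowpass` helper is not shown on this page. Butterworth designs conventionally take the cutoff as a fraction of the Nyquist frequency, so the value passed as `nyq_freq` should be half the sample rate of the signal being filtered. The sketch below shows that arithmetic only; `normalised_cutoff` is a hypothetical helper, not the eso implementation.

```python
def normalised_cutoff(cutoff_freq, sample_rate):
    """Cutoff as a fraction of the Nyquist frequency of the filtered signal."""
    nyq_freq = sample_rate / 2
    if not 0 < cutoff_freq < nyq_freq:
        raise ValueError(
            f"cutoff {cutoff_freq} Hz must lie between 0 and the Nyquist "
            f"frequency {nyq_freq} Hz"
        )
    return cutoff_freq / nyq_freq


# A 2000 Hz cutoff on a 32000 Hz recording.
at_32k = normalised_cutoff(2000, 32000)   # 0.125

# The same cutoff relative to a 4800 Hz signal is much closer to Nyquist.
at_4800 = normalised_cutoff(2000, 4800)
```

The same cutoff in Hz gives very different normalised values depending on which sample rate the Nyquist frequency is taken from, so it is worth checking that the two arguments refer to the same signal.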

downsample_file ¶

downsample_file(amplitudes, original_sr, new_sample_rate)

Downsample an audio waveform to a specified sample rate.

This function resamples the input audio from the original sample rate to a new, lower sample rate using the 'kaiser_fast' resampling method.

Parameters:

Name	Type	Description	Default
`amplitudes`	`ndarray`	The raw audio waveform (1D NumPy array of amplitude values).	required
`original_sr`	`int`	The original sampling rate of the audio signal (in Hz).	required
`new_sample_rate`	`int`	The desired sampling rate to downsample the audio to (in Hz).	required

Returns:

Type	Description
`tuple`	A tuple `(waveform, sample_rate)`: the downsampled audio waveform as an `np.ndarray`, and the new sampling rate as an `int` (same as `new_sample_rate`).

Source code in eso/utils/preprocessing.py

def downsample_file(self, amplitudes, original_sr, new_sample_rate):
    """
    Downsample an audio waveform to a specified sample rate.

    This function resamples the input audio from the original sample rate
    to a new, lower sample rate using the 'kaiser_fast' resampling method.

    Parameters
    ----------
    amplitudes : np.ndarray
        The raw audio waveform (1D NumPy array of amplitude values).
    original_sr : int
        The original sampling rate of the audio signal (in Hz).
    new_sample_rate : int
        The desired sampling rate to downsample the audio to (in Hz).

    Returns
    -------
    tuple
        A tuple containing:
        - np.ndarray: The downsampled audio waveform.
        - int: The new sampling rate (same as `new_sample_rate`).
    """
    return (
        librosa.resample(
            amplitudes,
            orig_sr=original_sr,
            target_sr=new_sample_rate,
            res_type="kaiser_fast",
        ),
        new_sample_rate,
    )
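
The effect of downsampling on the number of samples can be estimated with simple arithmetic. The sketch below assumes the resampler rounds the output length up; `resampled_length` is a hypothetical helper, not part of eso.

```python
import math


def resampled_length(n_samples, original_sr, new_sample_rate):
    """Approximate number of samples after resampling (rounded up)."""
    return math.ceil(n_samples * new_sample_rate / original_sr)


# A 4-second segment recorded at 32000 Hz and downsampled to 4800 Hz.
n_before = 4 * 32000                                  # 128000 samples
n_after = resampled_length(n_before, 32000, 4800)     # 19200 samples
```

The duration is unchanged: 19200 samples at 4800 Hz is still 4 seconds.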

convert_single_to_image ¶

convert_single_to_image(audio, sample_rate)

Convert an audio waveform into a normalized mel-spectrogram image.

This function computes the mel-spectrogram of a raw audio signal, converts it to decibels, and scales the values to the range 0 to 1. If preprocessing is enabled (`apply_preprocessing=True`), the configured `f_min` and `f_max` bound the mel filter bank; otherwise the fixed bounds 0 Hz and 5000 Hz are used.

Parameters:

Name	Type	Description	Default
`audio`	`ndarray`	The raw audio waveform (1D NumPy array of amplitude values).	required
`sample_rate`	`int`	The sampling rate of the audio signal (in Hz).	required

Returns:

Type	Description
`ndarray`	A 2D NumPy array representing the normalized mel-spectrogram image.

Source code in eso/utils/preprocessing.py

def convert_single_to_image(self, audio, sample_rate):
    """
    Convert an audio waveform into a normalized mel-spectrogram image.

    This function computes the mel-spectrogram from a raw audio signal and 
    applies normalization to scale the spectrogram values between 0 and 1.
    If preprocessing is enabled, user-defined frequency limits are used;
    otherwise, default frequency bounds are applied.

    Parameters
    ----------
    audio : np.ndarray
        The raw audio waveform (1D NumPy array of amplitude values).
    sample_rate : int
        The sampling rate of the audio signal (in Hz).

    Returns
    -------
    np.ndarray
        A 2D NumPy array representing the normalized mel-spectrogram image.
    """
    if not self.apply_preprocessing:
        f_min = 0
        f_max = 5000
    else:
        f_min = self.f_min
        f_max = self.f_max

    S = librosa.feature.melspectrogram(
        y=audio,
        sr=sample_rate,
        n_fft=self.n_fft,
        hop_length=self.hop_length,
        n_mels=self.n_mels,
        fmin=f_min,
        fmax=f_max,
    )


    image = librosa.core.power_to_db(S)
    image_np = np.asmatrix(image)
    image_np_scaled_temp = image_np - np.min(image_np)
    image_np_scaled = image_np_scaled_temp / np.max(image_np_scaled_temp)
    mean = image.flatten().mean()
    std = image.flatten().std()
    eps = 1e-8
    spec_norm = (image - mean) / (std + eps)
    spec_min, spec_max = spec_norm.min(), spec_norm.max()
    spec_scaled = (spec_norm - spec_min) / (spec_max - spec_min)
    S1 = spec_scaled

    return S1
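
The value returned above is the standardised spectrogram rescaled to the range 0 to 1. Because min-max scaling is unaffected by a prior shift and positive rescale, the result matches min-max scaling of the decibel values directly. The sketch below illustrates this on a flat list of values and adds a guard for constant input, which the listing does not have. `standardise` and `min_max_scale` are hypothetical helpers using plain Python lists; real code would operate on NumPy arrays.

```python
def standardise(values, eps=1e-8):
    """Subtract the mean and divide by the standard deviation (plus eps)."""
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return [(v - mean) / (std + eps) for v in values]


def min_max_scale(values, eps=1e-8):
    """Scale values to [0, 1]; constant input maps to zeros instead of dividing by zero."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span < eps:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


db_values = [-80.0, -40.0, -20.0, 0.0]          # illustrative decibel values
scaled = min_max_scale(standardise(db_values))  # the two-step route of the listing
direct = min_max_scale(db_values)               # min-max scaling alone
silent = min_max_scale([3.0, 3.0, 3.0])         # constant input, e.g. silence
```

Both routes give approximately `[0.0, 0.5, 0.75, 1.0]` for these values.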

save_data_to_pickle ¶

save_data_to_pickle(X, Y)

Save the input data and labels to pickle files.

This function saves the spectrogram data (X) and their corresponding labels (Y) into separate pickle files (X.pkl and Y.pkl) in the directory specified by self.saved_data_path.

Parameters:

Name	Type	Description	Default
`X`	`any`	The data to be saved (e.g., spectrograms). Must be pickle-serializable.	required
`Y`	`any`	The corresponding labels for `X`. Must also be pickle-serializable.	required

Returns:

Type	Description
`None`

Source code in eso/utils/preprocessing.py

def save_data_to_pickle(self, X, Y):
    """
    Save the input data and labels to pickle files.

    This function saves the spectrogram data (`X`) and their corresponding
    labels (`Y`) into separate pickle files (`X.pkl` and `Y.pkl`) in the directory 
    specified by `self.saved_data_path`.

    Parameters
    ----------
    X : any
        The data to be saved (e.g., spectrograms). Must be pickle-serializable.
    Y : any
        The corresponding labels for `X`. Must also be pickle-serializable.

    Returns
    -------
    None
    """
    outfile = open(Path(self.saved_data_path, "X.pkl"), "wb")
    pickle.dump(X, outfile, protocol=4)
    outfile.close()

    outfile = open(Path(self.saved_data_path, "Y.pkl"), "wb")
    pickle.dump(Y, outfile, protocol=4)
    outfile.close()

load_data_from_pickle ¶

load_data_from_pickle()

Load the data and labels from pickle files.

This function loads spectrogram data (X) and their corresponding labels (Y) from pickle files (X.pkl and Y.pkl) located in the directory specified by self.saved_data_path.

Returns:

Name	Type	Description
`X`	`any`	The loaded data (e.g., spectrograms), as previously saved using `save_data_to_pickle`.
`Y`	`any`	The corresponding labels for `X`.

Source code in eso/utils/preprocessing.py

def load_data_from_pickle(self):
    """
    Load the data and labels from pickle files.

    This function loads spectrogram data (`X`) and their corresponding
    labels (`Y`) from pickle files (`X.pkl` and `Y.pkl`) located in the directory 
    specified by `self.saved_data_path`.

    Returns
    -------
    X : any
        The loaded data (e.g., spectrograms), as previously saved using `save_data_to_pickle`.
    Y : any
        The corresponding labels for `X`.
    """
    infile = open(Path(self.saved_data_path, "X.pkl"), "rb")
    X = pickle.load(infile)
    infile.close()

    infile = open(Path(self.saved_data_path, "Y.pkl"), "rb")
    Y = pickle.load(infile)
    infile.close()

    return X, Y
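
The save and load methods above open and close each file by hand. An equivalent round trip using context managers, which close the files even if pickling raises, can be sketched as below. `save_pair` and `load_pair` are hypothetical stand-ins that take the folder as an argument instead of reading `self.saved_data_path`.

```python
import pickle
import tempfile
from pathlib import Path


def save_pair(folder, X, Y):
    """Write X.pkl and Y.pkl into `folder` using pickle protocol 4."""
    for name, obj in (("X.pkl", X), ("Y.pkl", Y)):
        with open(Path(folder, name), "wb") as outfile:
            pickle.dump(obj, outfile, protocol=4)


def load_pair(folder):
    """Read X.pkl and Y.pkl back from `folder`."""
    loaded = []
    for name in ("X.pkl", "Y.pkl"):
        with open(Path(folder, name), "rb") as infile:
            loaded.append(pickle.load(infile))
    return tuple(loaded)


# Round trip through a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    save_pair(tmp, [[0.0, 1.0]], [1])
    X, Y = load_pair(tmp)
```

Unpickling can execute arbitrary code, so only files written by a trusted run should be loaded.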

create_dataset ¶

create_dataset(annotation_folder, sufix_file, file_names=None, augmentation=False)

Create the dataset of audio segments and labels for machine learning.

This function reads audio files and their corresponding annotation files, applies preprocessing (optional low-pass filtering and downsampling), extracts labeled audio segments, and optionally augments the data to balance class distributions.

Parameters:

Name	Type	Description	Default
`annotation_folder`	`str or Path`	Path to the folder containing the `.svl` annotation files.	required
`sufix_file`	`str`	Suffix to append to the annotation filenames for retrieval.	required
`file_names`	`str or Path`	Path to a CSV file containing a list of filenames to process (without extensions). If None, uses `self.training_files`.	`None`
`augmentation`	`bool`	Whether to perform data augmentation to balance the dataset.	`False`

Returns:

Type	Description
`tuple of np.ndarray`	A pair `(X_calls, Y_calls)`. `X_calls` is an ndarray of shape (n_samples, ...) holding the preprocessed and optionally augmented audio segments, converted into spectrogram images. `Y_calls` is an ndarray of shape (n_samples,) holding the class label of each segment (binary or multi-class).

Raises:

Type	Description
`ValueError`	If the `file_names` CSV is missing or empty.

Notes

Annotations are expected in .svl format, created with Sonic Visualiser, using the "boxes area" annotation layer.
Each annotation provides a labeled time segment which is then transformed into a training example.
Augmentation methods include time shifting, noise addition, and mixing with negative samples to improve dataset balance.

Source code in eso/utils/preprocessing.py

def create_dataset(self, annotation_folder, sufix_file, file_names=None, augmentation=False):
    """
    Create the dataset of audio segments and labels for machine learning.

    This function reads audio files and their corresponding annotation files,
    applies preprocessing (optional low-pass filtering and downsampling),
    extracts labeled audio segments, and optionally augments the data to
    balance class distributions.

    Parameters
    ----------
    annotation_folder : str or Path
        Path to the folder containing the `.svl` annotation files.
    sufix_file : str
        Suffix to append to the annotation filenames for retrieval.
    file_names : str or Path, optional
        Path to a CSV file containing a list of filenames to process (without extensions).
        If None, uses `self.training_files`.
    augmentation : bool, optional
        Whether to perform data augmentation to balance the dataset.

    Returns
    -------
    tuple of np.ndarray
        - `X_calls` : ndarray of shape (n_samples, ...)
            Array of preprocessed and optionally augmented audio segments,
            typically converted into spectrogram images.
        - `Y_calls` : ndarray of shape (n_samples,)
            Corresponding class labels for each segment (binary or multi-class).

    Raises
    ------
    ValueError
        If the `file_names` CSV is missing or empty.

    Notes
    -----
    - Annotations are expected in `.svl` format, created with Sonic Visualiser,
    using the "boxes area" annotation layer.
    - Each annotation provides a labeled time segment which is then transformed
    into a training example.
    - Augmentation methods include time shifting, noise addition, and mixing
    with negative samples to improve dataset balance.
    """

    if file_names is None:
        file_names = self.training_files
    # Keep track of how many calls were found in the annotation files
    total_calls = 0

    # Initialise lists to store the X and Y values
    X_calls = []
    Y_calls = []

    # Read all names of the files
    try:
        files = pd.read_csv(file_names, header=None)
    except Exception:
        raise ValueError(
            f"Error loading filenames from {file_names}. Check if File is not empty."
        )
    # Iterate over each annotation file
    for file in files.values:
        file = file[0]

        file_name_no_extension = file

        reader = AnnotationReader(
            self.species_folder,
            file,
            self.file_type,
            self.audio_extension,
            self.positive_class,
        )
        # Check if the audio file exists before processing
        if str(
            Path(self.audio_path, file_name_no_extension + self.audio_extension)
        ) in glob(str(self.audio_path / f"*{self.audio_extension}")):

            # Read audio file
            audio_amps, original_sample_rate = self.read_audio_file(
                str(
                    Path(
                        self.audio_path,
                        file_name_no_extension + self.audio_extension,
                    )
                )
            )

            if self.apply_preprocessing:
                # Low pass filter
                filtered = self.butter_lowpass_filter(
                    audio_amps, self.lowpass_cutoff, self.nyquist_rate
                )
                # Downsample
                amplitudes, sample_rate = self.downsample_file(
                    filtered, original_sample_rate, self.downsample_rate
                )
                del filtered

            else:

                if original_sample_rate != self.sample_rate_unpreprocessed:
                    amplitudes, sample_rate = self.downsample_file(
                        audio_amps, original_sample_rate, self.sample_rate_unpreprocessed
                    )
                else:
                    amplitudes, sample_rate = audio_amps, original_sample_rate

            del audio_amps
            df, audio_file_name = reader.get_annotation_information(annotation_folder, sufix_file)


            for index, row in df.iterrows():
                start_seconds = int(round(row["Start"]))
                end_seconds = int(round(row["End"]))
                label = row["Label"]
                annotation_duration_seconds = end_seconds - start_seconds

                # Extract augmented audio segments and corresponding binary labels
                X_data, y_data = self._getXY(
                    amplitudes,
                    sample_rate,
                    start_seconds,
                    annotation_duration_seconds,
                    label
                )

                # Append the segments and labels
                X_calls.extend(X_data)
                Y_calls.extend(y_data)



    if augmentation:
        # Augment the dataset to obtain a balanced dataset
        X_calls, Y_calls = self._augment_dataset(X_calls, Y_calls)


    X_calls = self._convert_all_to_image(X_calls, sample_rate)

    # Convert to numpy arrays
    X_calls, Y_calls = np.asarray(X_calls), np.asarray(Y_calls)

    return X_calls, Y_calls
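
The loop above snaps each annotation to whole seconds with `int(round(...))` before segments are extracted. The following is a minimal sketch of that step on plain Python floats; `snap_to_seconds` is a hypothetical helper used only for illustration and is not part of `eso`. One consequence worth knowing: Python rounds halves to the nearest even integer, so a boundary at 2.5 s becomes 2, not 3.

```python
def snap_to_seconds(start: float, end: float) -> tuple[int, int, int]:
    """Round annotation bounds to whole seconds (hypothetical helper).

    Returns (start_seconds, end_seconds, duration_seconds), mirroring the
    three values computed per row in the loop above.
    """
    start_seconds = int(round(start))
    end_seconds = int(round(end))
    return start_seconds, end_seconds, end_seconds - start_seconds


print(snap_to_seconds(1.4, 3.6))  # (1, 4, 3)
print(snap_to_seconds(0.5, 2.5))  # (0, 2, 2): halves round to the even integer
```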

shuffle_files_names ¶

shuffle_files_names(train_size=0.8, test_size=0.1, validation_size=0.1)

Shuffle audio file names and split them into training, testing, and validation sets.

This method scans the Audio folder inside the species directory for all files with the specified audio extension. It then randomly shuffles and splits the file names into training, testing, and validation sets according to the specified proportions. The resulting file names (without extensions) are saved as text files (train.txt, test.txt, validation.txt) inside the DataFiles subdirectory of the species folder.

Parameters:

Name	Type	Description	Default
`train_size`	`float`	Proportion of files to use for training. Default is 0.8.	`0.8`
`test_size`	`float`	Proportion of files to use for testing. Default is 0.1.	`0.1`
`validation_size`	`float`	Proportion of files to use for validation. Default is 0.1.	`0.1`

Raises:

Type	Description
`Exception`	If no audio files are found in the specified audio directory.

Notes

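The sum of train_size, test_size, and validation_size should be 1.0. validation_size is not used in the computation: the validation set receives whatever files remain after the train and test splits.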
Output files are saved as plain text, with one file name (without extension) per line.
The audio extension is read from self.audio_extension, and the species folder from self.species_folder.
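
The split sizes follow from floor arithmetic on the number of files, with the validation set taking the remainder. The sketch below reproduces only that arithmetic; `split_counts` is a hypothetical helper, and it uses `math.floor` in place of `np.floor` to stay dependency-free.

```python
import math


def split_counts(
    n_files: int, train_size: float = 0.8, test_size: float = 0.1
) -> tuple[int, int, int]:
    """Return (train, test, validation) file counts (hypothetical helper)."""
    n_train = math.floor(n_files * train_size)
    n_test = math.floor(n_files * test_size)
    # Validation takes whatever is left, so rounding losses end up here
    n_validation = n_files - n_train - n_test
    return n_train, n_test, n_validation


print(split_counts(25))  # (20, 2, 3)
print(split_counts(7))   # (5, 0, 2)
```

With very few files the test split can come out empty, as the second call shows, so it may be worth checking the generated text files before training.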

Source code in eso/utils/preprocessing.py

def shuffle_files_names(self, train_size=0.8, test_size=0.1, validation_size=0.1):
    """
    Shuffle audio file names and split them into training, testing, and validation sets.

    This method scans the `Audio` folder inside the species directory for all
    files with the specified audio extension. It then randomly shuffles and splits
    the file names into training, testing, and validation sets according to the 
    specified proportions. The resulting file names (without extensions) are saved
    as text files (`train.txt`, `test.txt`, `validation.txt`) inside the `DataFiles`
    subdirectory of the species folder.

    Parameters
    ----------
    train_size : float, optional
        Proportion of files to use for training. Default is 0.8.
    test_size : float, optional
        Proportion of files to use for testing. Default is 0.1.
    validation_size : float, optional
        Proportion of files to use for validation. Default is 0.1.

    Raises
    ------
    Exception
        If no audio files are found in the specified audio directory.

    Notes
    -----
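    - The sum of `train_size`, `test_size`, and `validation_size` should be 1.0.
      `validation_size` is not used in the computation: the validation set
      receives whatever files remain after the train and test splits.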
    - Output files are saved as plain text, with one file name (without extension) per line.
    - The audio extension is read from `self.audio_extension`, and the species folder
    from `self.species_folder`.
    """        
    # Get all file names in Audio folder
    path = Path(self.species_folder, "Audio", f"*{self.audio_extension}")
    files = glob(str(path))

    if len(files) == 0:
        raise Exception(
            f"No audio files found in {self.species_folder}/Audio. "
            "Please check the audio_extension setting in the settings file."
        )
    # Shuffle the files
    np.random.shuffle(files)

    train_samples = int(np.floor(len(files) * train_size))
    test_samples = int(np.floor(len(files) * test_size))

    # Split the files into train, test, validation
    train_split = train_samples
    test_split = test_samples

    train_files = files[:train_split]
    test_files = files[train_split : train_split + test_split]
    # Use the rest for validation
    validation_files = files[train_split + test_split :]

    # Only get the file names
    train_files = [os.path.basename(file) for file in train_files]
    test_files = [os.path.basename(file) for file in test_files]
    validation_files = [os.path.basename(file) for file in validation_files]

    # Remove the file extension
    train_files = [os.path.splitext(file)[0] for file in train_files]
    test_files = [os.path.splitext(file)[0] for file in test_files]
    validation_files = [os.path.splitext(file)[0] for file in validation_files]

    # Create the folders
    os.makedirs(Path(self.species_folder, "DataFiles"), exist_ok=True)

    # Save the files as .txt
    with open(Path(self.species_folder, "DataFiles", "train.txt"), "w") as f:
        f.write("\n".join(train_files))
    with open(Path(self.species_folder, "DataFiles", "test.txt"), "w") as f:
        f.write("\n".join(test_files))

    with open(Path(self.species_folder, "DataFiles", "validation.txt"), "w") as f:
        f.write("\n".join(validation_files))

check_distribution ¶

check_distribution(Y)

Source code in eso/utils/preprocessing.py

def check_distribution(self, Y):
    unique, counts = np.unique(Y, return_counts=True)
    original_distribution = dict(zip(unique, counts))
    return original_distribution
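
The returned dictionary maps each label to its count, which makes class imbalance easy to inspect. The sketch below shows the same idea with the standard library's `Counter` standing in for `np.unique`; both `class_distribution` and `imbalance_ratio` are hypothetical helpers, not functions from `eso`.

```python
from collections import Counter


def class_distribution(labels):
    """Count occurrences of each label (stdlib stand-in for np.unique)."""
    return dict(Counter(labels))


def imbalance_ratio(distribution):
    """Ratio of the most frequent to the least frequent class."""
    counts = distribution.values()
    return max(counts) / min(counts)


y = ["call", "noise", "noise", "noise"]
dist = class_distribution(y)
print(dist)                   # {'call': 1, 'noise': 3}
print(imbalance_ratio(dist))  # 3.0
```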

AnnotationReader ¶

AnnotationReader(
    path: str,
    annotation_file_name: str,
    file_type: str,
    audio_extension: str,
    positive_class: str,
)

Source code in eso/utils/AnnotationReader.py

def __init__(
    self,
    path: str,
    annotation_file_name: str,
    file_type: str,
    audio_extension: str,
    positive_class: str,
):
    """
    Initializes the AnnotationReader class.

    Parameters
    ----------
    path : str
        The path to the directory containing the annotation and audio files.
    annotation_file_name : str
        The name of the annotation file (without extension) to be read.
    file_type : str
        The type of annotation file (e.g., "svl", "xml").
    audio_extension : str
        The file extension for the associated audio files (e.g., ".wav", ".mp3").
    positive_class : str
        The label representing the positive class in classification tasks.

    Returns
    -------
    None
    """
    self.path = path
    self.annotation_file_name = annotation_file_name
    self.file_type = file_type
    self.audio_extension = audio_extension
    self.positive_class = positive_class

path `instance-attribute` ¶

path = path

annotation_file_name `instance-attribute` ¶

annotation_file_name = annotation_file_name

file_type `instance-attribute` ¶

file_type = file_type

audio_extension `instance-attribute` ¶

audio_extension = audio_extension

positive_class `instance-attribute` ¶

positive_class = positive_class

Initializes the AnnotationReader class.

Parameters:

Name	Type	Description	Default
`path`	`str`	The path to the directory containing the annotation and audio files.	required
`annotation_file_name`	`str`	The name of the annotation file (without extension) to be read.	required
`file_type`	`str`	The type of annotation file (e.g., "svl", "xml").	required
`audio_extension`	`str`	The file extension for the associated audio files (e.g., ".wav", ".mp3").	required
`positive_class`	`str`	The label representing the positive class in classification tasks.	required

Returns:

Type	Description
`None`

get_annotation_information ¶

get_annotation_information(annotation_folder, sufix_file)

Extract annotation information from an .svl XML file and return a DataFrame with start times, end times, and labels for the annotations.

This method parses an XML annotation file (.svl format) to extract annotation details including the start time, end time, and label for each annotation. It processes the XML file, handles any confidence values, and adjusts labels accordingly (e.g., using the positive class label for predicted annotations).

Parameters:

Name	Type	Description	Default
`annotation_folder`	`str`	The folder where the annotation file is located.	required
`sufix_file`	`str`	The suffix to append to the base annotation file name to get the full file name.	required

Returns:

Type	Description
`tuple`	A tuple of two items. First, a pd.DataFrame with three columns: 'Start' (the start time of the annotation in seconds), 'End' (the end time of the annotation in seconds), and 'Label' (the label associated with the annotation). Second, a str: the name of the corresponding audio file (with ".wav" extension).

Raises:

Type	Description
`Exception`	If the annotation file does not contain valid annotation information.
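
To make the frame-to-seconds conversion concrete, here is a minimal, self-contained sketch that parses a tiny hand-written SVL fragment with `xml.dom.minidom`. The fragment, the sample rate, and the helper `read_points` are illustrative assumptions; the sketch leaves out the confidence handling and the "predicted" label substitution that the real method performs.

```python
from xml.dom import minidom

# Hand-written example fragment (an assumption, not a real annotation file)
SVL_EXAMPLE = """<sv>
  <data>
    <model id="10" sampleRate="48000" />
    <dataset id="9" dimensions="2">
      <point frame="96000" value="800" duration="144000" extent="500" label="gibbon" />
      <point frame="480000" value="800" duration="48000" extent="500" label="noise" />
    </dataset>
  </data>
</sv>
"""


def read_points(svl_text):
    """Return (start_s, end_s, label) tuples from an SVL string."""
    doc = minidom.parseString(svl_text)
    model = doc.getElementsByTagName("model")[0]
    sample_rate = int(model.attributes["sampleRate"].value)
    rows = []
    for point in doc.getElementsByTagName("point"):
        # frame and duration are stored in samples; divide to get seconds
        start = float(point.attributes["frame"].value) / sample_rate
        duration = float(point.attributes["duration"].value) / sample_rate
        rows.append((start, start + duration, point.attributes["label"].value))
    return rows


rows = read_points(SVL_EXAMPLE)
print(rows)  # [(2.0, 5.0, 'gibbon'), (10.0, 11.0, 'noise')]
```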

Source code in eso/utils/AnnotationReader.py

def get_annotation_information(self, annotation_folder, sufix_file ):
    """
    Extract annotation information from an `.svl` XML file and return a DataFrame
    with start times, end times, and labels for the annotations.

    This method parses an XML annotation file (`.svl` format) to extract annotation
    details including the start time, end time, and label for each annotation.
    It processes the XML file, handles any confidence values, and adjusts labels
    accordingly (e.g., using the positive class label for predicted annotations).

    Parameters
    ----------
    annotation_folder : str
        The folder where the annotation file is located.
    sufix_file : str
        The suffix to append to the base annotation file name to get the full file name.

    Returns
    -------
    tuple
        A tuple containing:
        - pd.DataFrame: A DataFrame with three columns:
            - 'Start': The start time of the annotation in seconds.
            - 'End': The end time of the annotation in seconds.
            - 'Label': The label associated with the annotation.
        - str: The name of the corresponding audio file (with ".wav" extension).

    Raises
    ------
    Exception
        If the annotation file does not contain valid annotation information.
    """

    path = str(Path(
            self.path, annotation_folder, self.annotation_file_name + sufix_file
        ))


    xmldoc = minidom.parse(path)
    itemlist = xmldoc.getElementsByTagName("point")
    idlist = xmldoc.getElementsByTagName("model")

    start_time = []
    end_time = []
    labels = []
    audio_file_name = ""

    if len(idlist) > 0:
        for s in idlist: 
            original_sample_rate = int(s.attributes["sampleRate"].value)


    if len(itemlist) > 0:

        # Iterate over each annotation in the .svl (annotation) file
        for s in itemlist:
            # Convert the starting frame to seconds (a float) using the
            # sample rate read from the <model> element
            start_seconds = (
                    float(s.attributes["frame"].value) / original_sample_rate
                )

            # Get the label from the annotation file
            label = str(s.attributes["label"].value)

            # Set the default confidence to 10 (i.e. high confidence that
            # the label is correct). Annotations that carry no confidence
            # value are treated like normal annotations and are assumed
            # to be correct (by the annotator).
            label_confidence = 10

            # Check if a confidence has been assigned
            if "," in label:
                # Extract the raw label
                label_string = label[: label.find(",")]

                # Extract confidence value
                label_confidence = int(label[label.find(",") + 1 :])

                # Set the label to the raw label
                label = label_string

            # A blank label stops the loop: this annotation and all the
            # remaining ones in the file are skipped to avoid
            # mislabelling data
            if label == "":
                break


            # Include predictions obtained from a model
            if label == "predicted":
                label = self.positive_class

            # Only consider annotations labelled with full confidence.
            # 10 = very confident, 5 = medium, 1 = unsure; this is written
            # as "SPECIES,10", "SPECIES,5" when annotating.
            if label_confidence == 10:
                # Get the duration from the annotation file
                annotation_duration_seconds = (
                        float(s.attributes["duration"].value) / original_sample_rate
                    )
                start_time.append(start_seconds)
                end_time.append(start_seconds + annotation_duration_seconds)
                labels.append(label)

    df_svl_gibbons = pd.DataFrame(
            {"Start": start_time, "End": end_time, "Label": labels}
        )
    return df_svl_gibbons, self.annotation_file_name + ".wav"
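
The label handling in the source above splits an optional confidence off the label at the first comma and keeps only annotations whose confidence is 10. A compact sketch of that parsing step, using `str.partition` instead of `find` and slicing, could look like this; `parse_label` is a hypothetical helper and not part of `AnnotationReader`.

```python
def parse_label(raw: str, default_confidence: int = 10) -> tuple[str, int]:
    """Split 'LABEL,CONFIDENCE' into (label, confidence).

    Labels without a comma keep the default confidence of 10, matching
    the behaviour described in the source comments.
    """
    label, separator, confidence = raw.partition(",")
    if not separator:
        return raw, default_confidence
    return label, int(confidence)


for raw in ["gibbon", "gibbon,10", "gibbon,5"]:
    label, confidence = parse_label(raw)
    kept = confidence == 10  # only fully confident annotations are kept
    print(raw, "->", label, confidence, "kept" if kept else "dropped")
```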

get_annotation_information_testing ¶

get_annotation_information_testing()

Extract annotation information from a .svl XML file and return a DataFrame with frame, value, duration, extent, and label for each annotation.

This method parses an XML annotation file (.svl format) to extract detailed annotation information such as frame number, value, duration, extent, and label. It also extracts the sample rate, start time, and end time from the file's metadata.

Parameters:

This method takes no parameters.

Returns:

Type	Description
`tuple`	A tuple of four items. First, a pd.DataFrame with columns 'frame' (the frame number from the annotation), 'value' (the value associated with the annotation), 'duration' (the duration of the annotation), 'extent' (the extent of the annotation), and 'label' (the label associated with the annotation). Second, the sample rate read from the .svl file; the source returns the raw attribute value, which is a str rather than an int. Third, a str: the start time of the annotation in the .svl file. Fourth, a str: the end time of the annotation in the .svl file.

Raises:

Type	Description
`Exception`	If the annotation file is not found or if it does not contain valid annotation information.

Source code in eso/utils/AnnotationReader.py

def get_annotation_information_testing(self):
    """
    Extract annotation information from a `.svl` XML file and return a DataFrame
    with frame, value, duration, extent, and label for each annotation.

    This method parses an XML annotation file (`.svl` format) to extract detailed
    annotation information such as frame number, value, duration, extent, and label.
    It also extracts the sample rate, start time, and end time from the file's metadata.

    Parameters
    ----------
    None

    Returns
    -------
    tuple
        A tuple containing:
        - pd.DataFrame: A DataFrame with columns:
            - 'frame': The frame number from the annotation.
            - 'value': The value associated with the annotation.
            - 'duration': The duration of the annotation.
            - 'extent': The extent of the annotation.
            - 'label': The label associated with the annotation.
        - str: The sample rate read from the `.svl` file (the raw attribute
          value, not converted to int).
        - str: The start time of the annotation in the `.svl` file.
        - str: The end time of the annotation in the `.svl` file.

    Raises
    ------
    Exception
        If the annotation file is not found or if it does not contain valid annotation information.
    """

    path = os.path.join(
            self.path, "Annotations", self.annotation_file_name + ".svl"
        )

    # Process the .svl xml file
    xmldoc = minidom.parse(path)
    itemlist = xmldoc.getElementsByTagName('point')
    idlist = xmldoc.getElementsByTagName('model')

    sampleRate = idlist.item(0).attributes['sampleRate'].value 
    start_m = idlist.item(0).attributes['start'].value
    end_m = idlist.item(0).attributes['end'].value


    values = []
    frames = []
    durations=[]
    extents=[]
    labels = []
    audio_file_name = ''

    if len(idlist) > 0:
        for s in idlist: 
            original_sample_rate = int(s.attributes["sampleRate"].value)

    if len(itemlist) > 0:

        # Iterate over each annotation in the .svl (annotation) file
        for s in itemlist:

            # Read the raw point attributes; frame and duration are
            # expressed in samples, not seconds
            frame = float(s.attributes['frame'].value)
            value = float(s.attributes['value'].value)
            duration = float(s.attributes['duration'].value)
            extent = float(s.attributes['extent'].value)
            label = str(s.attributes['label'].value)

            # Set the default confidence to 10 (i.e. high confidence that
            # the label is correct). Annotations that carry no confidence
            # value are treated like normal annotations and are assumed
            # to be correct (by the annotator).
            label_confidence = 10

            # Check if a confidence has been assigned
            if ',' in label:

                # Extract the raw label
                label_string = label[:label.find(',')]

                # Extract confidence value
                label_confidence = int(label[label.find(',')+1:])

                # Set the label to the raw label
                label = label_string


            # A blank label stops the loop: this annotation and all the
            # remaining ones in the file are skipped to avoid
            # mislabelling data
            if label == '':
                break

            # Only consider annotations labelled with full confidence.
            # 10 = very confident, 5 = medium, 1 = unsure; this is written
            # as "SPECIES,10", "SPECIES,5" when annotating.
            if label_confidence == 10:

                frames.append(frame)
                values.append(value)
                durations.append(duration)
                extents.append(extent)
                labels.append(label)

    df_svl_gibbons = pd.DataFrame({'frame': frames, 'value':values ,'duration': durations,
                              'extent':extents,'label':labels})
    return df_svl_gibbons, sampleRate, start_m, end_m

dataframe_to_svl ¶

dataframe_to_svl(dataframe, sample_rate, start_m, end_m)

Convert a DataFrame of annotations to a .svl format XML string.

This method generates a .svl format XML string containing the annotations from a DataFrame. The generated XML includes metadata such as the sample rate, start time, end time, and annotation points (frame, value, duration, extent, and label).

Parameters:

Name	Type	Description	Default
`dataframe`	`DataFrame`	A DataFrame containing the annotation information. The DataFrame should have the following columns: 'frame', 'value', 'duration', 'extent', and 'label'.	required
`sample_rate`	`int`	The sample rate of the audio associated with the annotations.	required
`start_m`	`str`	The start time (in seconds) of the annotation period.	required
`end_m`	`str`	The end time (in seconds) of the annotation period.	required

Returns:

Type	Description
`str`	A string containing the XML in `.svl` format, representing the annotations along with metadata.

Notes

The function generates an XML document that includes:

<model>: metadata about the annotation model, including sample rate, start time, and end time.
<dataset>: contains <point> elements that represent individual annotations.
<display>: defines the display settings for the annotation in the software.
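
The overall document shape can be illustrated with the standard library alone. The sketch below uses `xml.etree.ElementTree` rather than the `yattag` builder used in the source, trims the attribute sets to the essentials, and nests `<dataset>` inside `<data>`; all three are simplifying assumptions, and `build_svl` is a hypothetical helper rather than the method documented here.

```python
import xml.etree.ElementTree as ET


def build_svl(points, sample_rate, start, end):
    """Build a minimal SVL-like XML string from a list of point dicts."""
    root = ET.Element("sv")
    data = ET.SubElement(root, "data")
    # <model>: metadata for the layer (reduced attribute set)
    ET.SubElement(
        data, "model", id="10", sampleRate=str(sample_rate),
        start=str(start), end=str(end), dataset="9",
    )
    # <dataset>: one <point> per annotation
    dataset = ET.SubElement(data, "dataset", id="9", dimensions="2")
    for p in points:
        ET.SubElement(
            dataset, "point",
            frame=str(int(p["frame"])), value=str(p["value"]),
            duration=str(int(p["duration"])), extent=str(p["extent"]),
            label=p["label"],
        )
    # <display>: how the layer is drawn
    display = ET.SubElement(root, "display")
    ET.SubElement(display, "layer", id="2", type="boxes", model="10")
    return ET.tostring(root, encoding="unicode")


example_points = [
    {"frame": 96000, "value": 800, "duration": 144000, "extent": 500, "label": "gibbon"},
    {"frame": 480000, "value": 800, "duration": 48000, "extent": 500, "label": "noise"},
]
svl_text = build_svl(example_points, sample_rate=48000, start=0, end=960000)
```

Parsing `svl_text` back with `ET.fromstring` is a quick way to confirm that every point was written.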

Source code in eso/utils/AnnotationReader.py

def dataframe_to_svl(self, dataframe, sample_rate, start_m, end_m):
    """
    Convert a DataFrame of annotations to a `.svl` format XML string.

    This method generates a `.svl` format XML string containing the annotations
    from a DataFrame. The generated XML includes metadata such as the sample rate,
    start time, end time, and annotation points (frame, value, duration, extent, and label).

    Parameters
    ----------
    dataframe : pd.DataFrame
        A DataFrame containing the annotation information. The DataFrame should have 
        the following columns: 'frame', 'value', 'duration', 'extent', and 'label'.
    sample_rate : int
        The sample rate of the audio associated with the annotations.
    start_m : str
        The start time (in seconds) of the annotation period.
    end_m : str
        The end time (in seconds) of the annotation period.

    Returns
    -------
    str
        A string containing the XML in `.svl` format, representing the annotations
        along with metadata.

    Notes
    -----
    The function generates an XML document that includes:
    - `<model>`: metadata about the annotation model, including sample rate, start time, and end time.
    - `<dataset>`: contains `<point>` elements that represent individual annotations.
    - `<display>`: defines the display settings for the annotation in the software.
    """
    doc, tag, text = Doc().tagtext()
    doc.asis('<?xml version="1.0" encoding="UTF-8"?>')
    doc.asis('<!DOCTYPE sonic-visualiser>')

    with tag('sv'):
        with tag('data'):

            model_string = '<model id="10" name="" sampleRate="{}" start="{}" end="{}" type="sparse" dimensions="2" resolution="1" notifyOnAdd="true" dataset="9" subtype="box" minimum="600" maximum="{}" units="Hz" />'.format(sample_rate, 
                                                                    start_m,
                                                                    end_m,
                                                                    1000)
            doc.asis(model_string)

        with tag('dataset', id='9', dimensions='2'):

            # Read dataframe or other data structure and add the values here
            # These are added as "point" elements, for example:
            # '<point frame="15360" value="3136.87" duration="1724416" extent="2139.22" label="Cape Robin" />'
            for index, row in dataframe.iterrows():

                # Note: extent is hard-coded to 1500 below; the 'extent'
                # column of the dataframe is not used
                point  = '<point frame="{}" value="{}" duration="{}" extent="{}" label="{}" />'.format(
                    int(row['frame']), 
                    row['value'],
                    int(row['duration']),
                    1500,
                    row['label'])

                # add the point
                doc.asis(point)
        with tag('display'):

            display_string = '<layer id="2" type="boxes" name="Boxes" model="10"  verticalScale="0"  colourName="White" colour="#ffffff" darkBackground="true" />'
            doc.asis(display_string)

    result = indent(
        doc.getvalue(),
        indentation = ' '*2,
        newline = '\r\n'
    )

    return result

CPU_Unpickler ¶

Bases: Unpickler

find_class ¶

find_class(module, name)

Source code in eso/utils/Evaluation.py

def find_class(self, module, name):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    if module == "torch.storage" and name == "_load_from_bytes":
        return lambda b: torch.load(io.BytesIO(b), map_location=device)
    else:
        return super().find_class(module, name)
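
The mechanism at work is the `find_class` hook of `pickle.Unpickler`: the override intercepts a single `(module, name)` lookup and returns a substitute callable, leaving every other lookup untouched. The sketch below shows the same pattern using only the standard library, with `math.sqrt` standing in for `torch.storage._load_from_bytes`; the class name `InterceptingUnpickler` is an illustrative assumption.

```python
import io
import math
import pickle


class InterceptingUnpickler(pickle.Unpickler):
    """Swap one pickled global for a substitute while loading."""

    def find_class(self, module, name):
        # Intercept a single (module, name) pair, as CPU_Unpickler does
        # for ("torch.storage", "_load_from_bytes")
        if module == "math" and name == "sqrt":
            return lambda x: round(math.sqrt(x), 3)
        # Every other lookup resolves normally
        return super().find_class(module, name)


payload = pickle.dumps(math.sqrt)  # pickled by reference as math.sqrt
substitute = InterceptingUnpickler(io.BytesIO(payload)).load()
print(substitute(2.0))  # 1.414
```

By analogy, the class documented here would presumably be used as `CPU_Unpickler(f).load()` on a file opened in binary mode, so that tensors stored inside the pickle are loaded onto whichever device `torch.cuda.is_available()` selects.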

Evaluation ¶

Evaluation(
    species_folder: str,
    settings,
    overlap=0.25,
    nb_to_group=2,
    threshold=0.8,
    chromosome=None,
    apply_preprocessing: bool = True,
    force_calc_spectrograms: bool = False,
    logger=None,
    log_path=None,
    log_level=0,
    save_folder: str = "Predictions",
)

Source code in eso/utils/Evaluation.py

def __init__(
    self,
    species_folder: str,
    settings,
    overlap=0.25,
    nb_to_group=2,
    threshold=0.8,
    chromosome=None,
    apply_preprocessing: bool = True,
    force_calc_spectrograms: bool = False,
    logger=None,
    log_path=None,
    log_level=0,
    save_folder: str = "Predictions",
) -> None:


    if logger is None:
        self.logger = setup_logger(
            logger=logger, log_path=log_path, log_level=log_level
        )
    else:
        self.logger = logger


    self.species_folder = species_folder
    __preprocessing_name = "preprocessed" if apply_preprocessing else "unpreprocessed"
    self.saved_data_folder = Path(species_folder, "SavedData", __preprocessing_name)


    self.apply_preprocessing_flag = apply_preprocessing
    self.config = settings
    self.segment_duration=self.config.preprocessing.dict()["segment_duration"]
    self.positive_class = self.config.data.dict()["positive_class"]
    self.negative_class = self.config.data.dict()["negative_class"]

    self.overlap=overlap
    self.nb_to_group=nb_to_group
    self.threshold=threshold
    self.sampling_rate_origin=self.config.preprocessing.sample_rate

    self.chromosome = chromosome
    self.force_calc_spectrograms = force_calc_spectrograms


    if self.chromosome is None:
        self.save_folder_predictions = save_folder + "_baseline"
        self.save_folder_spectrograms =  "Saved_spectrograms_baseline"
    else:
        self.save_folder_predictions = save_folder + "_chromosome"
        self.save_folder_spectrograms =  "Saved_spectrograms_chromosome"

    self.save_results = Path(self.species_folder, self.save_folder_predictions)
    self.save_spectrograms_path=Path(self.species_folder, self.save_folder_spectrograms)

    self.prep = Preprocessing(
        **self.config.preprocessing.dict(),
        positive_class=self.positive_class,
        negative_class=self.negative_class,
        apply_preprocessing=self.apply_preprocessing_flag,
        species_folder=self.species_folder,
    )

logger `instance-attribute` ¶

logger = setup_logger(logger=logger, log_path=log_path, log_level=log_level)

species_folder `instance-attribute` ¶

species_folder = species_folder

saved_data_folder `instance-attribute` ¶

saved_data_folder = Path(species_folder, 'SavedData', __preprocessing_name)

apply_preprocessing_flag `instance-attribute` ¶

apply_preprocessing_flag = apply_preprocessing

config `instance-attribute` ¶

config = settings

segment_duration `instance-attribute` ¶

segment_duration = config.preprocessing.dict()['segment_duration']

positive_class `instance-attribute` ¶

positive_class = config.data.dict()['positive_class']

negative_class `instance-attribute` ¶

negative_class = config.data.dict()['negative_class']

overlap `instance-attribute` ¶

overlap = overlap

nb_to_group `instance-attribute` ¶

nb_to_group = nb_to_group

threshold `instance-attribute` ¶

threshold = threshold

sampling_rate_origin `instance-attribute` ¶

sampling_rate_origin = config.preprocessing.sample_rate

chromosome `instance-attribute` ¶

chromosome = chromosome

force_calc_spectrograms `instance-attribute` ¶

force_calc_spectrograms = force_calc_spectrograms

save_folder_predictions `instance-attribute` ¶

save_folder_predictions = save_folder + '_baseline'

save_folder_spectrograms `instance-attribute` ¶

save_folder_spectrograms = 'Saved_spectrograms_baseline'

save_results `instance-attribute` ¶

save_results = Path(species_folder, save_folder_predictions)

save_spectrograms_path `instance-attribute` ¶

save_spectrograms_path = Path(species_folder, save_folder_spectrograms)

prep `instance-attribute` ¶

prep = Preprocessing(
    **(config.preprocessing.dict()),
    positive_class=positive_class,
    negative_class=negative_class,
    apply_preprocessing=apply_preprocessing_flag,
    species_folder=species_folder
)

prediction_files ¶

prediction_files(model, data_type='test')

Source code in eso/utils/Evaluation.py

def prediction_files(self, model, data_type = "test"):


    test_path = Path(self.species_folder, "DataFiles", data_type + ".txt")
    #test_path = Path(self.species_folder, "DataFiles", "test.txt")
    file_names = pd.read_csv(test_path, header=None)

    for file in file_names.values:
        file = file[0]

        self.logger.info(f"Processing file: {file}")
        self._process_one_file(file, model, verbose=True)

comparison_predictions_annotations ¶

comparison_predictions_annotations(folder, data_type='test')
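
The method pairs predicted and annotated windows and counts a pair as a match when their shared duration exceeds `overlap * segment_duration`. The private helper `_overlap` is not shown in this section, so the sketch below assumes it returns the length of the intersection of two time intervals; `interval_overlap`, `is_match`, and the 4-second segment duration are illustrative assumptions.

```python
def interval_overlap(a_start, a_end, b_start, b_end):
    """Seconds shared by [a_start, a_end] and [b_start, b_end] (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def is_match(lap, overlap=0.25, segment_duration=4.0):
    """Apply the strict threshold test used for TP decisions."""
    return lap > overlap * segment_duration


# A prediction at 0-4 s against three annotated windows
for annotation in [(2.0, 6.0), (3.0, 7.0), (5.0, 9.0)]:
    lap = interval_overlap(0.0, 4.0, *annotation)
    print(annotation, lap, is_match(lap))
```

Because the comparison is strict, an overlap exactly equal to the threshold (1.0 s in this sketch) does not count as a match.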

Source code in eso/utils/Evaluation.py

def comparison_predictions_annotations(self, folder, data_type="test"):
    self.logger.info("comparing prediction and annotation")
    test_path = Path(self.species_folder, "DataFiles", data_type + ".txt")
    #test_path = Path(self.species_folder, "DataFiles", "test.txt")
    file_names = pd.read_csv(test_path, header=None)



    predictions = []
    annotations = []

    # check if corrected annotations for the testing files have been done
    if os.path.exists(Path(self.species_folder, "Annotations_corrected")):
        self.logger.info(
            "the corrected annotations of the testing dataset have already been created "
        )

    else:
        self.logger.info(
            "Need to modify the annotations of the testing dataset to allow a correct evaluation of the model "
        )
        self._repair_svl(
            file_names,
            self.prep.file_type,
            self.prep.audio_extension,
            annotation_folder="Annotations",
            sufix_file=".svl",
        )
    for file in file_names.values:
        file = file[0]

        reader = AnnotationReader(
            self.species_folder,
            file,
            self.prep.file_type,
            self.prep.audio_extension,
            self.positive_class,
        )

        svl = reader.get_annotation_information(
            annotation_folder="Annotations_corrected", sufix_file="_repaired.svl"
        )[0]
        svl["Overlap"] = 0.0
        svl["Cat"] = "TN"
        svl.loc[svl.Label == self.positive_class, "Cat"] = "FN"
        svl["Index"] = np.nan
        svl["Nb overlap"] = 0
        svl["Name"] = file

        if os.path.exists(
            Path(self.species_folder, folder, file + "_predictions.svl")
        ):
            self.logger.info(f"Found Prediction: {file} ")
            predict = reader.get_annotation_information(
                annotation_folder=folder, sufix_file="_predictions.svl"
            )[0]

            predict["Overlap"] = 0.0
            predict["Cat"] = "FP"
            predict["Index"] = np.nan
            predict["Nb overlap"] = 0
            predict["Name"] = file

            # compare predictions vs annotations
            if svl[svl.Label == self.positive_class].shape[0] != 0:
                for index, row in predict.iterrows():
                    idx = np.abs(
                        np.asarray(
                            svl[svl.Label == self.positive_class]["Start"]
                        )
                        - row.iloc[0]
                    ).argmin()  # get the closest window
                    lap = self._overlap(
                        row.iloc[0],
                        row.iloc[1],
                        svl[svl.Label == self.positive_class].iloc[idx, 0],
                        svl[svl.Label == self.positive_class].iloc[idx, 1],
                    )  # check overlap

                    if lap > self.overlap * self.segment_duration :
                        predict.loc[index, "Overlap"] = deepcopy(lap)
                        predict.loc[index, "Cat"] = "TP"
                        predict.loc[index, "Index"] = idx
                    else:
                        predict.loc[index, "Overlap"] = deepcopy(lap)

                for index, row in predict.iterrows():
                    w = 0
                    for idx_svl, row_svl in svl[
                        svl.Label == self.positive_class
                    ].iterrows():
                        lap = self._overlap(
                            row.iloc[0],
                            row.iloc[1],
                            row_svl.iloc[0],
                            row_svl.iloc[1],
                        )
                        if lap > self.overlap * self.segment_duration :
                            w += 1
                    predict.loc[index, "Nb overlap"] = w
            else:
                self.logger.info("No positive class in the annotation file")
            predictions.append(predict)

            # compare annotations vs predictions
            for index, row in svl.iterrows():
                idx = np.abs(
                    np.asarray(predict["Start"]) - row.iloc[0]
                ).argmin()  # get the closest window
                lap = self._overlap(
                    row.iloc[0],
                    row.iloc[1],
                    predict.iloc[idx, 0],
                    predict.iloc[idx, 1],
                )  # check overlap

                # Record the overlap, then tag annotations that overlap a prediction:
                # TP for the positive class, FP for the negative class
                svl.loc[index, "Overlap"] = lap
                overlaps = lap > self.overlap * self.segment_duration
                label = svl.loc[index, "Label"]
                if overlaps and label == self.positive_class:
                    svl.loc[index, "Index"] = idx
                    svl.loc[index, "Cat"] = "TP"
                elif overlaps and label == self.negative_class:
                    svl.loc[index, "Index"] = idx
                    svl.loc[index, "Cat"] = "FP"

            # Print File and FP TP FN
            self.logger.info("-------------")
            self.logger.info(file)
            self.logger.info(f"FP : {predict[predict.Cat == 'FP'].shape[0]}")
            self.logger.info(f"TP : {svl[svl.Cat == 'TP'].shape[0]} ")
            self.logger.info(f"FN : {svl[svl.Cat == 'FN'].shape[0]} ")
            self.logger.info("-------------")

            for index, row in svl.iterrows():
                w = 0
                for idx_pred, row_pred in predict.iterrows():
                    lap = self._overlap(
                        row.iloc[0], row.iloc[1], row_pred.iloc[0], row_pred.iloc[1]
                    )
                    if lap > self.overlap * self.segment_duration:
                        w += 1
                svl.loc[index, "Nb overlap"] = w

        annotations.append(svl)

    Predictions = pd.DataFrame(np.concatenate(predictions, axis=0))
    Predictions.columns = predict.columns
    Predictions.Index = Predictions.Index.astype(float)

    Annotations = pd.DataFrame(np.concatenate(annotations, axis=0))
    Annotations.columns = svl.columns
    Annotations.Index = Annotations.Index.astype(float)

    return Predictions, Annotations
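
The matching rule used in the comparison above can be sketched in isolation. `_overlap` itself is not shown in this section, so the `interval_overlap` helper below is an assumption: it returns the duration in seconds shared by two `[start, end]` intervals. `is_match` and its default values are likewise hypothetical stand-ins for the `self.overlap * self.segment_duration` threshold.

```python
def interval_overlap(start_a, end_a, start_b, end_b):
    """Duration shared by [start_a, end_a] and [start_b, end_b]; 0.0 if disjoint.

    Assumed behaviour of the `_overlap` helper, which is not shown here.
    """
    return max(0.0, min(end_a, end_b) - max(start_a, start_b))


def is_match(pred, annot, overlap=0.5, segment_duration=4.0):
    """True when the shared duration exceeds overlap * segment_duration.

    The default values are illustrative, not taken from the library's config.
    """
    lap = interval_overlap(pred[0], pred[1], annot[0], annot[1])
    return lap > overlap * segment_duration


# Threshold is 0.5 * 4.0 = 2.0 seconds in both calls.
matched = is_match((10.0, 14.0), (11.0, 15.0))  # 3.0 s shared
missed = is_match((10.0, 14.0), (13.0, 17.0))   # 1.0 s shared
```

Under this rule a predicted window counts as a true positive only when it shares more than the threshold duration with the closest positive annotation, so a window that merely touches a call does not count.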

testing_score ¶

testing_score(Annotations, Predictions)

Source code in eso/utils/Evaluation.py

def testing_score(self, Annotations, Predictions):
    # FP is counted on the predictions; TP, FN and TN are counted on the annotations
    cat, count = np.unique(Predictions["Cat"], return_counts=True)
    cat_a, count_a = np.unique(Annotations["Cat"], return_counts=True)

    FP = count[cat == "FP"][0] if len(count[cat == "FP"]) > 0 else 0
    TP = count_a[cat_a == "TP"][0] if len(count_a[cat_a == "TP"]) > 0 else 0
    FN = count_a[cat_a == "FN"][0] if len(count_a[cat_a == "FN"]) > 0 else 0
    TN = count_a[cat_a == "TN"][0] if len(count_a[cat_a == "TN"]) > 0 else 0

    # F1 = TP / (TP + (FN + FP) / 2); report 0.0 instead of dividing by zero
    f1_denominator = TP + (FN + FP) / 2
    F_score = TP / f1_denominator if f1_denominator > 0 else 0.0
    total = TP + TN + FP + FN
    Accuracy = (TP + TN) / total if total > 0 else 0.0
    # Rows: predicted positive/negative; columns: actual positive/negative
    confusion = np.array([[TP, FP], [FN, TN]])

    n_calls = Annotations[Annotations.Label == self.positive_class].shape[0]
    self.logger.info(f"Number of calls to detect: {n_calls}")
    self.logger.info(f"False positives: {FP}")
    self.logger.info(f"True positives: {TP}")
    self.logger.info(f"False negatives: {FN}")
    self.logger.info(f"F1-score: {F_score}")
    self.logger.info(f"Accuracy: {Accuracy}")

    return F_score, Accuracy, confusion
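
As a worked check of the scoring arithmetic, the sketch below applies the same F1 expression, `TP / (TP + (FN + FP) / 2)`, to made-up counts and confirms that it agrees with the more familiar precision/recall form. `f1_and_accuracy` is a hypothetical helper, the counts are illustrative, and returning 0.0 for an empty denominator is a choice made in this sketch.

```python
def f1_and_accuracy(tp, fp, fn, tn):
    """F1 and accuracy from raw confusion counts."""
    f1_denominator = tp + (fn + fp) / 2
    f1 = tp / f1_denominator if f1_denominator else 0.0
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    return f1, accuracy


# Illustrative counts, not results from the library.
tp, fp, fn, tn = 8, 2, 2, 8
f1, accuracy = f1_and_accuracy(tp, fp, fn, tn)

# The same F1 written as the harmonic mean of precision and recall.
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1_harmonic = 2 * precision * recall / (precision + recall)
```

With these counts the denominator is 8 + (2 + 2) / 2 = 10, so F1 is 0.8, and accuracy is 16 / 20 = 0.8.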

run ¶

run(model, data_type='test', test_type='simple')

Source code in eso/utils/Evaluation.py

def run(self, model, data_type="test", test_type="simple"):
    # "simple" scores the pre-segmented dataset; any other value runs on entire files
    if test_type == "simple":
        return self._presegmented_dataset_run(model, data_type=data_type)
    return self._entire_files_run(model, data_type=data_type)

`eso.utils.logger`¶

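Visualisation and logging. `plot_chromosome` draws each gene of a chromosome as a horizontal band on an axis scaled to the spectrogram height, in the style of Figure 4 in the paper, and saves the figure as a PNG when `results_path` is set. `setup_logger` returns the logger it is given unchanged; otherwise it configures Python's standard logging to write to the console and, when `log_path` is set, to a file. `setup_tensorboard` creates a timestamped run directory and a `SummaryWriter`. `log_tensorboard` pushes the best chromosome's fitness, metric, band count, trainable-parameter count, and band layout to TensorBoard at each generation.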

plot_chromosome ¶

plot_chromosome(
    chromosome, image_height, title, results_path=None, name="current_best_chromosome"
)

Source code in eso/utils/logger.py

def plot_chromosome(
    chromosome, image_height, title, results_path=None, name="current_best_chromosome"
):
    plt.figure(figsize=(4.5, 4.5))
    for gene in chromosome.get_genes():
        position = gene.get_band_position()
        height = gene.get_band_height()

        # Draw the band as a horizontal span covering [position, position + height]
        plt.axhspan(position, position + height, alpha=0.5)

    # Fix the axis to the spectrogram height and put row 0 at the top, as in an image
    plt.ylim(0, image_height)
    plt.gca().invert_yaxis()
    rounded_fitness = round(chromosome.get_fitness(), 4)
    rounded_metric = round(chromosome.get_metric(), 4)
    plt.title("Fitness: " + str(rounded_fitness))
    plt.suptitle(
        title
        + ": "
        + str(rounded_metric)
        + ";Parameters:"
        + str(chromosome.get_trainable_parameters())
    )
    plt.tight_layout()
    # Save only when a results directory was given; the figure is returned either way
    if results_path is not None:
        plt.gcf().savefig(os.path.join(results_path, f"{name}.png"))
    return plt.gcf()

log_tensorboard ¶

log_tensorboard(
    best_chromosome,
    epoch,
    writer,
    tensorboard_log_dir,
    image_height,
    metric_name,
    results_path=None,
)

Source code in eso/utils/logger.py

def log_tensorboard(
    best_chromosome,
    epoch,
    writer,
    tensorboard_log_dir,
    image_height,
    metric_name,
    results_path=None,
):
    if tensorboard_log_dir is None:
        return

    if metric_name == "f1":
        suptitle_name = "F1-Score"
    else:
        suptitle_name = metric_name.capitalize()

    best_chromosome_fitness = best_chromosome.get_fitness()
    writer.add_scalar("Best Chromosome Fitness", best_chromosome_fitness, epoch)

    writer.add_scalar(
        "Best Chromosome Number of Bands", best_chromosome.num_genes, epoch
    )
    writer.add_scalar(
        f"Best Chromosome {suptitle_name}", best_chromosome.get_metric(), epoch
    )
    writer.add_scalar(
        "Best Chromosome Trainable Parameters",
        best_chromosome.get_trainable_parameters(),
        epoch,
    )
    # Create image
    figure = plot_chromosome(best_chromosome, image_height, suptitle_name, results_path)
    writer.add_figure("Best Chromosome", figure, epoch)
    plt.close()

setup_tensorboard ¶

setup_tensorboard(tensorboard_log_dir, logger)

Source code in eso/utils/logger.py

def setup_tensorboard(tensorboard_log_dir, logger):
    if tensorboard_log_dir is not None:
        tensorboard_log_dir = os.path.join(
            tensorboard_log_dir, datetime.now().strftime("%Y%m%d-%H%M%S")
        )
        logger.debug(f"Logging training to {tensorboard_log_dir}")
        os.makedirs(tensorboard_log_dir, exist_ok=True)
        writer = SummaryWriter(tensorboard_log_dir)
        return writer
    else:
        return None

setup_logger ¶

setup_logger(logger, log_path, log_level, name=None, add_stream_handler=True)

Source code in eso/utils/logger.py

def setup_logger(logger, log_path, log_level, name=None, add_stream_handler=True):
    if logger is not None:
        return logger
    else:
        import logging

        if name is None:
            name = __name__
        logger = logging.getLogger(name)
        logger.setLevel(log_level)


        for handler in logger.handlers[:]:
            logger.removeHandler(handler)
        if add_stream_handler:
            logger.addHandler(logging.StreamHandler())
        if log_path is not None:
            os.makedirs(log_path, exist_ok=True)
            logger.addHandler(
                logging.FileHandler(os.path.join(log_path, f"{name}.log"))
            )
    return logger
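
The handler reset in `setup_logger` matters when the function is called more than once for the same logger name, since `logging.getLogger` returns the same object each time. The sketch below reproduces that branch with the standard library only, so it runs without importing `eso`. `build_logger` and the `eso_demo` name are hypothetical stand-ins, not part of the package.

```python
import logging
import os
import tempfile


def build_logger(name, log_path=None, log_level=logging.INFO, add_stream_handler=True):
    """Stand-in for the `else` branch of setup_logger (illustrative only)."""
    logger = logging.getLogger(name)
    logger.setLevel(log_level)

    # Drop handlers left by an earlier call so messages are not emitted twice
    for handler in logger.handlers[:]:
        logger.removeHandler(handler)

    if add_stream_handler:
        logger.addHandler(logging.StreamHandler())
    if log_path is not None:
        os.makedirs(log_path, exist_ok=True)
        logger.addHandler(logging.FileHandler(os.path.join(log_path, f"{name}.log")))
    return logger


log_dir = tempfile.mkdtemp()
logger = build_logger("eso_demo", log_path=log_dir)
logger = build_logger("eso_demo", log_path=log_dir)  # second call replaces the handlers
logger.info("generation 1 finished")

handler_count = len(logger.handlers)
log_file = os.path.join(log_dir, "eso_demo.log")
```

After both calls the logger still carries exactly one console handler and one file handler, and the message appears once in `eso_demo.log`.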

`eso.utils.unpickler`¶

`CPU_Unpickler` is a `pickle.Unpickler` subclass that remaps pickled tensor storages onto the available device: CUDA when `torch.cuda.is_available()` returns true, otherwise CPU. Use it when loading a chromosome saved on a CUDA host onto a CPU-only machine for inspection or inference.

CPU_Unpickler ¶

Bases: Unpickler

find_class ¶

find_class(module, name)

Source code in eso/utils/unpickler.py

def find_class(self, module, name):
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    if module == "torch.storage" and name == "_load_from_bytes":
        return lambda b: torch.load(io.BytesIO(b), map_location=device)
    else:
        return super().find_class(module, name)
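
The redirection works because `pickle` resolves every global it needs through `find_class`. The sketch below shows the same hook using only the standard library, so it runs without torch. `RedirectingUnpickler` is a hypothetical class, and swapping `fractions.Fraction` for a float-returning callable stands in for swapping `torch.storage._load_from_bytes` for a `torch.load` call that sets `map_location`.

```python
import fractions
import io
import pickle


class RedirectingUnpickler(pickle.Unpickler):
    """Illustrative analogue of CPU_Unpickler: intercept one global lookup."""

    def find_class(self, module, name):
        if module == "fractions" and name == "Fraction":
            # Return a replacement callable; pickle calls it with the stored arguments
            return lambda *args: float(fractions.Fraction(*args))
        # Every other lookup keeps the default behaviour
        return super().find_class(module, name)


payload = pickle.dumps({"ratio": fractions.Fraction(3, 4), "label": "demo"})

# An Unpickler reads from a binary file-like object and exposes load()
result = RedirectingUnpickler(io.BytesIO(payload)).load()
```

Because `CPU_Unpickler` is a `pickle.Unpickler` subclass, it is used the same way: construct it with a file opened in binary mode and call `load()`.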

`eso.utils.preprocessing`¶

path `instance-attribute` ¶

annotation_file_name `instance-attribute` ¶

file_type `instance-attribute` ¶

audio_extension `instance-attribute` ¶

positive_class `instance-attribute` ¶

sample_rate_unpreprocessed `instance-attribute` ¶

species_folder `instance-attribute` ¶

lowpass_cutoff `instance-attribute` ¶

downsample_rate `instance-attribute` ¶

nyquist_rate `instance-attribute` ¶

segment_duration `instance-attribute` ¶

positive_class `instance-attribute` ¶

negative_class `instance-attribute` ¶

nb_negative_class `instance-attribute` ¶

audio_path `instance-attribute` ¶

annotations_path `instance-attribute` ¶

saved_data_path `instance-attribute` ¶

training_files `instance-attribute` ¶

n_mels `instance-attribute` ¶

f_min `instance-attribute` ¶

f_max `instance-attribute` ¶

file_type `instance-attribute` ¶

audio_extension `instance-attribute` ¶

apply_preprocessing `instance-attribute` ¶

n_fft `instance-attribute` ¶

hop_length `instance-attribute` ¶

`eso.utils.AnnotationReader`¶

path `instance-attribute` ¶

annotation_file_name `instance-attribute` ¶

file_type `instance-attribute` ¶

audio_extension `instance-attribute` ¶

positive_class `instance-attribute` ¶

`eso.utils.settings`¶

BaseConfig `dataclass` ¶

AlgorithmConfig `dataclass` ¶

max_generations `class-attribute` `instance-attribute` ¶

GeneticOperatorConfig `dataclass` ¶

mutation_rate `class-attribute` `instance-attribute` ¶

crossover_rate `class-attribute` `instance-attribute` ¶

reproduction_rate `class-attribute` `instance-attribute` ¶

mutation_height_range `class-attribute` `instance-attribute` ¶

mutation_position_range `class-attribute` `instance-attribute` ¶

SelectionOperatorConfig `dataclass` ¶

tournament_size `class-attribute` `instance-attribute` ¶

DataConfig `dataclass` ¶

force_recreate_dataset `class-attribute` `instance-attribute` ¶

keep_in_memory `class-attribute` `instance-attribute` ¶

species_folder `class-attribute` `instance-attribute` ¶

train_size `class-attribute` `instance-attribute` ¶

test_size `class-attribute` `instance-attribute` ¶

reshuffle `class-attribute` `instance-attribute` ¶

positive_class `class-attribute` `instance-attribute` ¶

negative_class `class-attribute` `instance-attribute` ¶

PreprocessingConfig `dataclass` ¶

sample_rate `class-attribute` `instance-attribute` ¶

lowpass_cutoff `class-attribute` `instance-attribute` ¶

downsample_rate `class-attribute` `instance-attribute` ¶

nyquist_rate `class-attribute` `instance-attribute` ¶

segment_duration `class-attribute` `instance-attribute` ¶

nb_negative_class `class-attribute` `instance-attribute` ¶

file_type `class-attribute` `instance-attribute` ¶

audio_extension `class-attribute` `instance-attribute` ¶

n_fft `class-attribute` `instance-attribute` ¶

hop_length `class-attribute` `instance-attribute` ¶

n_mels `class-attribute` `instance-attribute` ¶

f_min `class-attribute` `instance-attribute` ¶

f_max `class-attribute` `instance-attribute` ¶

PopulationConfig `dataclass` ¶

pop_size `class-attribute` `instance-attribute` ¶

GeneConfig `dataclass` ¶

min_position `class-attribute` `instance-attribute` ¶

max_position `class-attribute` `instance-attribute` ¶

min_height `class-attribute` `instance-attribute` ¶

max_height `class-attribute` `instance-attribute` ¶

band_position `class-attribute` `instance-attribute` ¶

band_height `class-attribute` `instance-attribute` ¶

spec_height `class-attribute` `instance-attribute` ¶

minimum_gene_height `class-attribute` `instance-attribute` ¶

ChromosomeConfig `dataclass` ¶