pykanto.signal.spectrogram

A collection of functions used to create and manipulate spectrograms.

Functions

crop_spectrogram(spectrogram[, crop_x, crop_y])

Centre crops a spectrogram to given dimensions.

cut_or_pad_spectrogram(spectrogram, length)

Cut or pad a spectrogram to be a given length.

extract_windows(spectrograms, wlength)

Extract windows from multiple spectrograms.

flatten_spectrograms(windows)

Return a numba typed list containing the 2d array collapsed into one dimension.

get_indv_units(dataset, keys, ID[, pad, ...])

Returns spectrogram representations of the units, or the average of the units, present in the vocalisations of an ID in the dataset.

get_indv_units_parallel(dataset[, pad, ...])

Parallel implementation of get_indv_units().

get_unit_spectrograms(spectrogram, onsets, ...)

Get an array containing spectrograms for every unit in a given song.

get_vocalisation_units(dataset, key[, ...])

Returns spectrogram representations of the units present in a vocalisation (e.g. in a song) or their average.

pad_spectrogram(spectrogram, pad_length)

Centre pads a spectrogram to a given length.

retrieve_spectrogram(nparray_dir)

Loads a spectrogram that was saved as a numpy array.

save_melspectrogram(dataset, key[, ...])

Computes and saves a melspectrogram as a numpy array using dataset parameters.

window(spectrogram, wlength)

Extract windows of length 'wlength' from a spectrogram.

pykanto.signal.spectrogram.save_melspectrogram(dataset: KantoData, key: str, dereverb: bool = True, bandpass: bool = True) → Dict[str, Path]

Computes and saves a melspectrogram as a numpy array using dataset parameters.

Parameters
  • dataset (KantoData) – A KantoData object.

  • key (str) – Reference of wav file to open.

  • dereverb (bool, optional) – Whether to apply dereverberation to the spectrogram. Defaults to True.

  • bandpass (bool, optional) – Whether to bandpass the spectrogram using the minimum and maximum frequencies of the audio segment’s bounding box. Defaults to True.

Returns

Key and location of each spectrogram.

Return type

Dict[str, Path]

pykanto.signal.spectrogram.retrieve_spectrogram(nparray_dir: pathlib.Path) → numpy.ndarray

Loads a spectrogram that was saved as a numpy array.

Parameters

nparray_dir (Path) – Path to the numpy array.

Returns

Spectrogram.

Return type

np.ndarray
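As an illustration of the save-and-reload round trip this function belongs to, the sketch below writes an array to a .npy file and reads it back with plain NumPy. It assumes the file is a standard .npy file; the file name is hypothetical, and whether retrieve_spectrogram calls np.load internally is an assumption.

```python
import tempfile
from pathlib import Path

import numpy as np

# Stand-in spectrogram: 4 frequency bins x 6 time frames.
spec = np.arange(24, dtype=np.float32).reshape(4, 6)

with tempfile.TemporaryDirectory() as tmp:
    nparray_dir = Path(tmp) / "example.npy"  # hypothetical file name
    np.save(nparray_dir, spec)               # write the array to disk
    loaded = np.load(nparray_dir)            # read it back into memory
```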

pykanto.signal.spectrogram.pad_spectrogram(spectrogram: np.ndarray, pad_length: int) → np.ndarray

Centre pads a spectrogram to a given length.

Parameters
  • spectrogram (np.ndarray) – Spectrogram to pad.

  • pad_length (int) – Full length of the padded spectrogram.

Returns

Padded spectrogram.

Return type

np.ndarray
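Centre padding can be sketched with np.pad as follows. This is not pykanto's code: the helper name centre_pad is hypothetical, and both the choice of the time axis (axis 1) and the fill value (the spectrogram's minimum) are assumptions made for illustration.

```python
import numpy as np


def centre_pad(spectrogram: np.ndarray, pad_length: int) -> np.ndarray:
    """Pad the time axis (axis 1) on both sides until it has pad_length frames."""
    missing = pad_length - spectrogram.shape[1]
    if missing <= 0:
        return spectrogram  # already long enough, nothing to pad
    left = missing // 2
    right = missing - left  # any odd frame goes to the right side
    # Assumption: fill with the minimum value of the spectrogram.
    return np.pad(
        spectrogram,
        [(0, 0), (left, right)],
        mode="constant",
        constant_values=spectrogram.min(),
    )


spec = np.arange(1, 13, dtype=float).reshape(3, 4)
padded = centre_pad(spec, 9)  # 2 frames added on the left, 3 on the right
```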

pykanto.signal.spectrogram.crop_spectrogram(spectrogram: np.ndarray, crop_x: int = 0, crop_y: int = 0) → np.ndarray

Centre crops a spectrogram to given dimensions.

Parameters
  • spectrogram (np.ndarray) – Spectrogram to crop.

  • crop_x (int, optional) – Final x length, > 0. Defaults to 0 (no crop).

  • crop_y (int, optional) – Final y length, > 0. Defaults to 0 (no crop).

Returns

Cropped spectrogram.

Return type

np.ndarray
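A minimal sketch of centre cropping with array slicing, where a value of 0 leaves an axis untouched. The helper name centre_crop is hypothetical, and mapping x to the time axis (columns) and y to the frequency axis (rows) is an assumption rather than something the documentation states.

```python
import numpy as np


def centre_crop(spectrogram: np.ndarray, crop_x: int = 0, crop_y: int = 0) -> np.ndarray:
    """Keep the central crop_x columns and crop_y rows of a 2d array."""
    rows, cols = spectrogram.shape
    # 0 means "no crop"; requests larger than the array are clamped to its size.
    crop_x = min(crop_x, cols) if crop_x > 0 else cols
    crop_y = min(crop_y, rows) if crop_y > 0 else rows
    start_x = (cols - crop_x) // 2
    start_y = (rows - crop_y) // 2
    return spectrogram[start_y:start_y + crop_y, start_x:start_x + crop_x]


spec = np.arange(40).reshape(4, 10)
cropped = centre_crop(spec, crop_x=6)  # crop the columns only
```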

pykanto.signal.spectrogram.cut_or_pad_spectrogram(spectrogram: np.ndarray, length: int) → np.ndarray

Cut or pad a spectrogram to be a given length.

Parameters
  • spectrogram (np.ndarray) – Spectrogram to cut or pad.

  • length (int) – Final desired length, in frames.

Returns

Cut or padded spectrogram.

Return type

np.ndarray

pykanto.signal.spectrogram.get_unit_spectrograms(spectrogram: np.ndarray, onsets: np.ndarray, offsets: np.ndarray, sr: int = 22050, hop_length: int = 512) → np.ndarray

Get an array containing spectrograms for every unit in a given song.

Parameters
  • spectrogram (np.ndarray) – Spectrogram for a single song.

  • onsets (np.ndarray) – Unit onsets, in seconds.

  • offsets (np.ndarray) – Unit offsets, in seconds.

  • sr (int) – Sampling rate, in Hz. Defaults to 22050.

  • hop_length (int) – Hop length, in samples. Defaults to 512.

Returns

An array of arrays, one per unit.

Return type

np.ndarray
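Because onsets and offsets are given in seconds while spectrogram columns are frames, the times must be converted before slicing. The sketch below uses the common convention frame = floor(time * sr / hop_length); the helper names are hypothetical, it returns a plain list rather than an array, and the exact rounding pykanto applies is an assumption.

```python
import numpy as np


def seconds_to_frames(times, sr: int = 22050, hop_length: int = 512) -> np.ndarray:
    """Convert times in seconds to spectrogram frame indices."""
    # Each spectrogram column advances hop_length samples.
    return np.floor(np.asarray(times) * sr / hop_length).astype(int)


def split_units(spectrogram, onsets, offsets, sr: int = 22050, hop_length: int = 512):
    """Slice one sub-spectrogram per (onset, offset) pair along the time axis."""
    starts = seconds_to_frames(onsets, sr, hop_length)
    ends = seconds_to_frames(offsets, sr, hop_length)
    return [spectrogram[:, s:e] for s, e in zip(starts, ends)]


spec = np.random.default_rng(0).random((8, 100))  # 8 bins x 100 frames
onsets = np.array([0.1, 0.5])
offsets = np.array([0.3, 0.9])
units = split_units(spec, onsets, offsets)
```

With the default sr and hop_length, the two units span frames 4 to 12 and 21 to 38, so they differ in width.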

pykanto.signal.spectrogram.get_vocalisation_units(dataset: KantoData, key: str, song_level: bool = False) → Dict[str, np.ndarray | List[np.ndarray]]

Returns spectrogram representations of the units present in a vocalisation (e.g. in a song) or their average.

Parameters
  • dataset (KantoData) – A KantoData object.

  • key (str) – Single vocalisation locator (key).

  • song_level (bool, optional) – Whether to return average of all units. Defaults to False.

Returns

If song_level = True, a dictionary with the key and the average of all its units, padded to the maximum duration. If song_level = False, a dictionary with the key and a list of unit spectrograms, without padding.

Return type

Dict[str, np.ndarray | List[np.ndarray]]
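The two return shapes can be illustrated with plain NumPy. In this sketch the key "song_1" is hypothetical, units are centre padded with zeros before averaging, and both of those padding choices are assumptions rather than pykanto's documented behaviour.

```python
import numpy as np

# Two stand-in unit spectrograms of different durations (2 and 4 frames).
units = [np.full((2, 2), 1.0), np.full((2, 4), 3.0)]

# song_level = False: the key maps to the list of units, without padding.
per_unit = {"song_1": units}

# song_level = True: pad every unit to the longest duration, then average.
longest = max(u.shape[1] for u in units)
padded = []
for u in units:
    missing = longest - u.shape[1]
    left = missing // 2
    # Assumption: centre padding with zeros.
    padded.append(np.pad(u, [(0, 0), (left, missing - left)]))
average = np.mean(padded, axis=0)
song_level = {"song_1": average}
```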

pykanto.signal.spectrogram.get_indv_units(dataset: KantoData, keys: List[str], ID: str, pad: bool = True, song_level: bool = False) → Dict[str, Path]

Returns spectrogram representations of the units, or the average of the units, present in the vocalisations of an ID in the dataset. Saves the data as a pickled dictionary and returns its location.

Parameters
  • dataset (KantoData) – Source dataset.

  • keys (List[str]) – List of keys belonging to an ID.

  • ID (str) – The ID whose vocalisations are processed.

  • pad (bool, optional) – Whether to pad spectrograms to the maximum length. Defaults to True.

  • song_level (bool, optional) – Whether to return the average of all units. Defaults to False.

Returns

ID and location of its pickled dictionary.

Return type

Dict[str, Path]

pykanto.signal.spectrogram.get_indv_units_parallel(dataset: KantoData, pad: bool = True, song_level: bool = False, num_cpus: float | None = None) → Dict[str, Dict[str, Path]]

Parallel implementation of get_indv_units().

pykanto.signal.spectrogram.window(spectrogram: np.ndarray, wlength: int) → Iterator[np.ndarray]

Extract windows of length ‘wlength’ from a spectrogram. Jitted.

Parameters
  • spectrogram (np.ndarray) – Spectrogram to window.

  • wlength (int) – Desired window length.

Yields

Iterator[np.ndarray] – A single window.
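A generator of sliding windows can be sketched in plain Python as below. The name sliding_windows is hypothetical, and the step of one frame and the skipping of partial windows are assumptions; the jitted pykanto function may differ in both respects.

```python
from typing import Iterator

import numpy as np


def sliding_windows(spectrogram: np.ndarray, wlength: int) -> Iterator[np.ndarray]:
    """Yield consecutive windows of wlength frames along the time axis."""
    n_frames = spectrogram.shape[1]
    # Assumption: advance one frame at a time and skip incomplete windows.
    for start in range(n_frames - wlength + 1):
        yield spectrogram[:, start:start + wlength]


spec = np.arange(12).reshape(2, 6)
windows = list(sliding_windows(spec, wlength=4))  # 6 - 4 + 1 = 3 windows
```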

pykanto.signal.spectrogram.extract_windows(spectrograms: numba.typed.List[np.ndarray], wlength: int) → Tuple[numba.typed.List[np.ndarray], List[int]]

Extract windows from multiple spectrograms. Jitted.

Parameters
  • spectrograms (numba.typed.List[np.ndarray]) – Spectrograms to window.

  • wlength (int) – Desired window length.

Returns

A tuple containing a list with the resulting windows and a list with the number of windows extracted from each spectrogram.

Return type

Tuple[numba.typed.List[np.ndarray], List[int]]
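The returned window counts make it possible to map each window back to its source spectrogram. The sketch below shows that bookkeeping with ordinary Python lists standing in for numba typed lists; the function name and the one-frame step are assumptions.

```python
import numpy as np


def extract_all(spectrograms, wlength: int):
    """Collect windows from every spectrogram plus a per-spectrogram count."""
    windows, counts = [], []
    for spec in spectrograms:
        n = 0
        # Assumption: full windows only, advancing one frame at a time.
        for start in range(spec.shape[1] - wlength + 1):
            windows.append(spec[:, start:start + wlength])
            n += 1
        counts.append(n)
    return windows, counts


specs = [np.zeros((2, 5)), np.zeros((2, 7))]
windows, counts = extract_all(specs, wlength=4)
```

Here the counts sum to the total number of windows, so windows[:counts[0]] belong to the first spectrogram and the rest to the second.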

pykanto.signal.spectrogram.flatten_spectrograms(windows: numba.typed.List[np.ndarray]) → numba.typed.List[np.ndarray]

Return a numba typed list containing the 2d array collapsed into one dimension. Jitted.

Parameters

windows (numba.typed.List[np.ndarray]) – List of 2d spectrograms.

Returns

The same list, now containing 1d spectrograms.

Return type

numba.typed.List[np.ndarray]
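Collapsing each 2d window into one dimension can be sketched as follows, again with a plain Python list in place of a numba typed list. Row-major ordering of the flattened values is an assumption.

```python
import numpy as np

# Two stand-in windows with different widths.
windows = [np.arange(6).reshape(2, 3), np.arange(8).reshape(2, 4)]

# Flatten each 2d array into a 1d array (row-major order).
flat = [w.flatten() for w in windows]
```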