pykanto.signal.analysis

Basic audio feature calculations (spectral centroids, peak frequencies, etc.)

Functions

`approximate_minmax_frequency`(dataset[, key, ...])	Calculate approximate minimum and maximum frequencies from a mel spectrogram.
`get_mean_sd_mfcc`(S, n_mfcc)	Extract the mean and SD of n Mel-frequency cepstral coefficients (MFCCs) calculated from a log-power Mel spectrogram.
`get_peak_freqs`(dataset, spectrograms[, ...])	Return the peak frequencies of each spectrogram in an array of spectrograms.
`spec_centroid_bandwidth`(dataset[, key, ...])	Calculate a vocalisation's spectral centroid and bandwidth from a mel spectrogram.

pykanto.signal.analysis.get_peak_freqs(dataset: KantoData, spectrograms: np.ndarray, melscale: bool = True, threshold: float = 0.3) → np.ndarray

Return the peak frequencies of each spectrogram in an array of spectrograms.

Parameters

dataset (KantoData) – Vocalisation dataset.
spectrograms (np.ndarray) – Array of np.ndarray spectrograms.
melscale (bool, optional) – Are the spectrograms in the mel scale? Defaults to True.
threshold (float, optional) – Threshold for peak detection. Defaults to 0.3.

Returns

Array with peak frequencies.

Return type

np.ndarray
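
As a rough illustration of what a "peak frequency per spectrogram" can mean, the sketch below picks the frequency bin with the highest time-averaged energy in each spectrogram. It is not pykanto's implementation: it ignores the `threshold` and `melscale` parameters, and the function name `peak_freqs_sketch` and the `freqs` argument (bin centre frequencies in Hz) are hypothetical names introduced only for this example.

```python
import numpy as np


def peak_freqs_sketch(spectrograms, freqs):
    """Frequency of the strongest bin in each spectrogram (illustrative only).

    spectrograms: iterable of 2-D arrays shaped (n_freq_bins, n_frames).
    freqs: centre frequency in Hz of each bin (hypothetical input).
    """
    peaks = []
    for spec in spectrograms:
        # Average over time, then pick the frequency bin with the most energy
        mean_spectrum = np.asarray(spec, dtype=float).mean(axis=1)
        peaks.append(freqs[int(np.argmax(mean_spectrum))])
    return np.array(peaks)


# Two toy spectrograms with all their energy in a single bin each
freqs = np.linspace(0.0, 8000.0, 5)  # 0, 2000, 4000, 6000, 8000 Hz
spec_a = np.zeros((5, 3))
spec_a[1, :] = 1.0
spec_b = np.zeros((5, 4))
spec_b[3, :] = 2.0
peaks = peak_freqs_sketch([spec_a, spec_b], freqs)
```

The spectrograms may have different numbers of frames, which is why they are handled one at a time rather than stacked into a single 3-D array.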

pykanto.signal.analysis.spec_centroid_bandwidth(dataset: KantoData, key: None | str = None, spec: None | np.ndarray = None, plot: bool = False) → Tuple[np.ndarray, np.ndarray]

Calculate a vocalisation’s spectral centroid and bandwidth from a mel spectrogram. You can either provide a key string for a vocalisation or its mel spectrogram directly.

Parameters

dataset (KantoData) – Dataset object with your data.
key (None | str = None) – Key of a vocalisation. Defaults to None.
spec (None | np.ndarray = None) – Mel spectrogram. Defaults to None.
plot (bool, optional) – Whether to show the result. Defaults to False.

Returns

A tuple with the centroids and bandwidths.

Return type

Tuple[np.ndarray, np.ndarray]
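
For readers unfamiliar with these two features, the sketch below computes them per frame using the common definitions: the centroid is the energy-weighted mean frequency, and the bandwidth is the energy-weighted standard deviation around that centroid. This is a NumPy-only illustration under those assumed definitions, not pykanto's code; `centroid_bandwidth_sketch` and `freqs` are hypothetical names.

```python
import numpy as np


def centroid_bandwidth_sketch(spec, freqs):
    """Per-frame spectral centroid and bandwidth (illustrative only).

    spec: 2-D array shaped (n_freq_bins, n_frames), non-negative energy.
    freqs: centre frequency in Hz of each bin (hypothetical input).
    Frames with zero total energy are not handled here.
    """
    spec = np.asarray(spec, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    # Normalise each frame so its bins sum to 1; these are the weights
    weights = spec / spec.sum(axis=0, keepdims=True)
    # Centroid: weighted mean frequency of each frame
    centroid = (freqs[:, None] * weights).sum(axis=0)
    # Bandwidth: weighted standard deviation around the centroid
    deviation = freqs[:, None] - centroid[None, :]
    bandwidth = np.sqrt((weights * deviation**2).sum(axis=0))
    return centroid, bandwidth


freqs = np.array([1000.0, 2000.0, 3000.0])
# Frame 0: all energy at 2 kHz. Frame 1: energy split between 1 and 3 kHz.
spec = np.array([[0.0, 1.0],
                 [1.0, 0.0],
                 [0.0, 1.0]])
centroids, bandwidths = centroid_bandwidth_sketch(spec, freqs)
```

Both frames share a 2 kHz centroid, but only the second has a non-zero bandwidth (1 kHz), which shows why the two values are returned together.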

pykanto.signal.analysis.get_mean_sd_mfcc(S: numpy.ndarray, n_mfcc: int) → numpy.ndarray

Extract the mean and SD of n Mel-frequency cepstral coefficients (MFCCs) calculated from a log-power Mel spectrogram.

Parameters

S (np.ndarray) – A log-power Mel spectrogram.
n_mfcc (int) – Number of coefficients to return.

Returns

Array containing mean and std of each coefficient (len = n_mfcc*2).

Return type

np.ndarray
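
To make the shape of the output concrete, here is a minimal NumPy sketch that derives MFCC-like coefficients by applying a DCT-II along the mel axis of a log-power mel spectrogram, then summarises each coefficient over time. It is an illustration, not pykanto's implementation: the DCT here is unnormalised, the means-then-SDs ordering of the output is an assumption, and `mean_sd_mfcc_sketch` is a hypothetical name.

```python
import numpy as np


def mean_sd_mfcc_sketch(S, n_mfcc):
    """Mean and SD over time of the first n_mfcc DCT-II coefficients.

    S: 2-D log-power mel spectrogram shaped (n_mels, n_frames).
    Returns a 1-D array of length 2 * n_mfcc (means first: an assumption).
    """
    S = np.asarray(S, dtype=float)
    n_mels = S.shape[0]
    # Unnormalised DCT-II basis: row k is cos(pi * k * (2m + 1) / (2 * n_mels))
    k = np.arange(n_mfcc)[:, None]
    m = np.arange(n_mels)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n_mels))
    mfcc = basis @ S  # shape (n_mfcc, n_frames)
    # Summarise each coefficient across frames and concatenate
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])


rng = np.random.default_rng(0)
S = rng.normal(size=(40, 25))  # stand-in for a real log-power mel spectrogram
features = mean_sd_mfcc_sketch(S, n_mfcc=13)
```

The result is a fixed-length vector regardless of how many frames the vocalisation has, which is what makes this summary convenient as a feature for comparing vocalisations of different durations.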

pykanto.signal.analysis.approximate_minmax_frequency(dataset: KantoData, key: None | str = None, spec: None | np.ndarray = None, roll_percents: list[float] = [0.95, 0.1], plot: bool = False) → Tuple[np.ndarray, np.ndarray]

Calculate approximate minimum and maximum frequencies from a mel spectrogram. You can either provide a key string for a vocalisation or its mel spectrogram directly.

Parameters

dataset (KantoData) – Dataset object with your data.
key (None | str = None) – Key of a vocalisation. Defaults to None.
spec (None | np.ndarray = None) – Mel spectrogram. Defaults to None.
roll_percents (list[float], optional) – Percentage of energy contained in bin. Defaults to [0.95, 0.1].
plot (bool, optional) – Whether to show the result. Defaults to False.

Returns

A tuple with the approximate minimum and maximum frequencies, in this order.

Return type

Tuple[np.ndarray, np.ndarray]
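
One way to read `roll_percents` is as a pair of spectral roll-off fractions: for each frame, the frequency below which that fraction of the total energy lies. The NumPy sketch below implements that idea, assuming 0.95 yields the approximate maximum and 0.1 the approximate minimum. It is not pykanto's code, and `rolloff_sketch` and `freqs` are hypothetical names.

```python
import numpy as np


def rolloff_sketch(spec, freqs, roll_percent):
    """Per-frame roll-off frequency (illustrative only).

    spec: 2-D array shaped (n_freq_bins, n_frames), non-negative energy.
    freqs: centre frequency in Hz of each bin (hypothetical input).
    roll_percent: fraction of total energy, between 0 and 1.
    """
    spec = np.asarray(spec, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    # Running total of energy from the lowest bin upwards
    cumulative = np.cumsum(spec, axis=0)
    # Energy each frame must accumulate to reach the requested fraction
    target = roll_percent * cumulative[-1]
    # Index of the first bin where the running total reaches the target
    idx = np.argmax(cumulative >= target[None, :], axis=0)
    return freqs[idx]


freqs = np.arange(1, 11) * 1000.0  # bins centred at 1 to 10 kHz
spec = np.zeros((10, 2))
spec[2:8, :] = 1.0  # energy only between 3 and 8 kHz
max_freq = rolloff_sketch(spec, freqs, 0.95)
min_freq = rolloff_sketch(spec, freqs, 0.1)
```

In this toy case the two roll-off values recover the edges of the occupied band (3 kHz and 8 kHz), which is why the result is best described as an approximation of the minimum and maximum frequencies rather than an exact measurement.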