{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Segmenting vocalisations" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "> 'Vocalisation' is a common but slightly vague term. In this guide, I use it to\n", "> refer to a single song or call, while 'unit' refers to a single note or\n", "> syllable.\n", "\n", "## Segmenting vocalisations into units\n", "\n", "Following the example in the previous section, we can first create a project\n", "structure:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "tags": [ "hide-input", "hide-output" ] }, "outputs": [], "source": [ "from pathlib import Path\n", "import pkg_resources\n", "from pykanto.utils.paths import ProjDirs, get_file_paths, get_wavs_w_annotation\n", "from pykanto.dataset import KantoData\n", "from pykanto.parameters import Parameters" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# Prepare a project:\n", "DATASET_ID = \"BENGALESE_FINCH\"\n", "DATA_PATH = Path(pkg_resources.resource_filename(\"pykanto\", \"data\")) \n", "PROJECT_ROOT = Path(DATA_PATH).parent\n", "RAW_DATA = DATA_PATH / \"raw\" / DATASET_ID\n", "DIRS = ProjDirs(PROJECT_ROOT, RAW_DATA, DATASET_ID, mkdir=True)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "Then, we can set audio parameters and load the audio files into a `KantoData`\n", "object:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "tags": [ "hide-output" ] }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "6c33f64d8ca148298b712f65125f51a5", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Loading JSON files: 0%| | 0/2 [00:00" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Plot the first vocalisation in the dataset\n", "key = dataset.data.index[0]\n", "dataset.plot(key, max_lenght =5)" ] }, { "attachments": {}, "cell_type": 
"markdown", "metadata": {}, "source": [ "Finally, to segment the vocalisations into units, we do the following:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "0250eabcf68440088c3955adec1c6306", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Finding units in vocalisations: 0%| | 0/2 [00:00" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "# Plot another vocalisation in the dataset, now segmented:\n", "key = dataset.data.index[1]\n", "dataset.plot(key, segmented=True, max_lenght=5)" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "If you want to access the onsets or offsets manually, simply get them from the dataset:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "First 10 note onset times: \n", "[0.516 0.68 0.836 1.02 1.204 1.376 1.536 1.712 1.876 2.048]\n" ] } ], "source": [ "\n", "print(f\"First 10 note onset times: \\n{dataset.data.loc[key, 'onsets'][:10]}\")\n" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "```{admonition} Note: \n", ":class: note\n", "\n", "This method works reasonably well, especially after some fine-tuning of the parameters (see {py:class}`~pykanto.parameters.Parameters`). However, it will inevitably fail in difficult cases, for example if the signal-to-noise ratio is very low or there is a lot of amplitude modulation within notes. In such cases you might have to resort to manual segmentation, using a tool such as [Sonic Visualiser](https://www.sonicvisualiser.org/), or train a model to work with your specific species and recording conditions. 
Here is a very good option if you want to do the latter:\n", "\n", "TweetyNet: a single neural network model that learns how to segment spectrograms of birdsong into annotated syllables.\n", "[elifesciences.org/articles/63853](https://elifesciences.org/articles/63853)\n", "```\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.9.12 ('pykanto-dev')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.15" }, "orig_nbformat": 4, "vscode": { "interpreter": { "hash": "cf30c6a63fc6852a8d910622565c3348d4a7fab8fc38710c97d8db63a595f32d" } } }, "nbformat": 4, "nbformat_minor": 2 }