{ "cells": [ { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "# Preparing long recordings" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "'Long recording segmentation' here refers to the extraction of regions of\n", "interest from long, noisy raw recordings, along with any relevant metadata.\n", "Pykanto is agnostic as to how you find those segments; they will usually\n", "contain entire songs or calls that you want to analyse in more detail.\n", "\n", "For this guide I have used a friendly application, [sonic\n", "visualiser](https://www.sonicvisualiser.org/), to manually draw boxes around\n", "individual regions of interest, and store time and frequency information in `.xml` files. To read these, I provide `pykanto` with a custom parser,\n", "called `parse_sonic_visualiser_xml`. \n", "\n", "This kind of manual annotation can be time-consuming, You can use `pykanto` to, for example, create a training dataset for a deep\n", "learning model, and then use segmenting information predicted by that model to\n", "create a larger dataset in a more automated way\n", "\n", "If you have annotation files that are formatted differently, you can either\n", "transform them into the format used here, or write your own parser—it just needs\n", "to return a {py:class}`~pykanto.utils.types.SegmentAnnotation` object. You can\n", "find examples of the `.xml` file format in the `/data` folder installed with the\n", "package." ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "## Segmenting files using existing `.xml` metadata files.\n", "\n", "This requires folder(s) of audio files containing .xml files with onset, offset\n", "and frequency information for each segment of interest." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "hide-input", "hide-output" ] }, "outputs": [], "source": [ "from pathlib import Path\n", "import pkg_resources\n", "from pykanto.signal.segment import segment_files_parallel\n", "from pykanto.utils.custom import parse_sonic_visualiser_xml\n", "from pykanto.utils.paths import ProjDirs, get_file_paths, get_wavs_w_annotation" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "remove-output" ] }, "outputs": [], "source": [ "# Change the below to your own data directory and dataset name\n", "dataset_name = \"BENGALESE_FINCH\"\n", "data_dir = Path(pkg_resources.resource_filename(\"pykanto\", \"data\")) \n", "project_root = Path(data_dir).parent\n", "raw_data = data_dir / \"raw\" / dataset_name\n", "\n", "DIRS = ProjDirs(project_root, raw_data, dataset_name, mkdir=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "remove-output" ] }, "outputs": [], "source": [ "# Find files and their metadata (assumed to be in the same directory)\n", "wav_filepaths, xml_filepaths = [\n", " get_file_paths(DIRS.RAW_DATA, [ext]) for ext in [\".wav\", \".xml\"]\n", "]\n", "files_to_segment = get_wavs_w_annotation(wav_filepaths, xml_filepaths)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [ "remove-output" ] }, "outputs": [], "source": [ "# Segment all files, ignoring \"NOISE\" labels and segments shorter than 0.5\n", "# seconds or with a frequency range smaller than 200 Hz\n", "segment_files_parallel(\n", " files_to_segment,\n", " DIRS,\n", " resample=None,\n", " parser_func=parse_sonic_visualiser_xml,\n", " min_duration=0.5,\n", " min_freqrange=200,\n", " labels_to_ignore=[\"NOISE\"],\n", ")" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "And you are ready to start analysing your data!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Segmenting files with custom metadata fields" ] }, { "cell_type": "markdown", "metadata": { "tags": [ "hide-output" ] }, "source": [ "\n", "Let's say you are using\n", "[AudioMoth](https://www.openacousticdevices.info/audiomoth) recorders and want\n", "to retrieve some non-standard metadata from its audio files: (1) the device ID,\n", "and (2) the date and time time of an audio segment.\n", "\n", "Here's how you can do it:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "tags": [ "hide-output", "hide-input" ] }, "outputs": [], "source": [ "import datetime as dt\n", "import re\n", "from typing import Any, Dict\n", "import attr\n", "from attr import validators\n", "from dateutil.parser import parse\n", "from pykanto.signal.segment import ReadWav, SegmentMetadata, segment_files\n", "from pykanto.utils.custom import parse_sonic_visualiser_xml\n", "from pykanto.utils.paths import (get_file_paths, get_wavs_w_annotation,\n", " pykanto_data)\n", "from pykanto.utils.types import Annotation\n", "from pykanto.utils.io import makedir" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, to make it easier to see what fields are available you can create a\n", "`ReadWav` object from a file and print its metadata, like so:" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "tags": [ "hide-output" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ ",\n", " 'bit_depth': 16,\n", " 'bitrate': '768 Kbps',\n", " 'channels': 1,\n", " 'duration': '00:01',\n", " 'sample_rate': '48.0 KHz',\n", " })>,\n", " 'tags': ,\n", "})>\n" ] } ], "source": [ "# Loads a sample AudioMoth file, included with pykanto\n", "DIRS = pykanto_data(dataset=\"AM\")\n", "\n", "wav_dirs = get_file_paths(DIRS.RAW_DATA, extensions=['.WAV'])\n", "meta = ReadWav(wav_dirs[0]).all_metadata\n", "print(meta)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's acess the metadata of interest and tell `pykanto` that we want to add\n", "these to the `.JSON` files and, later, to our database.\n", "\n", "First, add any new attributes, along with their data type annotations and any\n", "validators to the Annotation class. This will make sure that your new\n", "attributes, or fields, are properly parsed." ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "tags": [ "hide-output" ] }, "outputs": [], "source": [ "@attr.s\n", "class CustomAnnotation(Annotation):\n", " rec_unit: str = attr.ib(validator=validators.instance_of(str))\n", " # This is intended as a short example, but in reality you could make sure that\n", " # this string can be parsed as a datetime object.\n", " datetime: str = attr.ib(validator=validators.instance_of(str))\n", "\n", "Annotation.__init__ = CustomAnnotation.__init__" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Then, [monkey-patch](https://en.wikipedia.org/wiki/Monkey_patch) the\n", "`get_metadata` methods of the ReadWav and SegmentMetadata classes to add any\n", "extra fields that your project might require. This will save you from having to\n", "define the full classes and their methods again from scratch. Some people would\n", "say this is ugly, and I'd tend to agree, but it is the most concise way of doing\n", "this that I could think of that still preserves full flexibility." ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "tags": [ "hide-output" ] }, "outputs": [], "source": [ "def ReadWav_patch(self) -> Dict[str, Any]:\n", " comment = self.all_metadata['tags'].comment[0]\n", " add_to_dict = {\n", " 'rec_unit': str(re.search(r\"AudioMoth.(.*?) at gain\", comment).group(1)),\n", " 'datetime': str(parse(re.search(r\"at.(.*?) \\(UTC\\)\", comment).group(1)))\n", " }\n", " return {**self.metadata.__dict__, **add_to_dict}\n", "\n", "\n", "def SegmentMetadata_patch(self) -> Dict[str, Any]:\n", " start = self.all_metadata.start_times[self.index] / self.all_metadata.sample_rate\n", " datetime = parse(self.all_metadata.datetime) + dt.timedelta(seconds=start)\n", " add_to_dict = {\n", " 'rec_unit': self.all_metadata.rec_unit,\n", " 'datetime': str(datetime),\n", " }\n", " return {**self.metadata.__dict__, **add_to_dict}\n", "\n", "\n", "ReadWav.get_metadata = ReadWav_patch\n", "SegmentMetadata.get_metadata = SegmentMetadata_patch" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now you can segment your annotated files like you would normally do - their\n", "metadata will contain your custom fields." ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "tags": [ "remove-output" ] }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Found 1 .WAV files in /home/nilomr/projects/pykanto/pykanto/data/raw/AM\n", "Found 1 .xml files in /home/nilomr/projects/pykanto/pykanto/data/raw/AM\n", "Finding and saving audio segments and their metadata: 100%|██████████| 1/1 [00:00<00:00, 30.90it/s]\n" ] } ], "source": [ "wav_filepaths, xml_filepaths = [get_file_paths(\n", " DIRS.RAW_DATA, [ext]) for ext in ['.WAV', '.xml']]\n", "files_to_segment = get_wavs_w_annotation(wav_filepaths, xml_filepaths)\n", "\n", "wav_outdir, json_outdir = [makedir(DIRS.SEGMENTED / ext)\n", " for ext in [\"WAV\", \"JSON\"]]\n", "\n", "segment_files(\n", " files_to_segment,\n", " wav_outdir,\n", " json_outdir,\n", " parser_func=parse_sonic_visualiser_xml\n", ")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note:** if you want to run this in paralell with ray (as in\n", "`segment_files_parallel`) monkey-patching will not work: for now, you will have\n", "to properly extend `ReadWav` and `SegmentMetadata`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3.9.12 ('pykanto-dev')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.15 (main, Nov 24 2022, 14:31:59) \n[GCC 11.2.0]" }, "orig_nbformat": 4, "vscode": { "interpreter": { "hash": "cf30c6a63fc6852a8d910622565c3348d4a7fab8fc38710c97d8db63a595f32d" } } }, "nbformat": 4, "nbformat_minor": 2 }