{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Basic workflow" ] }, { "attachments": {}, "cell_type": "markdown", "metadata": {}, "source": [ "We are going to load one of the very small datasets that are packaged with\n", "`pykanto`—this will be enough to check that everything is working as it should\n", "and to familiarise yourself with the package. See [project\n", "setup](./project-setup.md) to learn how to load your own data.\n", "\n", "```{admonition} Note:\n", ":class: note\n", "\n", "Creating a `KantoData` dataset requires that you have already set up your project directories (see [project setup](./project-setup.md)). Before either step, long files need to have been segmented into smaller chunks of interest (e.g., songs, song bouts). See [segmenting files](./segmenting-files.ipynb) for more information.\n", "Of the datasets packaged with `pykanto`, only the `GREAT_TIT` dataset has already been segmented. If you want to use another dataset, you will need to segment it first, as demonstrated in [segmenting files](./segmenting-files.ipynb) and [feature extraction](./feature-extraction.ipynb).\n", "```\n", "\n", "The `GREAT_TIT` dataset consists of a few songs from two male great tits (_Parus major_) in [my study population](http://wythamtits.com/), Wytham Woods, Oxfordshire, UK. 
Let's load the paths pointing to it and create a `KantoData` object:" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "tags": [] }, "outputs": [], "source": [ "from pykanto.utils.paths import pykanto_data\n", "from pykanto.dataset import KantoData\n", "from pykanto.parameters import Parameters" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "tags": [ "hide-output" ] }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "30248cdda5044587a0beccb94c5ff2e1", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Loading JSON files: 0%| | 0/20 [00:00, ?it/s]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "2022-11-28 15:01:24,902\tINFO services.py:1456 -- View the Ray dashboard at \u001b[1m\u001b[32mhttp://127.0.0.1:8265\u001b[39m\u001b[22m\n" ] }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "c8eea2b20ab647a8bc19ef4d96d248ae", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Preparing spectrograms: 0%| | 0/10 [00:00, ?it/s]" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stdout", "output_type": "stream", "text": [ "Done\n" ] }, { "data": { "text/html": [ "
| | species | ID | label | recorder | recordist | source_datetime | datetime | date | time | timezone | sample_rate | length_s | lower_freq | upper_freq | max_amplitude | min_amplitude | bit_depth | tech_comment | noise |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2021-B32-0415_05-11 | Great tit | B32 | | 24F319055FDF2205 | Nilo Merino Recalde | 2021-04-15 05:00:00 | 2021-04-15 05:07:22.866667 | 2021-04-15 | 05:07:22.866667 | UTC | 48000 | 1.139250 | 2506 | 5922 | 0.673711 | -0.666701 | 16 | Recorded at 05:00:00 15/04/2021 (UTC) by Audio... | False |
| 2021-B32-0415_05-15 | Great tit | B32 | | 24F319055FDF2205 | Nilo Merino Recalde | 2021-04-15 05:00:00 | 2021-04-15 05:08:16.520000 | 2021-04-15 | 05:08:16.520000 | UTC | 48000 | 1.194375 | 2392 | 5694 | 0.356706 | -0.351275 | 16 | Recorded at 05:00:00 15/04/2021 (UTC) by Audio... | False |
| 2021-B32-0415_05-21 | Great tit | B32 | | 24F319055FDF2205 | Nilo Merino Recalde | 2021-04-15 05:00:00 | 2021-04-15 05:09:27.600000 | 2021-04-15 | 05:09:27.600000 | UTC | 48000 | 1.188250 | 2392 | 5739 | 0.189776 | -0.188388 | 16 | Recorded at 05:00:00 15/04/2021 (UTC) by Audio... | False |