Preparing long recordings#
‘Long recording segmentation’ here refers to the extraction of regions of interest from long, noisy raw recordings, along with any relevant metadata. Pykanto is agnostic as to how you find those segments; they will usually contain entire songs or calls that you want to analyse in more detail.
For this guide I have used a friendly application, Sonic Visualiser, to manually draw boxes around individual regions of interest and store time and frequency information in .xml files. To read these, I provide pykanto with a custom parser, called parse_sonic_visualiser_xml.
This kind of manual annotation can be time-consuming. You can use pykanto to, for example, create a training dataset for a deep learning model, and then use segmenting information predicted by that model to create a larger dataset in a more automated way.
If you have annotation files that are formatted differently, you can either transform them into the format used here, or write your own parser: it just needs to return a SegmentAnnotation object. You can find examples of the .xml file format in the /data folder installed with the package.
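To give you an idea, here is a minimal sketch of what such a parser could look like for a hypothetical .csv annotation format (one 'start_sample,end_sample,label' row per segment). The field names mirror those produced by parse_sonic_visualiser_xml, but check the SegmentAnnotation definition in pykanto.utils.types for the exact fields your version expects; note that pykanto stores segment times in samples rather than seconds (see SegmentMetadata_patch below, which divides a start time by the sample rate).
import csv
from pathlib import Path

from pykanto.utils.types import SegmentAnnotation


def parse_my_csv(annotation_file: Path) -> SegmentAnnotation:
    """Toy parser for a hypothetical .csv annotation format with one
    'start_sample,end_sample,label' row per segment."""
    starts, ends, labels = [], [], []
    with open(annotation_file) as f:
        for start, end, label in csv.reader(f):
            starts.append(int(start))  # segment onsets, in samples
            ends.append(int(end))
            labels.append(label)
    return SegmentAnnotation(
        ID=annotation_file.stem,
        start_times=starts,
        durations=[end - start for start, end in zip(starts, ends)],
        end_times=ends,
        # This toy format carries no frequency information, so use
        # full-band placeholders:
        lower_freq=[0] * len(starts),
        upper_freq=[24000] * len(starts),
        label=labels,
        annotation_file=annotation_file,
    )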
Segmenting files using existing .xml metadata files#
This requires folder(s) of audio files and accompanying .xml files containing onset, offset and frequency information for each segment of interest.
from pathlib import Path

import pkg_resources

from pykanto.signal.segment import segment_files_parallel
from pykanto.utils.custom import parse_sonic_visualiser_xml
from pykanto.utils.paths import ProjDirs, get_file_paths, get_wavs_w_annotation

# Change the below to your own data directory and dataset name
dataset_name = "BENGALESE_FINCH"
data_dir = Path(pkg_resources.resource_filename("pykanto", "data"))
project_root = Path(data_dir).parent
raw_data = data_dir / "raw" / dataset_name
DIRS = ProjDirs(project_root, raw_data, dataset_name, mkdir=True)

# Find files and their metadata (assumed to be in the same directory)
wav_filepaths, xml_filepaths = [
    get_file_paths(DIRS.RAW_DATA, [ext]) for ext in [".wav", ".xml"]
]
files_to_segment = get_wavs_w_annotation(wav_filepaths, xml_filepaths)

# Segment all files, ignoring "NOISE" labels and segments shorter than 0.5
# seconds or with a frequency range smaller than 200 Hz
segment_files_parallel(
    files_to_segment,
    DIRS,
    resample=None,
    parser_func=parse_sonic_visualiser_xml,
    min_duration=0.5,
    min_freqrange=200,
    labels_to_ignore=["NOISE"],
)
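As a quick sanity check (illustrative, not part of the original workflow), you can count the segment files that segment_files_parallel wrote to the project's segmented-data directory:
# Illustrative check: list the segmented .wav files written above
segmented_wavs = get_file_paths(DIRS.SEGMENTED, [".wav"])
print(f"Extracted {len(segmented_wavs)} segments")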
And you are ready to start analysing your data!
Segmenting files with custom metadata fields#
Let’s say you are using AudioMoth recorders and want to retrieve some non-standard metadata from their audio files: (1) the device ID, and (2) the date and time of an audio segment.
Here’s how you can do it:
import datetime as dt
import re
from typing import Any, Dict

import attr
from attr import validators
from dateutil.parser import parse

from pykanto.signal.segment import ReadWav, SegmentMetadata, segment_files
from pykanto.utils.custom import parse_sonic_visualiser_xml
from pykanto.utils.paths import (
    get_file_paths,
    get_wavs_w_annotation,
    pykanto_data,
)
from pykanto.utils.types import Annotation
from pykanto.utils.io import makedir
First, to make it easier to see what fields are available, you can create a ReadWav object from a file and print its metadata, like so:
# Loads a sample AudioMoth file, included with pykanto
DIRS = pykanto_data(dataset="AM")
wav_dirs = get_file_paths(DIRS.RAW_DATA, extensions=['.WAV'])
meta = ReadWav(wav_dirs[0]).all_metadata
print(meta)
<WAVE({
'filepath': '/home/nilomr/projects/pykanto/pykanto/data/raw/AM/20210502_040000.WAV',
'filesize': '92.23 KiB',
'pictures': [],
'streaminfo': <WAVEStreamInfo({
'audio_format': <WAVEAudioFormat.PCM>,
'bit_depth': 16,
'bitrate': '768 Kbps',
'channels': 1,
'duration': '00:01',
'sample_rate': '48.0 KHz',
})>,
'tags': <RIFFTags({
'ISFT': ['Lavf57.83.100'],
'artist': ['AudioMoth 247AA5075E06337D'],
'comment': [
'Recorded at 04:00:00 02/05/2021 (UTC) by AudioMoth 247AA5075E06337D at gain setting 2 while battery state was 4.2V.',
],
})>,
})>
Now let’s access the metadata of interest and tell pykanto that we want to add these to the .JSON files and, later, to our database.
First, add any new attributes, along with their data type annotations and any validators, to the Annotation class. This will make sure that your new attributes, or fields, are properly parsed.
@attr.s
class CustomAnnotation(Annotation):
    rec_unit: str = attr.ib(validator=validators.instance_of(str))
    # This is intended as a short example, but in reality you could make sure
    # that this string can be parsed as a datetime object.
    datetime: str = attr.ib(validator=validators.instance_of(str))

# Have pykanto build Annotation objects with the extra fields:
Annotation.__init__ = CustomAnnotation.__init__
Then, monkey-patch the get_metadata methods of the ReadWav and SegmentMetadata classes to add any extra fields that your project might require. This will save you from having to define the full classes and their methods again from scratch. Some people would say this is ugly, and I’d tend to agree, but it is the most concise way of doing this that I could think of that still preserves full flexibility.
def ReadWav_patch(self) -> Dict[str, Any]:
    comment = self.all_metadata['tags'].comment[0]
    add_to_dict = {
        'rec_unit': str(re.search(r"AudioMoth.(.*?) at gain", comment).group(1)),
        'datetime': str(parse(re.search(r"at.(.*?) \(UTC\)", comment).group(1)))
    }
    return {**self.metadata.__dict__, **add_to_dict}

def SegmentMetadata_patch(self) -> Dict[str, Any]:
    # Offset the file's start time by the segment's onset (in seconds):
    start = self.all_metadata.start_times[self.index] / self.all_metadata.sample_rate
    datetime = parse(self.all_metadata.datetime) + dt.timedelta(seconds=start)
    add_to_dict = {
        'rec_unit': self.all_metadata.rec_unit,
        'datetime': str(datetime),
    }
    return {**self.metadata.__dict__, **add_to_dict}

ReadWav.get_metadata = ReadWav_patch
SegmentMetadata.get_metadata = SegmentMetadata_patch
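As a quick, illustrative check that the patches took effect, you can call the patched method on the sample file loaded earlier and look for the new keys:
# The patched method should now return the custom fields as well
patched = ReadWav(wav_dirs[0]).get_metadata()
print(patched['rec_unit'], patched['datetime'])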
Now you can segment your annotated files as you normally would: their metadata will contain your custom fields.
wav_filepaths, xml_filepaths = [
    get_file_paths(DIRS.RAW_DATA, [ext]) for ext in ['.WAV', '.xml']
]
files_to_segment = get_wavs_w_annotation(wav_filepaths, xml_filepaths)

# Create output folders for the segmented audio and their metadata
wav_outdir, json_outdir = [
    makedir(DIRS.SEGMENTED / ext) for ext in ["WAV", "JSON"]
]

segment_files(
    files_to_segment,
    wav_outdir,
    json_outdir,
    parser_func=parse_sonic_visualiser_xml
)
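To confirm that the custom fields were written, you can read back one of the resulting metadata files. This is an illustrative check that assumes at least one segment was extracted and that the metadata files are saved with a .JSON extension, as the folder name above suggests:
import json

# Illustrative: each segment's metadata file should include the new fields
json_file = get_file_paths(json_outdir, ['.JSON'])[0]
segment_metadata = json.loads(json_file.read_text())
print(segment_metadata['rec_unit'], segment_metadata['datetime'])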
Note: if you want to run this in parallel with ray (as in segment_files_parallel), monkey-patching will not work: for now, you will have to properly extend ReadWav and SegmentMetadata.
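For reference, here is a minimal sketch of what properly extending ReadWav could look like, reusing the logic from ReadWav_patch above. The subclass name is hypothetical, and SegmentMetadata would be extended analogously; defining the subclasses in an importable module is what allows ray workers to deserialise them.
# Hypothetical subclass, shown as a sketch only: how you wire it into the
# segmenting functions depends on your pykanto version.
class AudioMothReadWav(ReadWav):
    def get_metadata(self) -> Dict[str, Any]:
        comment = self.all_metadata['tags'].comment[0]
        add_to_dict = {
            'rec_unit': str(re.search(r"AudioMoth.(.*?) at gain", comment).group(1)),
            'datetime': str(parse(re.search(r"at.(.*?) \(UTC\)", comment).group(1)))
        }
        return {**self.metadata.__dict__, **add_to_dict}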