pykanto.utils.io
Functions to read external files (e.g. JSON) efficiently.
Functions
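copy_xml_files(file_list, dest_dir) | Copies a list of files to dest_dir / file.parent.name / file.name.
get_unit_spectrograms(dataset, ID) | Retrieves unit (e.g. individual notes) spectrograms for a grouping ID in a dataset.
load_dataset(dataset_dir, DIRS, relink_data) | Load an existing dataset, fixing any broken links to data using a new ProjDirs object.
make_tarfile(source_dir, output_filename) | Makes a tarfile from a given directory.
makedir(DIR, return_path) | Make a safely nested directory.
read_json(json_loc) | Reads a .json file using ujson.
save_json(json_object, json_loc) | Saves a .json file using ujson.
save_songs(folder, specs) | Save song spectrograms as .jpg images to folder.
save_subset(train_dir, test_dir, dname, to_export) | Save train and test subsets of dataset to disk as .jpg images (in folders corresponding to class labels).
save_to_jsons(dataset) | Appends new metadata generated in pykanto to the original json metadata files that were used to create a KantoData dataset.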
Classes
NumpyEncoder | Stores a numpy.ndarray or any nested-list composition as JSON.
- pykanto.utils.io.load_dataset(dataset_dir: Path, DIRS: ProjDirs, relink_data: bool = True) -> KantoData
Load an existing dataset, fixing any broken links to data using a new ProjDirs object.
- pykanto.utils.io.read_json(json_loc: pathlib.Path) -> Dict
Reads a .json file using ujson.
- Parameters
json_loc (Path) – Path to json file.
- Returns
Json file as a dictionary.
- Return type
Dict
- pykanto.utils.io.makedir(DIR: Path, return_path: bool = True) -> Path | None
Make a safely nested directory. Returns the Path object by default. Modified from code by Tim Sainburg (source).
- Parameters
DIR (Path) – Path to be created.
return_path (bool, optional) – Whether to return the path. Defaults to True.
- Raises
TypeError – Wrong argument type to ‘DIR’
- Returns
Path to file or directory.
- Return type
Path
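The behaviour described for makedir can be sketched with the standard library alone. The helper name `make_nested_dir` and the rule that a path with a file suffix only has its parent directory created are illustrative assumptions, not pykanto's actual implementation.

```python
import tempfile
from pathlib import Path
from typing import Optional


def make_nested_dir(path: Path, return_path: bool = True) -> Optional[Path]:
    """Hypothetical stand-in for makedir, not pykanto's actual code."""
    if not isinstance(path, Path):
        raise TypeError("Wrong argument type to 'DIR'")
    # Assumption: a path with a suffix names a file, so only its parent is created.
    target = path.parent if path.suffix else path
    # parents=True builds intermediate folders; exist_ok=True makes repeat calls harmless.
    target.mkdir(parents=True, exist_ok=True)
    return path if return_path else None


with tempfile.TemporaryDirectory() as tmp:
    out = make_nested_dir(Path(tmp) / "a" / "b" / "notes.json")
    parent_created = out.parent.is_dir()  # the nested folders now exist
    file_created = out.exists()           # the file itself is not created
```

Calling the helper twice on the same path is safe, which is what makes this pattern convenient before writing output files.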
- pykanto.utils.io.copy_xml_files(file_list: List[pathlib.Path], dest_dir: pathlib.Path) -> None
Copies a list of files to dest_dir / file.parent.name / file.name.
- Parameters
file_list (List[Path]) – List of files to be copied.
dest_dir (Path) – Path to destination folder; it is created if it doesn’t exist.
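The destination layout dest_dir / file.parent.name / file.name can be illustrated with a short sketch. The function name `copy_keeping_parent`, the example folder names, and the returned list (the documented function returns None) are assumptions made here for illustration only.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def copy_keeping_parent(file_list: List[Path], dest_dir: Path) -> List[Path]:
    """Hypothetical sketch: copy each file to dest_dir / parent name / file name."""
    copied = []
    for file in file_list:
        target = dest_dir / file.parent.name / file.name
        # Create the destination subfolder on demand.
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(file, target)
        copied.append(target)
    return copied


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "recordings" / "nestbox_1"
    src.mkdir(parents=True)
    xml = src / "song.xml"
    xml.write_text("<annotation/>")
    dest = Path(tmp) / "export"
    copied = copy_keeping_parent([xml], dest)
    relative = copied[0].relative_to(dest).as_posix()
    content = copied[0].read_text()
```

Only the immediate parent folder name is preserved, so files from different parents with the same name do not overwrite each other, while deeper ancestry is flattened.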
- pykanto.utils.io.save_json(json_object: Dict, json_loc: pathlib.Path) -> None
Saves a .json file using ujson.
- Parameters
json_object (Dict) – Dictionary to be saved.
json_loc (Path) – Path to json file.
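A round trip through save_json and read_json can be sketched as follows. This sketch uses the standard library's `json` module as a stand-in for ujson, which exposes `load` and `dump` functions with a similar interface. The helper names `write_json` and `load_json` and the example metadata are hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


def write_json(json_object: Dict, json_loc: Path) -> None:
    # Stand-in for save_json: serialise a dictionary to disk.
    with open(json_loc, "w", encoding="utf-8") as f:
        json.dump(json_object, f, indent=2)


def load_json(json_loc: Path) -> Dict:
    # Stand-in for read_json: parse the file back into a dictionary.
    with open(json_loc, encoding="utf-8") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    loc = Path(tmp) / "metadata.json"
    metadata = {"ID": "BIGBIRD", "onsets": [0.1, 0.5], "label": "A"}
    write_json(metadata, loc)
    restored = load_json(loc)
```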
- pykanto.utils.io.save_to_jsons(dataset: KantoData) -> None
Appends new metadata generated in pykanto to the original json metadata files that were used to create a KantoData dataset. These usually include things like type labels and unit onsets/offsets.
- Parameters
dataset (KantoData) – Dataset object.
- class pykanto.utils.io.NumpyEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
Stores a numpy.ndarray or any nested-list composition as JSON. Source: karlB on Stack Overflow.
Extends the json.JSONEncoder class.
- default(obj)
Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
For example, to support arbitrary iterators, you could implement default like this:
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        # Let the base class default method raise the TypeError
        return JSONEncoder.default(self, o)
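The same hook can convert array-like objects to nested lists, which is the idea behind NumpyEncoder. The sketch below is an assumption about the general approach rather than pykanto's code: the class name `ArrayEncoder` is hypothetical, and it checks for a `tolist()` method, which numpy.ndarray provides. To stay dependency-free, the demonstration uses the standard library's `array.array`, which has the same method.

```python
import json
from array import array


class ArrayEncoder(json.JSONEncoder):
    """Hypothetical encoder: converts array-like objects to lists."""

    def default(self, obj):
        # numpy.ndarray and array.array both expose tolist().
        if hasattr(obj, "tolist"):
            return obj.tolist()
        # Anything else falls through to the base class, which raises TypeError.
        return super().default(obj)


payload = {"ID": "BIGBIRD", "onsets": array("d", [0.25, 0.5])}
encoded = json.dumps(payload, cls=ArrayEncoder)
decoded = json.loads(encoded)
```

Passing the encoder through the `cls` argument leaves the rest of the `json.dumps` call unchanged, so existing serialisation code needs only one extra keyword.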
- pykanto.utils.io.make_tarfile(source_dir: pathlib.Path, output_filename: pathlib.Path) -> None
Makes a tarfile from a given directory. Source: George V. Reilly on Stack Overflow (https://stackoverflow.com/a/17081026).
- Parameters
source_dir (Path) – Directory to tar
output_filename (Path) – Name of output file (e.g. file.tar.gz).
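A minimal sketch of archiving a directory with the standard `tarfile` module, in the spirit of the Stack Overflow answer cited above. The function name `tar_directory` and the example file names are hypothetical, and gzip compression is assumed from the file.tar.gz example.

```python
import tarfile
import tempfile
from pathlib import Path


def tar_directory(source_dir: Path, output_filename: Path) -> None:
    # "w:gz" writes a gzip-compressed archive; arcname keeps only the folder
    # name so the archive does not embed the full path to source_dir.
    with tarfile.open(output_filename, "w:gz") as tar:
        tar.add(source_dir, arcname=source_dir.name)


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "spectrograms"
    source.mkdir()
    (source / "song_0.txt").write_text("placeholder")
    archive = Path(tmp) / "spectrograms.tar.gz"
    tar_directory(source, archive)
    with tarfile.open(archive, "r:gz") as tar:
        members = sorted(tar.getnames())
```

Writing the archive outside the source directory, as here, avoids the archive trying to include itself.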
- pykanto.utils.io.get_unit_spectrograms(dataset: KantoData, ID: str) -> Dict[str, np.ndarray]
Retrieves unit (e.g. individual notes) spectrograms for a grouping ID in a dataset.
- Parameters
dataset (KantoData) – Dataset to use.
ID (str) – Which id to use (present in an ID column in the dataset)
- Returns
A dictionary of spectrograms, keyed by vocalisation index.
- Return type
Dict[str, np.ndarray]
Example
>>> units = get_unit_spectrograms(dataset, "BIGBIRD")
>>> last_note = units["BIGBIRD_0"][-1]
- pykanto.utils.io.save_songs(folder: pathlib.Path, specs: List[pathlib.Path]) -> None
Save song spectrograms as .jpg images to folder.
- Parameters
folder (Path) – Path to destination folder.
specs (List[Path]) – List of spectrogram paths.
- pykanto.utils.io.save_subset(train_dir: pathlib.Path, test_dir: pathlib.Path, dname: str, to_export: ItemsView[str, List[pathlib.Path]]) -> None
Save train and test subsets of dataset to disk as .jpg images (in folders corresponding to class labels).
- Parameters
train_dir (Path) – Destination folder for training data.
test_dir (Path) – Destination folder for test data.
dname (str) – Name of subset, one of “train” or “test”.
to_export (ItemsView[str, List[Path]]) – Subset of dataset to export.