Title: Management and Processing of Autonomous Recording Unit (ARU) Data
Description: Parse Autonomous Recording Unit (ARU) data and sub-sample recordings. Extract metadata from your recordings, select a subset of recordings for interpretation, and prepare files for processing on the 'WildTrax' <https://wildtrax.ca/> platform. Read and process metadata from recordings collected using the SongMeter and BAR-LT types of ARUs.
Authors: David Hope [aut, cre], Steffi LaZerte [aut], Government of Canada [cph, fnd]
Maintainer: David Hope <[email protected]>
License: MIT + file LICENSE
Version: 0.7.1.9001
Built: 2025-01-07 06:02:16 UTC
Source: https://github.com/arutools/ARUtools
Wrapper for 'soundecology' package to calculate acoustic complexity, the bioacoustic index, and acoustic diversity. See Value for details about these indices.
acoustic_indices( path, min_freq = NA, max_freq = NA, units = "samples", quiet = FALSE )
path |
Character. Path to wave file. |
min_freq |
Numeric. Minimum frequency for acoustic complexity (see soundecology::acoustic_complexity()). |
max_freq |
Numeric. Maximum frequency for acoustic complexity (see soundecology::acoustic_complexity()). |
units |
Character. Wave file units for reading the file. Defaults to "samples" (see tuneR::readWave()). |
quiet |
Logical. Whether to suppress progress messages and other non-essential updates. |
Returns a data frame with acoustic indices. Those prefaced with:
complx_ are from soundecology::acoustic_complexity()
bio_ are from soundecology::bioacoustic_index()
div_ are from soundecology::acoustic_diversity()
w <- tuneR::sine(440, duration = 300000) # > 5s
tuneR::writeWave(w, "test_wave.wav")
acoustic_indices("test_wave.wav")
acoustic_indices("test_wave.wav", quiet = TRUE)
unlink("test_wave.wav")
Add an ARU to the list of identified ARUs
add_pattern_aru_type(pattern, aru_type)
pattern |
regular expression to extract from file path |
aru_type |
Name of the ARU type |
org_pat <- get_pattern("pattern_aru_type")
print(org_pat)
add_pattern_aru_type("CWS\\d", "Canadian Wildlife Detector \\1")
get_pattern("pattern_aru_type")
set_pattern("pattern_aru_type", org_pat)
Uses dates to join site-level data (coordinates and site ids) to the metadata. If the site data have only single dates, then a buffer before and after is used to determine which recordings belong to that site observation. Can join by site ids alone if by_date = NULL is set.
add_sites( meta, sites, buffer_before = 0, buffer_after = NULL, by = c("site_id", "aru_id"), by_date = "date_time", quiet = FALSE )
meta |
Data frame. Recording metadata. Output of clean_metadata(). |
sites |
Data frame. Site-level data from clean_site_index(). |
buffer_before |
Numeric. Number of hours before a deployment in which to
include recordings. |
buffer_after |
Numeric. Number of hours after the deployment in which to
include recordings. |
by |
Character. Columns which identify a deployment in |
by_date |
Character. Date/time type to join data by. |
quiet |
Logical. Whether to suppress progress messages and other non-essential updates. |
A data frame of metadata with site-level data joined in.
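The buffer idea described above can be illustrated with a small base R sketch. This is not the package's implementation: the data frames, column names, and buffer values below are made up for illustration, and both buffers are assumed to be given in hours.

```r
# Hypothetical deployments: one row per site visit, with a single date each
sites <- data.frame(
  site_id = c("P01", "P02"),
  date = as.POSIXct(c("2020-05-01 12:00:00", "2020-05-03 12:00:00"), tz = "UTC")
)

# Hypothetical recordings
recs <- data.frame(
  site_id = c("P01", "P01", "P02"),
  date_time = as.POSIXct(
    c("2020-05-01 05:00:00", "2020-05-04 05:00:00", "2020-05-03 20:00:00"),
    tz = "UTC"
  )
)

buffer_before <- 12 # hours before the site date (assumed value)
buffer_after <- 12  # hours after the site date (assumed value)

# Join on site_id, then keep recordings that fall inside the buffered window
joined <- merge(recs, sites, by = "site_id")
hours_from_site <- as.numeric(
  difftime(joined$date_time, joined$date, units = "hours")
)
in_window <- hours_from_site >= -buffer_before & hours_from_site <= buffer_after
joined[in_window, c("site_id", "date_time")]
```

Here the second P01 recording falls well outside the window and is dropped, while the other two recordings are matched to their site visits.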
m <- clean_metadata(project_files = example_files)
s <- clean_site_index(example_sites_clean,
  name_date = c("date_time_start", "date_time_end")
)
m <- add_sites(m, s)

# Without dates (by site only)
m <- clean_metadata(project_files = example_files)
eg <- dplyr::select(example_sites_clean, -date_time_start, -date_time_end)
s <- clean_site_index(eg, name_date_time = NULL)
m <- add_sites(m, s, by_date = NULL)
Create and append file names appropriate for uploading data to the WildTrax platform <https://wildtrax.ca/>.
add_wildtrax(meta)
meta |
Data frame. Recording metadata. Output of clean_metadata(). |
Data frame of metadata with appended column of WildTrax appropriate file names.
m <- clean_metadata(project_files = example_files)
m <- add_wildtrax(m)
m
Parse Autonomous Recording Unit (ARU) data and sub-sample recordings. Extract metadata from your recordings, select a subset of recordings for interpretation, and prepare files for processing on the WildTrax <https://wildtrax.ca/> platform. Read and process metadata from recordings collected using the Song Meter and BAR-LT types of ARUs.
Maintainer: David Hope [email protected] (ORCID)
Authors:
Steffi LaZerte [email protected] (ORCID)
Other contributors:
Government of Canada [copyright holder, funder]
Useful links:
Report bugs at https://github.com/ARUtools/ARUtools/issues
Calculate selection weights for a series of recordings based on the selection parameters defined by sim_selection_weights().
calc_selection_weights( meta_sun, params, col_site_id = site_id, col_min = t2sr, col_day = date )
meta_sun |
(Spatial) Data frame. Recording metadata with time to sunrise/sunset. Output of calc_sun(). |
params |
Named list. Parameters created by sim_selection_weights(). |
col_site_id |
Column. Unquoted column containing site strata IDs (defaults to site_id). |
col_min |
Column. Unquoted column containing minutes to sunrise (defaults to t2sr). |
col_day |
Column. Unquoted column containing dates or day-of-year (doy) to use (defaults to date). |
Returns data with appended selection weights columns:
psel_by - The minutes column used
psel_min - Probability of selection by time of day (min column)
psel_doy - Probability of selection by day of year
psel - Probability of selection overall
psel_scaled - Probability of selection scaled overall
psel_std - Probability of selection standardized within a site
psel_normalized - Probability of selection normalized within a site
s <- clean_site_index(example_sites_clean,
  name_date_time = c("date_time_start", "date_time_end")
)
m <- clean_metadata(project_files = example_files) |>
  add_sites(s) |>
  calc_sun()
params <- sim_selection_weights()
calc_selection_weights(m, params = params)
Calculate the sunrise/sunset of each sound file for the day of, the day before and the day after to get the nearest sunrise to the recording. Times are calculated using the 'suncalc' package.
calc_sun(meta_sites, aru_tz = "local")
meta_sites |
(Spatial) Data frame. Recording metadata with added coordinates. Output of add_sites(). |
aru_tz |
Character. Must be either "local" or a timezone listed in OlsonNames(). |
Timezones. To ensure that the sunrise/sunset times are calculated correctly relative to the time of the recording, we need to know the timezone of the date/time of the recording. If ARUs were calibrated with a specific timezone before going into the field, that can be specified by using, for example, aru_tz = "America/Toronto". If, on the other hand, each ARU was calibrated to whichever timezone was local when it was deployed, use aru_tz = "local". The specific timezone will be calculated individually based on the longitude and latitude of each recording.
Data frame with metadata and added timezone of recording time (tz), and time to sunrise/sunset (t2sr, t2ss).
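As a rough illustration of picking the nearest sunrise among the day before, the day of, and the day after, here is a base R sketch. The sunrise times are made-up values rather than output from 'suncalc', and the sign convention (negative means the recording started before sunrise) is an assumption made for this sketch.

```r
# Hypothetical recording start and sunrise times (UTC) for the day before,
# the day of, and the day after the recording
rec_time <- as.POSIXct("2021-06-02 03:40:00", tz = "UTC")
sunrises <- as.POSIXct(
  c("2021-06-01 09:35:00", "2021-06-02 09:34:00", "2021-06-03 09:33:00"),
  tz = "UTC"
)

# Signed minutes from each sunrise to the recording
# (assumed convention: negative = recording is before that sunrise)
mins <- as.numeric(difftime(rec_time, sunrises, units = "mins"))

# Keep the value for the sunrise closest in time to the recording
t2sr <- mins[which.min(abs(mins))]
t2sr
```

In this example the same-day sunrise is the nearest one, so the result is 354 minutes before sunrise.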
s <- clean_site_index(example_sites_clean,
  name_date = c("date_time_start", "date_time_end")
)
m <- clean_metadata(project_files = example_files) |>
  add_sites(s)
calc_sun(m)
Shows the first few lines in a text file. Useful for trying to understand problems in GPS files.
check_file(file_name, n_max = 10, ...)
file_name |
Character. File path to check. |
n_max |
Numeric. Number of lines in the file to show. Default 10. |
... |
Arguments passed on to readr::read_lines(). |
Wrapper around readr::read_lines(n_max).
A character vector with one element for each line
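The same idea, peeking at the first few lines of a text file, can be sketched with base R's readLines() in place of 'readr'. The temporary file and its contents below are purely illustrative.

```r
# Write a small throwaway file to inspect
f <- tempfile(fileext = ".txt")
writeLines(paste("line", 1:25), f)

# Read only the first n_max lines
n_max <- 10
first_lines <- readLines(f, n = n_max)
first_lines

# Clean up the temporary file
unlink(f)
```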
f <- system.file("extdata", "logfile_00015141_SD1.txt", package = "ARUtools")
check_file(f)
clean_metadata()
Cleaning metadata can take a series of tries. This function helps summarize and explore the metadata for possible patterns which may help find problems.
check_meta(meta, date = FALSE)
meta |
Data frame. Recording metadata. Output of clean_metadata(). |
date |
Logical. Whether to summarize output by date (as well as
|
A data frame summarizing the metadata by site_id, aru_type, aru_id, and (optionally) by date. Presents the number of files, directories, and days worth of recordings, as well as the minimum and maximum recording times.
m <- clean_metadata(project_files = example_files)
check_meta(m)
check_meta(m, date = TRUE)
clean_metadata()
Cleaning metadata can take a series of tries. This function helps summarize and explore missing metadata (problems).
check_problems( df, check = c("site_id", "aru_id", "date", "date_time", "longitude", "latitude"), path = FALSE, date = FALSE )
df |
Data frame. Either meta data ( |
check |
Character. Character vector of columns to check for missing
values. Default is |
path |
Logical. Whether to return just the file paths which have missing
attributes. Default |
date |
Logical. Whether to summarize output by date (as well as
|
A data frame summarizing the metadata by site_id, aru_type, aru_id, and (optionally) by date. Presents the number of files, directories, and days worth of recordings, as well as the minimum and maximum recording times.
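To make "missing metadata" concrete, the following base R sketch counts missing values in a set of checked columns and lists the affected file paths. It only illustrates the idea and is not how check_problems() is implemented; the data frame is hypothetical.

```r
# Hypothetical metadata with some values that could not be extracted
meta <- data.frame(
  path = c("a/rec1.wav", "a/rec2.wav", "b/rec3.wav"),
  site_id = c("P01", NA, "P02"),
  aru_id = c("BARLT1234", "BARLT1234", NA),
  date = as.Date(c("2020-05-01", NA, "2020-05-02"))
)
check <- c("site_id", "aru_id", "date")

# Number of missing values in each checked column
n_missing <- colSums(is.na(meta[check]))
n_missing

# Paths of files with at least one missing attribute
problem_paths <- meta$path[rowSums(is.na(meta[check])) > 0]
problem_paths
```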
m <- clean_metadata(project_files = example_files, pattern_aru_id = "test")
check_problems(m)
check_problems(m, date = TRUE)
check_problems(m, path = TRUE)
Check and clean GPS data from ARU logs. GPS points are checked for obvious problems (expected range, distance cutoffs and timing) then attached to the meta data frame. Note that it is often safer and more reliable to create your own Site Index file including site ids and GPS coordinates. This file can be cleaned and prepared with clean_site_index() instead.
clean_gps( meta = NULL, dist_cutoff = 100, dist_crs = 3161, dist_by = c("site_id", "aru_id"), quiet = FALSE, verbose = FALSE )
meta |
Data frame. Output of clean_metadata(). |
dist_cutoff |
Numeric. Maximum distance (m) between GPS points within a
site. Default is 100m but can be set to |
dist_crs |
Numeric. Coordinate Reference System to use when calculating distance (should be one with units in metres). |
dist_by |
Character. Column which identifies sites within which to
compare distance among GPS points. Only valid if |
quiet |
Logical. Whether to suppress progress messages and other non-essential updates. |
verbose |
Logical. Show extra loading information. Default FALSE. |
If checking for a maximum distance (dist_cutoff) among GPS points within a group (dist_by), the returned data frame will include a column max_dist, which represents the largest distance among points within that group.
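A minimal base R sketch of the "largest distance within a group" idea follows. It assumes coordinates have already been projected to a CRS in metres and uses plain Euclidean distance; the data, the column names x and y, and the grouping are all hypothetical and do not reflect the package's internals.

```r
# Hypothetical GPS fixes in a projected CRS (coordinates in metres)
gps <- data.frame(
  site_id = c("P01", "P01", "P01", "P02", "P02"),
  x = c(0, 30, 40, 1000, 1250),
  y = c(0, 40, 0, 500, 500)
)
dist_cutoff <- 100

# Largest pairwise distance among the points in each group
max_dist <- vapply(
  split(gps[, c("x", "y")], gps$site_id),
  function(xy) max(dist(xy)),
  numeric(1)
)
max_dist

# Groups whose points are spread farther apart than the cutoff
flagged <- names(max_dist)[max_dist > dist_cutoff]
flagged
```

Here the P01 points stay within 50 m of each other, while the two P02 points are 250 m apart and would exceed a 100 m cutoff.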
Data frame of site-level metadata.
m <- clean_metadata(project_dir = "my_project")
g <- clean_gps(meta = m)
Process BAR-LT log files into a data frame reflecting metadata, schedule information, and events. Events are time-stamped logs of either GPS fixes (lat and lon) or recordings (rec_file, rec_size, rec_end).
clean_logs(log_files, return = "all", progress = TRUE)
log_files |
Character vector of log files to process. |
return |
Character. What kind of data to return, GPS fixes ( |
progress |
Logical. Whether to use |
Note that log files can have glitches. If there is no start time for a recording (generally when there is a problem and no recording is made), the date_time value for that recording will be the same as the rec_end time.
Because the BAR-LT units adjust their time according to the GPS locations, all times are "local" to that area.
Data frame containing:
file_names and paths of the log files
events and their date_times
lat and lon for "gps" events
rec_file, rec_size and rec_end for "recording" events (recording start is the date_time of the event)
schedule information such as schedule_date, schedule_name, schedule_lat, schedule_lon, schedule_sr (sunrise), and schedule_ss (sunset)
meta data information such as meta_serial and meta_firmware
# Replace "my_project_folder" with your directory containing your recordings and logfiles
log_files <- fs::dir_ls("my_project_folder", recurse = TRUE, glob = "*logfile*")
log_files
logs <- clean_logs(log_files)

log_files <- "../ARUtools - Extra/aru_log_files/P028/1A_BARLT10962/logfile_00010962_SD1.txt"
clean_logs(log_files)
clean_logs(log_files, return = "gps")
clean_logs(log_files, return = "recordings")

log_files <- fs::dir_ls("../ARUtools - Extra/aru_log_files/", recurse = TRUE, glob = "*logfile*")
l <- clean_logs(log_files)
Using regular expressions, metadata is extracted from file names and directory structure, checked and cleaned.
clean_metadata( project_dir = NULL, project_files = NULL, file_type = "wav", subset = NULL, subset_type = "keep", pattern_site_id = create_pattern_site_id(), pattern_aru_id = create_pattern_aru_id(), pattern_date = create_pattern_date(), pattern_time = create_pattern_time(), pattern_dt_sep = create_pattern_dt_sep(), pattern_tz_offset = create_pattern_tz_offset(), order_date = "ymd", quiet = FALSE )
project_dir |
Character. Directory where project files are stored. File paths will be used to extract information and must actually exist. |
project_files |
Character. Vector of project file paths. These paths can
be absolute or relative to the working directory, and don't actually need
to point to existing files unless you plan to use |
file_type |
Character. Type of file (extension) to summarize. Default wav. |
subset |
Character. Text pattern to mark a subset of files/directories
to either |
subset_type |
Character. Either |
pattern_site_id |
Character. Regular expression to extract site ids. See create_pattern_site_id(). |
pattern_aru_id |
Character. Regular expression to extract ARU ids. See create_pattern_aru_id(). |
pattern_date |
Character. Regular expression to extract dates. See create_pattern_date(). |
pattern_time |
Character. Regular expression to extract times. See create_pattern_time(). |
pattern_dt_sep |
Character. Regular expression to mark separators between dates and times. See create_pattern_dt_sep(). |
pattern_tz_offset |
Character. Regular expression to extract time zone offsets from file names. See create_pattern_tz_offset(). |
order_date |
Character. Order that the date appears in. "ymd" (default), "mdy", or "dmy". Can be a vector of multiple patterns to match. |
quiet |
Logical. Whether to suppress progress messages and other non-essential updates. |
Note that times are extracted by first combining the date, date/time separator and the time patterns. This means that if there is a problem with this combination, dates might be extracted but date/times will not. This mismatch can be used to determine which part of a pattern needs to be tweaked.
See vignette("customizing", package = "ARUtools") for details on customizing clean_metadata() for your project.
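The mismatch described above can be reproduced with hand-written patterns in base R. The three patterns below are simplified stand-ins and not the output of the create_pattern_*() helpers; the file names and the extract_first() helper are hypothetical.

```r
# Hand-written stand-ins for the date, separator, and time patterns
pat_date <- "\\d{4}[-_]?\\d{2}[-_]?\\d{2}"
pat_sep <- "T"
pat_time <- "\\d{2}[-_:]?\\d{2}[-_:]?\\d{2}"
pat_date_time <- paste0(pat_date, pat_sep, pat_time)

# Hypothetical helper: first match of `pattern` in each element of `x`, or NA
extract_first <- function(x, pattern) {
  m <- regexpr(pattern, x, perl = TRUE)
  out <- rep(NA_character_, length(x))
  out[as.integer(m) > 0] <- regmatches(x, m)
  out
}

files <- c(
  "site1/20200501T050000.wav", # separator is "T"
  "site1/20200501_050000.wav"  # separator is "_"
)

dates <- extract_first(files, pat_date)
date_times <- extract_first(files, pat_date_time)
data.frame(dates, date_times)
```

The second file yields a date but no date/time, which points to the separator pattern as the part that needs adjusting.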
Data frame with extracted metadata
clean_metadata(project_files = example_files)
clean_metadata(project_files = example_files, subset = "P02")
A site index file contains information on when specific ARUs were deployed where. This function cleans a file (csv, xlsx) or data frame in preparation for adding these details to the output of clean_metadata(). It can be used to specify missing information according to date, such as GPS lon/lats and site ids.
clean_site_index( site_index, name_aru_id = "aru_id", name_site_id = "site_id", name_date_time = "date", name_coords = c("longitude", "latitude"), name_extra = NULL, resolve_overlaps = TRUE, quiet = FALSE )
site_index |
(Spatial) Data frame or file path. Site index data to clean. If file path, must be to a local csv or xlsx file. |
name_aru_id |
Character. Name of the column that contains ARU ids. Default "aru_id". |
name_site_id |
Character. Name of the column that contains site ids. Default "site_id". |
name_date_time |
Character. Column name that contains dates or
date/times. Can be vector of two names if there are both 'start' and 'end'
columns. Can be |
name_coords |
Character. Column names that contain longitude and
latitude (in that order). Ignored if |
name_extra |
Character. Column names for extra data to include. If a named
vector, will rename the columns (see examples). Default |
resolve_overlaps |
Logical. Whether or not to resolve date overlaps by
shifting the start/end dates to noon (default |
quiet |
Logical. Whether to suppress progress messages and other non-essential updates. |
Note that times are assumed to be in 'local' time and a timezone isn't used (and is removed if present, replaced with UTC). This allows sites from different timezones to be processed at the same time.
Standardized site index data frame
s <- clean_site_index(example_sites,
  name_aru_id = "ARU",
  name_site_id = "Sites",
  name_date_time = c("Date_set_out", "Date_removed"),
  name_coords = c("lon", "lat")
)

s <- clean_site_index(example_sites,
  name_aru_id = "ARU",
  name_site_id = "Sites",
  name_date_time = c("Date_set_out", "Date_removed"),
  name_coords = c("lon", "lat"),
  name_extra = c("plot" = "Plots")
)

# Without dates
eg <- dplyr::select(example_sites, -Date_set_out, -Date_removed)
s <- clean_site_index(eg,
  name_aru_id = "ARU",
  name_site_id = "Sites",
  name_date_time = NULL,
  name_coords = c("lon", "lat"),
  name_extra = c("plot" = "Plots")
)
Process multiple wave files by copying them with a new filename and clipping to a given length.
clip_wave( waves, dir_out, dir_in = NULL, col_path_in = path, col_subdir_out = subdir_out, col_filename_out = filename_out, col_clip_length = clip_length, col_start_time = start_time, overwrite = FALSE, create_dir = TRUE, diff_limit = 30 )
waves |
Data frame. Details of file locations. |
dir_out |
Character. Output directory. |
dir_in |
Character. Directory wave files are read from. Default is NULL. |
col_path_in |
Column. Unquoted column containing the current file paths. Default path. |
col_subdir_out |
Column. Unquoted column containing the subdirectories in which to put output files. Default subdir_out. |
col_filename_out |
Column. Unquoted column containing the output filenames. Default filename_out. |
col_clip_length |
Column. Unquoted column containing the length of the new clip. Default clip_length. |
col_start_time |
Column. Unquoted column containing the start time of the new clip. Default start_time. |
overwrite |
Logical. Overwrite pre-existing files when clipping and moving. Default FALSE. |
create_dir |
Logical. Whether to create directory structure for newly formatted and clipped wave files. |
diff_limit |
Numeric. How much longer in seconds clip lengths can be compared to file lengths before triggering an error. Default 30. |
TRUE if successful and clipped wave files created
w <- data.frame(
  path = temp_wavs(n = 4),
  subdir_out = c("test1/a", "test2/a", "test3/c", "test4/d"),
  subsub_dir_out = rep("zz", 4),
  filename_out = c("wave1_clean.wav", "wave2_clean.wav", "wave3_clean.wav", "wave4_clean.wav"),
  clip_length = c(1, 1, 1, 2),
  start_time = c(1.2, 0.5, 1, 0)
)
clip_wave(w, dir_out = "clean", col_subdir_out = c(subdir_out, subsub_dir_out))
unlink("clean", recursive = TRUE) # Remove this new 'clean' directory
Clip and copy a single wave file to a given length. See clip_wave() for processing multiple files.
clip_wave_single( path_in, path_out, clip_length, start_time = 0, wave_length = NULL, overwrite = FALSE )
path_in |
Character. Path to the wave file to clip. |
path_out |
Character. Path to copy the new clipped wave file to. |
clip_length |
Numeric. Length of new clip in seconds. |
start_time |
Numeric. Time in seconds where new clip should start. Default 0. |
wave_length |
Numeric. Length of the clipped wave file in seconds (if
|
overwrite |
Logical. Whether to overwrite existing files when creating new clipped wave files. Default FALSE. |
TRUE if successful
# Create test wave file
f <- temp_wavs(1)

# Clip file and check it out
clip_wave_single(f, "new_file.wav", clip_length = 1)
tuneR::readWave("new_file.wav")
unlink("new_file.wav")
Helper function to explore the number of files in a directory, recursively.
count_files(project_dir, subset = NULL, subset_type = "keep")
project_dir |
Character. Directory where project files are stored. File paths will be used to extract information and must actually exist. |
subset |
Character. Text pattern to mark a subset of files/directories
to either |
subset_type |
Character. Either |
A data frame with number of files in a directory
count_files("PROJECT_DIR")
Create a set of nested folders for storing ARU recordings by plots and sites.
create_dirs( plots, site_ids, base_dir = NULL, dir_list = FALSE, dry_run = TRUE, expect_dirs = FALSE )
plots |
Character vector. Hexagon or cluster names for folder names. |
site_ids |
Character vector. Site IDs. Should include the plot/cluster id in the name. |
base_dir |
Character. Base directory to build directory structure in. |
dir_list |
Logical. Whether to return a vector of directories (to be) created (defaults to FALSE). |
dry_run |
Logical. Whether to do a dry-run of the process (i.e. do not actually create directories; defaults to TRUE). |
expect_dirs |
Logical. Expect that directories may already exist? Default FALSE. |
If dir_list = TRUE, returns a list of directories (to be) created. If not a dry run, also creates the folder structure.
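The dry-run behaviour can be pictured with a short base R sketch that builds the nested paths first and only creates them when asked. The base_dir/plot/site_id layout and the prefix matching of sites to plots are assumptions made for illustration, not a description of what create_dirs() does internally.

```r
plots <- c("river1", "river2")
site_ids <- c("river1_sm01", "river1_sm02", "river2_sm03")
base_dir <- file.path(tempdir(), "Recordings")

# Match each site to the plot whose name it starts with (assumed convention)
plot_of_site <- plots[vapply(
  site_ids,
  function(s) which(startsWith(s, plots))[1],
  integer(1)
)]

# Assumed layout: base_dir/plot/site_id
dirs <- file.path(base_dir, plot_of_site, site_ids)

dry_run <- TRUE
if (dry_run) {
  # Only report what would be created
  message("Dry run, would create:\n", paste(dirs, collapse = "\n"))
} else {
  for (d in dirs) dir.create(d, recursive = TRUE, showWarnings = FALSE)
}
```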
# Default is to do a dry-run (don't actually create the directories)
create_dirs(
  plots = c("river1", "river2", "river3"),
  site_ids = c(
    "river1_sm01", "river1_sm02", "river2_sm03", "river2_sm04",
    "river3_sm05", "river3_sm06"
  ),
  base_dir = "Recordings"
)

# Get a list of directories which would be created
create_dirs(
  plots = c("river1", "river2", "river3"),
  site_ids = c(
    "river1_sm01", "river1_sm02", "river2_sm03", "river2_sm04",
    "river3_sm05", "river3_sm06"
  ),
  base_dir = "Recordings", dir_list = TRUE
)

# Create directories AND return a list of those created
d <- create_dirs(
  plots = c("river1", "river2", "river3"),
  site_ids = c(
    "river1_sm01", "river1_sm02", "river2_sm03", "river2_sm04",
    "river3_sm05", "river3_sm06"
  ),
  base_dir = "Recordings", dir_list = TRUE, expect_dirs = TRUE,
  dry_run = FALSE
)
d
Lookarounds allow you to position a regular expression with more specificity.
create_lookaround(pattern, lookaround_pattern, position, negate = FALSE)
pattern |
String. Pattern that you wish to add a look around to |
lookaround_pattern |
String. Pattern that you wish to look for. |
position |
String. One of 'before', 'after', 'ahead', or 'behind'. Capitalization doesn't matter |
negate |
Logical. Whether to exclude cases where the lookaround pattern is detected. |
Returns a string that can be used as a regular expression
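For readers unfamiliar with lookarounds, the sketch below writes the regular expressions by hand in base R (with perl = TRUE) rather than generating them with create_lookaround(). The extract_all() helper is hypothetical and exists only for this illustration.

```r
text <- "cars123ruin456cities789"

# Hypothetical helper: all matches of a Perl-style regular expression
extract_all <- function(x, pattern) {
  regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
}

# Look-behind: three digits that come directly after "cars"
extract_all(text, "(?<=cars)\\d{3}") # "123"

# Negative look-behind: three digits NOT directly after "cars"
extract_all(text, "(?<!cars)\\d{3}") # "456" "789"

# Look-ahead: three digits that come directly before "cities"
extract_all(text, "\\d{3}(?=cities)") # "456"
```

The lookaround text itself is never part of the match; it only constrains where the digits may be found.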
# Here is a string with three patterns of digits
text <- "cars123ruin456cities789"

# To extract the first one we can use this pattern
stringr::str_extract(text, "\\d{3}")
# or
create_lookaround("\\d{3}", "cars", "before") |>
  stringr::str_extract(string = text)

# To exclude the first one we can write
create_lookaround("\\d{3}", "cars", "before", negate = TRUE) |>
  stringr::str_extract_all(string = text)

# To extract the second one we can write
create_lookaround("\\d{3}", "ruin", "before") |>
  stringr::str_extract(string = text)
# or
create_lookaround("\\d{3}", "cities", "after") |>
  stringr::str_extract(string = text)
Helper functions to create regular expression patterns to match different metadata in file paths.
create_pattern_date(order = "ymd", sep = c("_", "-", ""), yr_digits = 4, look_ahead = "", look_behind = "")
create_pattern_time(sep = c("_", "-", ":", ""), seconds = "yes", look_ahead = "", look_behind = "")
create_pattern_dt_sep(sep = "T", optional = FALSE, look_ahead = "", look_behind = "")
create_pattern_aru_id(arus = c("BARLT", "S\\d(A|U)", "SM\\d", "SMM", "SMA"), n_digits = c(4, 8), sep = c("_", "-", ""), prefix = "", suffix = "", look_ahead = "", look_behind = "")
create_pattern_site_id(prefix = c("P", "Q"), p_digits = 2, sep = c("_", "-"), suffix = "", s_digits = 1, look_ahead = "", look_behind = "")
create_pattern_tz_offset(direction_from_UTC = "West", n_digits_hrs = 2, n_digits_min = 2)
test_pattern(test, pattern)
order |
Character vector. Expected orders of (y)ear, (m)onth and (d)ate. Default is "ymd" for Year-Month-Date order. Can have more than one possible order. |
sep |
Character vector. Expected separator(s) between the pattern parts. Can be "" for no separator. |
yr_digits |
Numeric vector. Number of digits in Year, either 2 or 4. |
look_ahead |
Pattern to look ahead of (after) the string. Can be a regular expression or text. |
look_behind |
Pattern to look behind (before) the string. Can be a regular expression or text. |
seconds |
Character. Whether seconds are included. Options are "yes", "no", "maybe". |
optional |
Logical. Whether the separator should be optional or not. Allows matching on different date/time patterns. |
arus |
Character vector. Pattern(s) identifying the ARU prefix (usually model specific). |
n_digits |
Numeric vector. Number of digits expected to follow the
|
prefix |
Character vector. Prefix(es) for site ids. |
suffix |
Character vector. Suffix(es) for site ids. |
p_digits |
Numeric vector. Number(s) of digits following the |
s_digits |
Numeric vector. Number(s) of digits following the |
direction_from_UTC |
Character. Must be one of "West", "East" or "Both". |
n_digits_hrs |
Numeric vector. Number(s) of digits for hours in offset. |
n_digits_min |
Numeric vector. Number(s) of digits for minutes in offset. |
test |
Character vector. Examples of text to test. |
pattern |
Character. Regular expression pattern to test. |
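The look_ahead and look_behind arguments correspond to regular-expression lookarounds: text that must sit next to the match but is not returned as part of it. The following is a minimal base-R sketch of that idea. The names date_pat and guarded are hypothetical, and date_pat is a hand-written stand-in, not the output of create_pattern_date().

```r
# Hand-written stand-in for a date pattern (an assumption, not package output)
date_pat <- "[0-9]{4}[-_]?[0-9]{2}[-_]?[0-9]{2}"

# Look-behind: the date must directly follow "P01_"
# Look-ahead:  the date must be directly followed by "T"
guarded <- paste0("(?<=P01_)", date_pat, "(?=T)")

x <- c(
  "P01_20200101T050000.wav", # site and separator both as required
  "P02_20200101T050000.wav", # wrong site in front of the date
  "P01_20200101_050000.wav"  # wrong separator after the date
)

# Lookarounds need the PCRE engine in base R, hence perl = TRUE
matched <- grepl(guarded, x, perl = TRUE)

# The surrounding text is required but is not part of the extracted match
first_match <- regmatches(x[1], regexpr(guarded, x[1], perl = TRUE))
```

Only the first string satisfies both conditions, and the extracted text is the date alone.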
By default create_pattern_aru_id() matches many common ARU patterns like BARLT0000, S4A0000, SM40000, SMM0000, SMA0000.
test_pattern() is a helper function to see what a regular expression pattern will pick out of some example text. Can be used to see if your pattern grabs what you want. This is just a simple wrapper around stringr::str_extract().
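Because test_pattern() is described as a simple wrapper around stringr::str_extract(), its behaviour can be approximated in base R. In this sketch, extract_first() and site_pat are hypothetical names, and site_pat is a hand-written regular expression rather than the output of create_pattern_site_id().

```r
# Base-R approximation of "first match per element, NA when nothing matches"
extract_first <- function(test, pattern) {
  hits <- regexpr(pattern, test, perl = TRUE)
  out <- rep(NA_character_, length(test))
  out[hits > 0] <- regmatches(test, hits)
  out
}

# Hypothetical site-id pattern: "P", two digits, "-" or "_", then one digit
site_pat <- "P[0-9]{2}[-_][0-9]"

res <- extract_first(c("P03", "P03-1", "x_P12_4_y"), site_pat)
res
```

Trying a handful of real file names this way before running a full metadata clean makes it easier to spot a pattern that is too strict or too loose.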
Either a pattern (create_pattern_xxx()) or the text extracted by a pattern (test_pattern())
create_pattern_date(): Create a pattern to match a date
create_pattern_time(): Create a pattern to match a time
create_pattern_dt_sep(): Create a pattern to match a date/time separator
create_pattern_aru_id(): Create a pattern to match an ARU id
create_pattern_site_id(): Create a pattern to match a site id
create_pattern_tz_offset(): Create a pattern to match a time zone offset from UTC
test_pattern(): Test patterns
create_pattern_date() # Default matches 2020-01-01 or 2020_01_01 or 20200101
# ("-", "_" or "" as separators)
create_pattern_date(sep = "") # Matches only 20200101 (no separator allowed)
create_pattern_time() # Default matches 23_59_59 (_, -, :, as optional separators)
create_pattern_time(sep = "", seconds = "no") # Matches 2359 (no seconds no separators)
create_pattern_dt_sep() # Default matches 'T' as a required separator
create_pattern_dt_sep(optional = TRUE) # 'T' as an optional separator
create_pattern_dt_sep(c("T", "_", "-")) # 'T', '_', or '-' as separators
create_pattern_aru_id()
create_pattern_aru_id(prefix = "CWS")
create_pattern_aru_id(n_digits = 12)
create_pattern_site_id() # Default matches P00-0
create_pattern_site_id(
  prefix = "site", p_digits = 3, sep = "",
  suffix = c("a", "b", "c"), s_digits = 0
) # Matches site000a
pat <- create_pattern_aru_id(prefix = "CWS")
test_pattern("CWS_BARLT1012", pat) # No luck
pat <- create_pattern_aru_id(prefix = "CWS_")
test_pattern("CWS_BARLT1012", pat) # Ah ha!
pat <- create_pattern_site_id()
test_pattern("P03", pat) # Nope
test_pattern("P03-1", pat) # Success!
pat <- create_pattern_site_id(prefix = "site", p_digits = 3, sep = "", s_digits = 0)
test_pattern("site111", pat)
pat <- create_pattern_site_id(
  prefix = "site", p_digits = 3, sep = "",
  suffix = c("a", "b", "c"), s_digits = 0
)
test_pattern(c("site9", "site100a"), pat)
A data frame with examples of correctly formatted metadata with added site-level information
example_clean
A data frame with 42 rows and 10 columns:
Name of the file
File type
Relative file path including file name
ARU model
ARU ids
Site ids
Recording date/time
Recording date
Latitude in decimal degrees
Longitude in decimal degrees
data-raw/data_test.R
A vector of example ARU recording files.
example_files
A vector with 42 file paths
data-raw/data_test.R
A vector of example ARU recording files. Uses the example_sites data, but simulates a longer deployment.
example_files_long
A vector with 614 file paths
data-raw/data_long_deployment.R
A data frame with examples of incorrectly formatted site-level data.
example_sites
A data frame with 10 rows and 8 columns:
Site ids
Deployment start date
Deployment end date
ARU ids
Longitude in decimal degrees
Latitude in decimal degrees
Hypothetical extra plot column
Hypothetical extra subplot column
data-raw/data_test.R
A data frame with examples of correctly formatted site-level data.
example_sites_clean
A data frame with 10 rows and 8 columns:
Site ids
ARU ids
Deployment start date/time
Deployment end date/time
Deployment start date
Deployment end date
Latitude in decimal degrees
Longitude in decimal degrees
data-raw/data_test.R
Returns the current vector of ARU types
get_pattern(pattern_name)
pattern_name |
String. Name of the pattern variable to return. One of "pattern_aru_type", "pattern_check", "pattern_data", or "pattern_date_time". |
named character vector
get_pattern("pattern_aru_type")
Get the length of a recording in seconds
get_wav_length(path, return_numeric = FALSE)
path |
Character. Path to wave file. |
return_numeric |
Logical. Return numeric or character? |
Length of recording in seconds
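For an uncompressed wave file, duration is the number of samples divided by the sampling rate. The arithmetic below is a sketch of that relationship only, and not necessarily how get_wav_length() is implemented. It assumes a 44.1 kHz sampling rate and a 100000-sample recording, and the variable names are illustrative.

```r
n_samples <- 100000 # number of samples in the recording
samp_rate <- 44100  # samples per second (assumed sampling rate)

# Duration in seconds = samples / (samples per second)
length_s <- n_samples / samp_rate
round(length_s, 2)
```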
f <- tempfile()
w <- tuneR::sine(440, duration = 100000)
tuneR::writeWave(w, f)
get_wav_length(f)
Try to guess the ARU type from a file path
guess_ARU_type(path)
path |
Character. Path to wave file |
Tibble with columns 'manufacturer', 'model', and 'aru_type'
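One way to picture guessing the ARU type from a path is to try a set of regular expressions in turn and report the label of the first one that matches. The sketch below is an assumption for illustration: aru_patterns, its labels, and guess_type() are hypothetical, and the real pattern set is whatever get_pattern("pattern_aru_type") returns.

```r
# Hypothetical lookup: names are regular expressions, values are labels
aru_patterns <- c(
  "barlt" = "BAR-LT",
  "S4A[0-9]+" = "Song Meter 4"
)

guess_type <- function(path) {
  # Test every pattern against the path, ignoring case
  hit <- vapply(
    names(aru_patterns),
    function(p) grepl(p, path, ignore.case = TRUE),
    logical(1)
  )
  # Return the label of the first matching pattern, or NA if none match
  if (any(hit)) unname(aru_patterns[hit][1]) else NA_character_
}

guess_type("/path/to/barlt/file.wav")
guess_type("/path/to/sm/S4A2342.wav")
guess_type("/path/to/unknown/file.wav")
```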
get_pattern("pattern_aru_type")
guess_ARU_type("/path/to/barlt/file.wav")
guess_ARU_type("/path/to/sm/S4A2342.wav")
Run clean_logs() on the output from clean_metadata()
meta_clean_logs(meta)
meta |
Data frame. |
Data frame containing
file_names and paths of the log files
events and their date_times
lat and lon for "gps" events
rec_file, rec_size and rec_end for "recording" events (recording start is the date_time of the event)
schedule information such as schedule_date, schedule_name, schedule_lat, schedule_lon, schedule_sr (sunrise), and schedule_ss (sunset)
meta data information such as meta_serial and meta_firmware
other columns from meta provided
file_vec <- fs::dir_ls(fs::path_package("extdata", package = "ARUtools"), recurse = TRUE)
m <- clean_metadata(project_files = file_vec, file_type = "json", pattern_site_id = "000\\d+")
logs <- meta_clean_logs(m)
Sample recordings based on selection weights from calc_selection_weights() using spsurvey::grts().
sample_recordings( meta_weights, n, os = NULL, col_site_id = site_id, col_sel_weights = psel_std, seed = NULL, ... )
meta_weights |
(Spatial) Data frame. Recording meta data selection
weights. Output of |
n |
Numeric, Data frame, Vector, or List. Number of base samples to
choose. For stratification by site, a named vector/list of samples per site, or
a data frame with columns |
os |
Numeric, Vector, or List. Over sample size (proportional) or named
vector/list of number of samples per site Ignored if |
col_site_id |
Column. Unquoted column containing site strata IDs
(defaults to |
col_sel_weights |
Column. Unquoted name of column identifying selection
weights (defaults to |
seed |
Numeric. Random seed to use for random sampling. Seed only
applies to specific sampling events (does not change seed in the
environment). |
... |
Extra named arguments passed on to |
A sampling run from grts. Note that the included dataset is spatial, but is a dummy spatial dataset created by using dates and times to create the spatial landscape.
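To illustrate what stratifying by site with per-site sample sizes and selection weights means, here is a small base-R sketch. It uses ordinary weighted random sampling with sample(), swapped in for illustration; it is not the spatially balanced GRTS algorithm that spsurvey::grts() implements. The toy data frame, its file names, and its weights are made up; only the column names site_id and psel_std follow the defaults shown above.

```r
set.seed(42)

# Made-up recordings with standardized selection weights
meta <- data.frame(
  site_id = rep(c("P01_1", "P02_1"), each = 5),
  file = paste0("rec_", 1:10, ".wav"),
  psel_std = c(0.1, 0.2, 0.4, 0.2, 0.1, 0.3, 0.3, 0.2, 0.1, 0.1)
)

# Named vector: number of recordings to draw at each site
n_per_site <- c(P01_1 = 2, P02_1 = 3)

# Within each site, draw without replacement, favouring higher weights
picked <- do.call(rbind, lapply(names(n_per_site), function(s) {
  site <- meta[meta$site_id == s, ]
  site[sample(nrow(site), size = n_per_site[[s]], prob = site$psel_std), ]
}))

table(picked$site_id)
```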
s <- clean_site_index(example_sites_clean,
  name_date_time = c("date_time_start", "date_time_end")
)
m <- clean_metadata(project_files = example_files) |>
  add_sites(s) |>
  calc_sun()
params <- sim_selection_weights()
w <- calc_selection_weights(m, params = params)
# No stratification by site
samples <- sample_recordings(w, n = 10, os = 0.1, col_site_id = NULL)
# Stratification by site defined by...
# lists
samples <- sample_recordings(w, n = list(P01_1 = 2, P02_1 = 5, P03_1 = 2), os = 0.2)
# vectors
samples <- sample_recordings(w, n = c(P01_1 = 2, P02_1 = 5, P03_1 = 2), os = 0.2)
# data frame
samples <- sample_recordings(
  w,
  n = data.frame(
    site_id = c("P01_1", "P02_1", "P03_1"),
    n = c(2, 5, 2),
    n_os = c(0, 0, 1)
  )
)
Set pattern into ARUtools environment
set_pattern(pattern_name, pattern)
pattern_name |
string of variable to set |
pattern |
Pattern to add into ARUtools environment |
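Storing settings in an environment lets a getter and a setter share state without global variables. The sketch below shows that general mechanism in base R. The environment .pattern_env, the helpers get_pat() and set_pat(), and the stored pattern name are hypothetical and are not the package's internals.

```r
# Hypothetical private environment standing in for a package-level one
.pattern_env <- new.env(parent = emptyenv())
assign("pattern_date", "[0-9]{8}", envir = .pattern_env)

get_pat <- function(name) get(name, envir = .pattern_env)
set_pat <- function(name, pattern) assign(name, pattern, envir = .pattern_env)

old <- get_pat("pattern_date")                        # remember the default
set_pat("pattern_date", "[0-9]{4}-[0-9]{2}-[0-9]{2}") # override it
updated <- get_pat("pattern_date")
set_pat("pattern_date", old)                          # restore the default
```

Saving the original value before overriding it, as the example for set_pattern() also does, keeps later calls from silently using a modified pattern.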
og_pat <- get_pattern("pattern_date_time")
set_pattern("pattern_date_time", create_pattern_date())
glue::glue("Default pattern: {og_pat}")
glue::glue("Updated pattern: {get_pattern('pattern_date_time')}")
set_pattern("pattern_date_time", og_pat)
This function creates and explores parameters for generating selections. These parameters define the selection distribution of minutes (min) around the sun event (sunrise/sunset), as well as of days (day).
sim_selection_weights( min_range = c(-70, 240), min_mean = 30, min_sd = 60, day_range = c(120, 201), day_mean = 161, day_sd = 20, offset = 0, return_log = TRUE, selection_fun = "norm", selection_var = "psel_normalized", return_params = TRUE, plot = TRUE )
min_range |
Numeric vector. Range of the sampling distribution of minutes around the sun event. |
min_mean |
Numeric. Mean of the sampling distribution of minutes to the sun event. |
min_sd |
Numeric. SD in minutes of the sampling distribution of minutes around the sun event. |
day_range |
Date/Datetime/Numeric vector. Range of sampling distribution of days. Can be Dates, Date-times, or DOY (day-of-year, 1-366). |
day_mean |
Date/Datetime/Numeric. Mean date of the sampling distribution of days. Can be Date, Date-time, or DOY (day-of-year, 1-366). |
day_sd |
Numeric. SD in days of the sampling distribution of days. |
offset |
Numeric. Offset to shift for time of day in minutes. |
return_log |
Logical. Log the density in the selection function? |
selection_fun |
Character. Selection function to use. Options are
|
selection_var |
Character. Selection variable to plot
(if |
return_params |
Logical. Return parameter list for use in calc_selection_weights()? |
plot |
Logical. Create plot of simulated selection weights? If
|
Returns either a list of selection parameters or a plot of simulated selection weights
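One plausible reading of the default parameters is a normal density over minutes to the sun event combined with a normal density over day of year, evaluated only inside the given ranges. The sketch below follows that reading. Multiplying the two densities and scaling by the maximum are assumptions made for illustration, and the package's actual calculation may differ.

```r
# Default values from the usage above
min_range <- c(-70, 240); min_mean <- 30; min_sd <- 60
day_range <- c(120, 201); day_mean <- 161; day_sd <- 20

# Grid of candidate recording times inside both ranges
grid <- expand.grid(
  min = seq(min_range[1], min_range[2], by = 10),
  day = seq(day_range[1], day_range[2], by = 1)
)

# Assumed form: product of two independent normal densities
grid$psel <- dnorm(grid$min, min_mean, min_sd) * dnorm(grid$day, day_mean, day_sd)

# Scale so that the most favoured time/day combination has weight 1
grid$psel_normalized <- grid$psel / max(grid$psel)

best <- grid[which.max(grid$psel), c("min", "day")]
best
```

Under these assumptions the highest weight falls 30 minutes after the sun event on day 161, matching min_mean and day_mean.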
params <- sim_selection_weights()
Using the external program SoX (the Swiss Army knife of sound processing programs), create a spectrogram image file. Note that you must have SoX installed to use this function. Spectrograms will be silently overwritten.
sox_spectro( path, dir_out = "Spectrograms", prepend = "spectro_", width = NULL, height = NULL, start = NULL, end = NULL, rate = "20k", dry_run = FALSE, quiet = FALSE, sox_file_path = NULL, skip_check = FALSE )
path |
Character. Path to wave file. |
dir_out |
Character. Output directory. |
prepend |
Character. Text to add to the start of the output file. Defaults to "spectro_". |
width |
Numeric. Width of the spectrogram image in pixels. |
height |
Numeric. Height of the spectrogram image in pixels. |
start |
Numeric/Character. Start the spectrogram at this time (seconds or HH:MM:SS format). |
end |
Numeric/Character. End the spectrogram at this time (seconds or HH:MM:SS format). |
rate |
Numeric. Audio sampling rate to display (used by the |
dry_run |
Logical. If |
quiet |
Logical. Whether to suppress progress messages and other non-essential updates. |
sox_file_path |
Path to sox file if not installed at the system level, otherwise NULL. |
skip_check |
Logical. Should the function skip check to ensure SoX is installed. This may allow speed ups if running across large numbers of files. |
Most arguments are passed through to the seewave::sox() command.
width and height correspond to the -x and -y options for the spectrogram effect.
start and end are used by the trim effect
rate is passed on to the rate effect
Based on code from Sam Hache.
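The mapping from arguments to SoX effects described above can be sketched by assembling a command string. The helper build_sox_cmd() is hypothetical and only builds the string without running SoX. The exact command sox_spectro() produces is not shown in this document, so the ordering of effects and the use of numeric start and end values here are assumptions.

```r
# Hypothetical helper: builds (but does not run) a SoX command string
build_sox_cmd <- function(path, out_png, start = NULL, end = NULL,
                          rate = "20k", width = NULL, height = NULL) {
  effects <- character(0)
  if (!is.null(rate)) effects <- c(effects, "rate", rate)
  if (!is.null(start) && !is.null(end)) {
    # SoX's trim effect takes a start position followed by a duration
    effects <- c(effects, "trim", start, end - start)
  }
  spectro <- "spectrogram"
  if (!is.null(width)) spectro <- c(spectro, "-x", width)   # image width in pixels
  if (!is.null(height)) spectro <- c(spectro, "-y", height) # image height in pixels
  spectro <- c(spectro, "-o", shQuote(out_png))             # output image file
  # "-n" tells SoX to write no audio output
  paste(c("sox", shQuote(path), "-n", effects, spectro), collapse = " ")
}

cmd <- build_sox_cmd("test_wave.wav", "spectro_test_wave.png", start = 2, end = 3)
cmd
```

Printing the command first is similar in spirit to a dry run: it shows what would be executed without touching any files.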
Does not return anything, but creates a spectrogram image in dir_out.
# Prep sample file
w <- tuneR::sine(440, duration = 300000)
td <- tempdir()
temp_wave <- glue::glue("{td}/test_wave.wav")
tuneR::writeWave(w, temp_wave)
# Create spectrograms
try({
  sox_spectro(temp_wave)
  sox_spectro(temp_wave, rate = NULL)
  sox_spectro(temp_wave, start = 2, end = 3)
  sox_spectro(temp_wave, start = "0:01", end = "0:04")
  sox_spectro(temp_wave, prepend = "")
})
# Clean up
unlink(temp_wave)
unlink("Spectrograms", recursive = TRUE)
A data frame with tasks generated from example_clean using the wildRtrax::wt_make_aru_tasks() function. Allows updating of tasks on WildTrax https://wildtrax.ca/.
task_template
A data frame with 14 rows and 13 columns:
Site location name
Date time of the recording
Method of interpretation (generally '1SPT')
Length of recording in seconds
Transcriber ID, to be filled in with function
Empty character for filling in WildTrax
Empty character for filling in WildTrax
Empty character for filling in WildTrax
Empty character for filling in WildTrax
Empty character for filling in WildTrax
Empty character for filling in WildTrax
data-raw/data_wt_assign_tasks.R
Creates a directory structure and example wave files in temp folders.
temp_wavs(n = 6)
n |
Numeric. How many test files to create (up to six). |
vector of paths to temporary wave files
temp_wavs(n=3)
A data frame showing example observers and their effort
template_observers
A data frame with 4 rows and 2 columns:
Interpreter name in the WildTrax system
Number of hours to assign to interpreter
data-raw/data_wt_assign_tasks.R
This function takes a vector of wave file names and returns a list of three vectors that can be provided to the wind detection software or written to files that the software can read. Details of the usable fork of the wind detection software can be found at https://github.com/dhope/WindNoiseDetection
wind_detection_pre_processing( wav_files, site_pattern, output_directory, write_to_file = FALSE, chunk_size = NULL )
wav_files |
Character vector. Paths to wav files. |
site_pattern |
Character. Pattern to extract sites from file names. |
output_directory |
Character. Directory path to export files to. |
write_to_file |
Logical. Should the function write files to output_directory? |
chunk_size |
Numeric. If not NULL, sets the number of files to include in each chunk. |
List including filePath, filenames, and sites suitable for wind software.
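The two ideas behind the arguments, extracting a site from each file name and splitting a long file list into chunks of chunk_size, can be sketched in base R. The file names and the site regular expression below are made up for illustration, and this is not the function's actual implementation.

```r
# Made-up file paths following a site_date-time naming scheme
wav_files <- file.path("recordings", paste0("P0", 1:5, "_1_20200503T052000.wav"))

# Pull the site id out of each file name with a hand-written pattern
file_names <- basename(wav_files)
sites <- regmatches(file_names, regexpr("P[0-9]{2}_[0-9]", file_names))

# Split the files into consecutive chunks of at most chunk_size files
chunk_size <- 2
chunks <- split(wav_files, ceiling(seq_along(wav_files) / chunk_size))
lengths(chunks)
```

With five files and a chunk size of two, the last chunk simply holds the single remaining file.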
td <- tempdir() # output directory for this example
wind_files <- wind_detection_pre_processing(
  wav_files = example_clean$path,
  output_directory = td,
  site_pattern = create_pattern_site_id(
    p_digits = c(2, 3), sep = "_",
    s_digits = c(1, 2)
  ),
  write_to_file = FALSE, chunk_size = NULL
)
This function takes output from the command line program and summarizes it. Details of the wind detection software can be found at https://github.com/dhope/WindNoiseDetection.
wind_detection_summarize_json(f)
f |
Character. File path to the JSON file. |
tibble of summarized data from json file
# example code
example_json <- system.file("extdata",
  "P71-1__20210606T232500-0400_SS.json",
  package = "ARUtools"
)
wind_summary <- wind_detection_summarize_json(example_json)
Assign tasks for interpretation on WildTrax
wt_assign_tasks( wt_task_template_in, interp_hours, wt_task_output_file, interp_hours_column, random_seed = NULL )
wt_task_template_in |
Path to csv template downloaded from Wildtrax
platform https://wildtrax.ca listing all tasks. Alternatively,
can be a data.frame that is correctly formatted using
|
interp_hours |
Path to number of hours for each interpreter or a data.table. If a file, must be csv and must include
the columns "transcriber" and whatever the variable |
wt_task_output_file |
Path to the output csv file for uploading to WildTrax. If left as NULL, no file is written. |
interp_hours_column |
LazyEval column name with hours for interpreters |
random_seed |
Integer. Random seed to select with. If left as NULL, the timestamp is used. |
Returns a list with a tibble of assigned tasks and a summary tibble.
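The assignment algorithm itself is not described here, so the following is only a sketch of one simple way to distribute recordings among interpreters according to their available hours: shuffle the tasks, then give each task to whoever has the most time left. The data, the column name task_length, and the greedy rule are assumptions made for illustration; transcriber and hrs follow names used in this section.

```r
set.seed(65122)

# Made-up tasks (lengths in seconds) and interpreter effort (hours)
tasks <- data.frame(task_id = 1:6, task_length = c(300, 300, 180, 180, 60, 60))
observers <- data.frame(transcriber = c("A", "B"), hrs = c(0.2, 0.1))

# Seconds of effort still available per interpreter
remaining <- setNames(observers$hrs * 3600, observers$transcriber)

tasks <- tasks[sample(nrow(tasks)), ] # random order
tasks$transcriber <- NA_character_
for (i in seq_len(nrow(tasks))) {
  who <- names(which.max(remaining)) # interpreter with most time left
  tasks$transcriber[i] <- who
  remaining[who] <- remaining[who] - tasks$task_length[i]
}

# Summary: total seconds assigned to each interpreter
task_summary <- aggregate(task_length ~ transcriber, data = tasks, FUN = sum)
task_summary
```

Fixing the seed makes the shuffle, and therefore the assignment, reproducible, which is the role random_seed plays in wt_assign_tasks().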
task_output <- wt_assign_tasks(
  wt_task_template_in = task_template,
  wt_task_output_file = NULL,
  interp_hours = template_observers,
  interp_hours_column = hrs,
  random_seed = 65122
)