BatlabJuliaUtils Documentation

BatlabJuliaUtils Documentation
Some Julia Notes
Defaults.jl
Plotting Helpers
Time-Frequency Helper Functions
Read Audio Data
Filters
Signal-to-Noise Ratio
Chirp Sequences
"Melody"
Optimization
Miscellaneous

Some Julia Notes

Keyword Arguments (`kwargs...` Notation)

If a function is as follows:

function helloworld(a; kwargs)

then you can pass in any keyword arguments you want, as follows:

helloworld(2; b=3, c="hello");

The use of these keyword arguments, when present, is described in the documentation.

Plotting

As Julia has some issues with plotting in Jupyter notebooks, it's recommended to use the Plotting Helpers listed here. If you use one of the built-in Julia plot functions, you can pass in the output of getplottingsettings as keyword arguments as follows:

plot(x_data, y_data; getplottingsettings("x label", "y label", "my title")...);

Otherwise, the plot will show up as tiny unless you pass in the keyword argument html_output_format=:png, and the axis labels may be cut off unless you pass in left_margin=10Plots.mm and bottom_margin=10Plots.mm.

Also, if some plotting function is not producing an output in a Jupyter notebook, make sure there is no semicolon at the end of the statement.

Broadcasting Across an Array

To broadcast a one number-to-one number function across an array, you can add a dot (.) after the function name.

For instance, if you have complex array

A = [3.0 + 4.0im, 5.0 + 2.3im, 3.4 + 0.9im];

you can get the magnitude of each element by abs.(A), which produces

1×3 Matrix{Float64}:
 5.0  5.50364  3.5171

`MethodError`s

If a function throws a MethodError: no method matching..., where the function wants you to pass in a Matrix but you passed in a Vector, you can apply vectortomatrix to the vector before passing it into the function. If you have the opposite problem, you can use matrixtovector.

If the function takes in a single number but you want to pass in an array instead, look at Broadcasting Across an Array.

If the datatype (e.g., Int, Real, Float64, etc.) of the input data is wrong, you can cast them to the correct type:

If A is an array of Ints, you can cast them to Float64 using Float64.(A), and cast them to Real using Real.(A).
If A is an array of Float64s, you can cast them to Ints using Int.(round.(A)) (you can replace round with floor or ceil, depending on how you want to round the numbers). If you omit the round, and A contains decimal numbers, then casting them to an Int will throw an InexactError.

-If A is an array of complex numbers but you know that the imaginary part should be zero (e.g., taking the inverse FFT of an FFT of a real array), you can do real.(A) to take the real part.

Defaults.jl

BatlabUtils/src/Defaults.jl stores default values for sampling frequencies (250 kHz for audio and 360 Hz for video), the speed of sound (344.69 m/s), default plot dimensions, and some algorithm parameters.

Plotting Helpers

BatlabJuliaUtils.getplottingsettings — Function

getplottingsettings(xlabel::String, ylabel::String, title::String;
    kwargs...) -> Dict{Symbol, Any}

Builds a dictionary of keyword arguments to pass into any plotting function, populated with the provided labels and titles, default font sizes and plot dimensions, and any additional keyword arguments passed into the function.

Inputs:

xlabel: label of the x-axis.
ylabel: label of the y-axis.
title: plot title.
kwargs...: you can pass in any other keyword arguments to specify plotting parameters or override the defaults set here.

Output:

kwargs_with_defaults: dictionary of plotting keyword arguments.

BatlabJuliaUtils.myplot — Function

myplot(args...; kwargs...)

Mimics Julia's plot function, except with defaults from getplottingsettings passed in. You can pass in any arguments you would to the regular plot function.

BatlabJuliaUtils.myplot! — Function

myplot!(args...; kwargs...)

Same as myplot, except adds to the last plot produced instead of making a new plot.

BatlabJuliaUtils.plotmicdata — Function

plotmicdata(idxs::AbstractArray{Int}, y::AbstractArray; plot_idxs=idxs,
    kwargs...)

Plot a time segment of audio data y, with milliseconds on the x-axis and voltage on the y-axis, and default plotting settings from getplottingsettings.

Inputs:

idxs: time indices of input audio data to plot.
y: audio data, where each column is one microphone.
plot_idxs (default: idxs): the x-axis labels of the plot will be audioindextoms.(plot_idxs). Use this parameter if the x-axis labels you want don't match the value you passed in for idxs.
kwargs...: you can pass in any additional keyword arguments to set plotting parameters.

plotmicdata(y::AbstractArray; plot_idxs=1:size(y, 1), kwargs...)

Plot audio data, with milliseconds on the x-axis and voltage on the y-axis, and default plotting settings from getplottingsettings. Plots the full length of y.

Inputs:

y: audio data, where each column is one microphone.
plot_idxs (default: 1, 2, 3,...): the x-axis labels of the plot will be audioindextoms.(plot_idxs).
kwargs...: you can pass in any additional keyword arguments to set plotting parameters.

BatlabJuliaUtils.plotmicdata! — Function

plotmicdata!(idxs::AbstractArray{Int}, y::AbstractArray; plot_idxs=idxs,
    kwargs...)

Same as plotmicdata(idxs, y), except adds to the last plot produced instead of making a new plot.

plotmicdata!(y::AbstractArray; plot_idxs=1:size(y, 1), kwargs...)

Same as plotmicdata(y), except adds to the last plot produced instead of making a new plot.

BatlabJuliaUtils.plotfftmag — Function

plotfftmag(idxs::AbstractArray{Int}, y::AbstractArray;
    fft_idxs=1:length(idxs), kwargs...)

Plot the Fourier transform magnitude of a time segment of real-valued time-domain audio data y.

Inputs:

idxs: time indices of input audio data to take the Fourier transform of.
y: time-domain audio data, where each column is one microphone.
fft_indices (default: plot all indices): indices of the Fourier transform to plot. Use this parameter if you only want to plot some frequencies.
kwargs...: you can pass in any additional keyword arguments to set plotting parameters.

plotfftmag(y::AbstractArray; fft_idxs=1:size(y, 1), kwargs...)

Plot the Fourier transform magnitude of real-value time-domain audio data y. Takes the Fourier transform of the full length of y.

Inputs:

y: time-domain audio data, where each column is one microphone.
fft_indices (default: plot all indices): indices of the Fourier transform to plot. Use this parameter if you only want to plot some frequencies.
kwargs...: you can pass in any additional keyword arguments to set plotting parameters.

Time-Frequency Helper Functions

ColWiseFFTs

BatlabJuliaUtils.colwisefft — Function

colwisefft(A::AbstractArray) -> Matrix{Complex}

Applies the Fast fourier Transform (FFT) to each column of input matrix A. A can also be a vector, in which case it is first transformed into a one-col matrix.

BatlabJuliaUtils.colwiseifft — Function

colwiseifft(A::AbstractArray) -> Matrix{Complex}

Applies the Inverse FFT to each column of input matrix A. A can also be a vector, in which case it is first transformed into a one-col matrix.

BatlabJuliaUtils.rowwisefft — Function

rowwisefft(A::Matrix) -> Matrix{Complex}

Applies the Fast fourier Transform (FFT) to each row of input matrix A.

BatlabJuliaUtils.rowwiseifft — Function

rowwiseifft(A::Matrix) -> Matrix{Complex}

Applies the Inverse FFT to each row of input matrix A.

Convert Data Indices to Time or Frequency

BatlabJuliaUtils.audioindextosec — Function

audioindextosec(index::Int; fs=250 kHz) -> Real

Given an index of the audio data, calculate the number of seconds since the beginning of the audio data.

Inputs:

index: index of audio data
fs: sampling frequency, in Hertz. Default set in Defaults.jl.

Output:

Seconds since the beginning of the audio data

BatlabJuliaUtils.audioindextoms — Function

audioindextoms(index::Int; fs=250 kHz) -> Real

Given an index of the audio data, calculate the number of milliseconds since the beginning of the audio data.

Inputs:

index: index of audio data
fs: sampling frequency, in Hertz. Default set in Defaults.jl.

Output:

Milliseconds since the beginning of the audio data

BatlabJuliaUtils.fftindextofrequency — Function

fftindextofrequency(index::Int, N_fft::Int; fs=250 kHz) -> Real

Given an index of an Discrete Fourier Transform taken on a segment of audio datal determine what frequency (in Hz) it corresponds to.

Inputs:

index: index of the Fourier Transform.
N_fft: length of the Fourier Transform, in samples.
fs: sampling frequency, in Hertz. Default set in Defaults.jl.

Output:

Frequency, in Hz, of the specified Fourier Transform index

BatlabJuliaUtils.getfftfrequencies — Function

getfftfrequencies(N_fft::Int; fs=250 kHz) -> Array{Real}

Return an array of length N_fft, where each element is the frequency, in Hz, of the corresponding index of a length-N_fft Fourier Transform of a segment of audio data.

Inputs:

N_fft: length of the Fourier Transform, in samples.
fs: sampling frequency, in Hertz. Default set in Defaults.jl.

BatlabJuliaUtils.videoindextosec — Function

videoindextosec(idx_video::Int, L_video::Int; fs_video=360) -> Real

Given an index of the video data, return the number of seconds since the start of the audio data.

The video and audio data are synchronized such that the end of the video data is simultaneous with the 8-second mark of the audio data.

Inputs:

idx_video: index of the video data.
L_video: length, in frames, of the video data.
fs_video: sampling rate of the video data, in Hertz. Default set in Defaults.jl.

BatlabJuliaUtils.sectovideoindex — Function

sectovideoindex(time_audio::Number, L_video::Int; fs_video=360) -> Int

Given a time (since the start of the audio data) in seconds, calculate the index of the closest video frame.

The video and audio data are synchronized such that the end of the video data is simultaneous with the 8-second mark of the audio data.

Inputs:

time_audio: time, in seconds, since the start of the audio data.
L_video: length, in frames, of the video data.
fs_video: sampling rate of the video data, in hertz. Default set in Defaults.jl.

BatlabJuliaUtils.getvideoslicefromtimes — Function

getvideoslicefromtimes(location_data::Matrix{Real}, t1::Real,
    t2::Real, fs_video=360) -> UnitRange{Int}, Matrix{Real}

Returns the video data corresponding to time interval [t1, t2] of the audio data, where t1 and t2 are in seconds.

The two datasets are synchronized as follows: the end of the video data coincides with the 8-second mark of the audio data.

Inputs:

location_data: full centroid data, where the columns represent coordinates (x, y, z).
t1: start time (inclusive) of interval, in seconds since the onset of the audio data.
t2: end time (inclusive) of interval, in seconds since the onset of the audio data.
fs_video (default set in Defaults.jl): sampling frequency of the centroid data, in Hertz.

Outputs:

slice_idxs: indices of the audio data corresponding to the [t1, t2] slice taken.
centroid_slice: centroid data in the interval [t1, t2].

BatlabJuliaUtils.getvideodataslicefromaudioindices — Function

getvideodataslicefromaudioindices(location_data::Matrix{Real},
    t1_idx::Real, t2_idx::Real, fs_video=360, FS=250k) 
                                    -> UnitRange{Int}, Matrix{Real}

Same as getvideoslicefromtimes, but takes audio data indices in lieu of times.

Inputs:

location_data: full centroid data, where the columns represent coordinates (x, y, z).
t1_idx: index of the audio data (inclusive) at which to start the centroid data slice.
t2_idx: index of the audio data (inclusive) at which to end the centroid data slice.
fs_video (default set in Defaults.jl): sampling frequency of the centroid data, in Hertz.
fs (default set in Defaults.jl): sampling frequency of the audio data, in Hertz.

Outputs: see getvideoslicefromtimes

Short-Time Fourier Transforms (STFTs) and Spectrograms

BatlabJuliaUtils.STFTwithdefaults — Function

STFTwithdefaults(y::AbstractArray, nfft=256,
    noverlap=Int(round(nfft/2)), window=hamming(nfft), zero_pad=true)
                                                    -> Matrix{Complex}

Wrapper around DSP.Periodograms.stft, which takes the short-time Fourier Transform (STFT) of a signal, with some reasonable default values set.

Inputs:

y: one-dimensional signal of which to take the STFT.
nfft (default: 256): length of each window of the STFT.
noverlap (default: nfft/2): overlap between adjacent STFT windows.
window (default: Hamming): function multiplicatively applied to each window to reduce spectral leakage.
zero_pad (default: true): if set to true, zero-pad the beginning and end of y with (nfft-1) zeros on either end.

Output:

Sy: STFT of y. Matrix with (nfft / 2 + 1) rows, each corresponding to a different frequency and N columns, where N is the number of time windows taken.

BatlabJuliaUtils.plotSTFTdb — Function

plotSTFTdb(Sy_db::Matrix; nfft=Int(2*floor(size(Sy_db, 1))), 
    noverlap=Int(round(nfft/2)), crange=50, fs=250 kHz,
    plotting_kwargs...)

Plots spectrogram Sy_db, where Sy_db is in decibels (If Sy is the STFT of signal y, then Sy_db is 20 times log10 of the magnitude squared of Sy).

Inputs:

Sy_db: STFT, in decibels.
nfft (default: 2*(height of Sy_db - 1)): length of each window of the STFT. Used for axis labeling.
noverlap (default: nfft/2): overlap between adjacent STFT windows. Used for axis labeling.
window (default: Hamming): function multiplicatively applied to each window to reduce spectral leakage.
crange (default: 50): the lowest value shown on the colorbar is the maximum value of Sy_db, minus crange. All elements of Sy_db that are smaller than this will show up in the spectrogram as this lowest value.
fs: sampling frequency of the audio data, in Hertz. Default set in Defaults.jl.
plotting_kwargs: extra keyword arguments for plotting (passed into the heatmap function).

BatlabJuliaUtils.plotSTFT — Function

plotSTFT(Sy::Matrix{Complex}; nfft=Int(2*floor(size(Sy_db, 1))), 
    noverlap=Int(round(nfft/2)), crange=50, fs=250 kHz,
    plotting_kwargs...)

Plots the spectrogram corresponding to STFT Sy (i.e., plots a heatmap of Sy_db = 20*log10(|Sy|^2)).

Inputs:

Sy: STFT, in decibels.
See plotSTFTdb for the rest of the arguments.

BatlabJuliaUtils.plotSTFTtime — Function

plotSTFTtime(y::AbstractArray; nfft=256, noverlap=Int(round(nfft/2)),
    window=hamming(nfft), zero_pad=true, crange=50, fs=250 kHz,
    plotting_kwargs...)

Takes the STFT of y and then plots the spectrogram using plotSTFT.

Inputs:

y: time-domain signal.
nfft, noverlap, window, zero_pad: see STFTwithdefaults.
crange, fs, plotting_kwargs: see plotSTFTdb

Read Audio Data

BatlabJuliaUtils.readmicdata — Function

readmicdata(mat_filename::String, n_channels=4) -> Matrix{Real}

Reads audio data from a MAT file into a matrix where each column represents a different microphone.

These can be produced by running matlab_utils/tdms_to_mat.m on a TDMS file with fields including Voltage_0, Voltage_1, etc.

Inputs:

mat_filename: name of a .mat file with variables Voltage_i, where microphone index i counts up from 0, each variable is a time-series array of voltages for the corresponding microphone, and all arrays are of the same length.
n_channels (default 4): number of microphones

Output:

y: N x K matrix, where N is the number of samples per microphone and K is the number of microphones. Each column corresponds to a different microphone.

Filters

BatlabJuliaUtils.movingaveragefilter — Function

movingaverage(x::AbstractArray, half_len::Int, stride=1)
                                                -> Matrix{Real}

Applies a symmetrical moving average filter of width 2*half_len + 1 to signal x. To make the output the same length as the input and avoid edge effects, the input is padded by duplicating the first and last elements each half_len times. The output is aligned with the input: i.e., if there is a large enough local maximum at index i of the input, there will also be a local maximum at the same index of the output.

x may have multiple columns, where each column represents a different set of time-series data. In that case, the filter is applied separately to each column.

Inputs:

x: input vector, or matrix where each column is a different dataset (i.e., channel).
half_len: determines the length of the filter, as described above.
stride (default: 1): if stride is not 1, then the output is downsampled by a factor of stride.

Output:

y: result of applying a moving average filter to x.

BatlabJuliaUtils.maxfilter — Function

maxfilter(x::AbstractArray, half_len::Int) -> Matrix{Real}

Applies a maximum filter to the input: y[n] = maximum(x[n-half_len:n+half_len]).

Input signal x may have multiple columns; each column represents a different set of time-series data.

Inputs:

x: input vector, or matrix where each column is a different dataset (i.e., channel).
half_len: determines the length of the filter, as described above.

Output:

y: output of the filter, as described above.

BatlabJuliaUtils.bandpassfilter — Function

bandpassfilter(x::AbstractArray, min_Hz::Number, max_Hz::Number;
    fs=250 kHz) -> AbstractArray{Real}

Applies an ideal bandpass filter with cutoffs min_Hz and max_Hz to input signal x.

Input signal x may have multiple columns; each column represents a different set of time-series data

Inputs:

x: input vector, or matrix where each column is a different dataset (i.e., channel).
min_Hz: lower cutoff of the filter.
max_Hz: upper cutoff of the filter.
fs: Sampling frequency, in Hertz. Default set in Defaults.jl.

Output:

y: x, with frequencies below min_Hz or above max_Hz zeroed out.

BatlabJuliaUtils.bandpassfilterFFT — Function

bandpassfilterFFT(x_fft::AbstractArray, min_Hz::Number, max_Hz::Number;
    fs=250 kHz) -> AbstractArray

Same as bandpassfilter, except the input and output are in the frequency domain.

Inputs:

x_fft: input vector (or matrix) in the frequency domain.
min_Hz: lower cutoff of the filter.
max_Hz: upper cutoff of the filter.
fs: Sampling frequency, in Hertz. Default set in Defaults.jl.

Output:

y_fft: x_fft, with frequencies below min_Hz or above max_Hz zeroed out.

BatlabJuliaUtils.bandpassfilterspecgram — Function

bandpassfilterspecgram(Sx::Matrix, min_Hz::Number, max_Hz::Number;
    nfft=2*(size(Sx, 1)-1), fs=FS) -> AbstractArray

Same as bandpassfilter, except the input and output are spectrograms.

Inputs:

Sx: input spectrogram.
min_Hz: lower cutoff of the filter.
max_Hz: upper cutoff of the filter.
nfft (default: 2*(height of Sx-1)): size of the window used to produce the spectrogram.
fs: Sampling frequency, in Hertz. Default set in Defaults.jl.

Output:

Sy: Sx, with frequencies below min_Hz or above max_Hz zeroed out.

BatlabJuliaUtils.circconv — Function

circconv(x::AbstractArray, h::AbstractArray; real_output=true) -> AbstractArray

Perform circular convolution in the frequency domain: y = iFFT(FFT(x) * FFT(h)).

Inputs:

x, h: two vectors of the same dimension, or matrtices where each column is a different channel of data.
real_output (default: true): whether to take the real component of the output before returning (false to leave the output a complex number).

Output:

circular convolution of x and y

BatlabJuliaUtils.deconvolve — Function

deconvolve(Y::AbstractArray, X::AbstractArray; fft_thresh=0.1) 
                                                -> AbstractArray

Perform deconvolution: i.e., given linear time-invariant system Y and input X, such that Y is X convolved with impulse response H, find H using Fourier transforms (H = iFFT(FFT(Y) ./ FFT(X))).

Optionally, zero out indices of FFT(Y) ./ FFT(X) where the magnitude of FFT(X) is above some threshold, fft_thresh.

X and Y must have the same dimensions. If they are matrices, deconvolution is applied separately to each column.

Inputs:

Y: system output; either a vector or a matrix where each column is a different data channel.
X: system input; same dimensions as Y.
fft_thresh (default: 0): described above.

Outputs:

H: estimated impulse response; same dimensions as X and Y.

Signal-to-Noise Ratio

BatlabJuliaUtils.getnoisesampleidxs — Function

getnoisesampleidxs(mic_data::AbstractArray; window_size=200)
                                                -> UnitRange{Int}

Given audio data, find the longest segment that is just noise.

First, the algorithm divides the data into segments of length window_size. The amplitude of a window is defined as the maximum deviation of mic_data from its mean (over the window). The maximum noise level is set at twice the minimum amplitude of any window. The algorithm finds the indices of the longest section of mic_data where no window has an amplitude above the noise level.

Inputs:

mic_data: matrix of audio data, where each column is a different channel.
window_size: length of the windows described above.

Outputs:

UnitRange (i.e., the datatype of the object 1:10) of the indices of the longest segment of the data that is only noise.

BatlabJuliaUtils.windowedenergy — Function

windowedenergy(x::AbstractArray, window_size::Int; window=hamming(nfft))
                                                        -> Matrix{Real}

Finds the energy over sliding windows of the signal x, where the windows have stride 1 (i.e., if the first window starts at index 1, the second window starts at index 2, etc.)

Inputs:

x: vector of data, or matrix where each column is a different channel of data.
window_size (default: 64): window length, in samples.
window (default: ones(window_size)): vector to multiply each window by. The default, ones, multiplies each element by 1. To emphasize the center of long windows, you can use windows like hamming(window_size) from the DSP package.

BatlabJuliaUtils.estimatesnr — Function

estimatesnr(y::AbstractArray, noise_sample::AbstractArray, window_size=256,
    window=ones(window_size)) -> Matrix{Real}

Crude estimate of the signal-to-noise ratio of the input signal y. The signal and noise levels are computed by taking the energy over sliding windows (with stride 1) of y and noise_sample, respectively. The SNR is estimated as the signal level of each window, divided by the mean noise level.

Inputs:

y: input signal, which may be a vector, or a matrix where each column is a different channel.
noise_sample: sample of noise, with the same number of channels as y.
window_size (default: 64): window length, in samples.
window (default: ones(window_size)): vector to multiply each window by. The default, ones, multiplies each element by 1. To emphasize the center of long windows, you can use windows like hamming(window_size) from the DSP package.

Output:

snr_est: estimated SNR, in decibels (log scale, times 20) of every timepoint in y.

Chirp Sequences

Terminology: Chirp Sequences

A chirp sequence is defined as all microphone outputs that result from a single bat vocalization.

This is a somewhat overloaded term, which can mean one of two things:

For a single microphone, a chirp and subsequent echos. We can call this a "single-mic chirp sequence".
The chirp and subsequent echos, but for all microphones that picked up the chirp. We can call this a "multi-mic chirp sequence".

BatlabJuliaUtils.estimatebuzzphase — Function

function estimatebuzzphase(snr::AbstractArray;
    buzz_phase_chirp_separation_ms=20) -> Matrix{Int}

Makes a very rudimentary estimate of chirp onsets, as heard by microphones, and then finds all times where chirp onsets are less than buzz_phase_chirp_separation_ms apart, for each microphone. It then prints out these time windows in descending order of length (these are the potential buzz phase times).

Note that can also classify some echos as chirp sequences, so expect there to be false positives.

Inputs:

snr: output of estimatesnr.
buzz_phase_chirp_separation_ms (default: 20): maximum number of milliseconds expected between onsets of consecutive buzz phase chirps.

BatlabJuliaUtils.ChirpSequence — Type

struct ChirpSequence
    start_idx::Int
    length::Int
    vocalization_time_ms::Real
    snr_data::Vector{Real}
    mic_data::Vector{Real}
    mic_num::Int
end

Datastructure to store individual chirp sequences (for a single microphone).

Fields:

start_idx: start of the chirp, in number of samples since the beginning of the audio data that this chirp sequence comes from (i.e., the MAT data that was initially read in).
length: length, in audio samples, of the chirp sequence.
vocalization_time_ms: time (since the beginning of the audio data) that the bat made the vocalization. Estimated using the centroid data.
snr_data: vector of estimated SNR values over the duration of the chirp sequence. Produced by estimatesnr.
mic_data: audio data for the given microphone, over the duration of the chirp sequence.
mic_num: which microphone (from 1 to 4) the data corresponds to.

BatlabJuliaUtils.getboundsfromboxes — Function

getboundsfromboxes(boxes::AbstractArray;
    filter_fn=(start_idx, stop_idx) -> true) -> Matrix{Int}

Given bitarray boxes (as a column vector), return a matrix where the first column is the start indices of the sections where boxes==1, and the second column is the end indices of those sections.

Optionally, only keep boxes where filter_fn returns true.

Inputs:

boxes: one-dimensional bitarray.
filter_fn (default: always return true): function that takes in the start and end indices of a region where boxes==1 and returns false for boxes to discard.

Output:

bounds: two-column matrix, as described above.

BatlabJuliaUtils.findhighsnrregions — Function

findhighsnrregions(snr::AbstractArray, signal_thresh::Number,
    min_peak_thresh::Number, maxfilter_length::Int;
    snr_drop_thresh=20, peak_snr_thresh_radius=1500)-> BitArray

Given the estimated SNR of the audio data (from the function estimatesnr), determine which regions are likely to contain chirp sequences, as follows:

Apply a maxfilter to the snr array: this helps us find contiguous regions with high SNR.
Find all sections where the maxfiltered SNR is above signal_thresh.
Of those sections, keep the ones where, at some point, the SNR goes above another, higher threshold.

The threshold in step 3 above is set as follows:

We look around in a range of peak_snr_thresh_radius around the "signal region" and we find the maximum SNR present in that range.
We require that the peak SNR of the "signal region" be at most snr_drop_thresh dB lower than that maximum value.
We also require that the peak SNR be no lower than min_peak_thresh.

Inputs:

snr: estimated SNR of the audio data, produced by estimatesnr, either the full matrix or one column.
signal_thresh, min_peak_thresh, snr_drop_thresh, peak_snr_thresh_radius: thresholds, descsribed above.
maxfilter_length: half-length, in samples, of the maximum filter.

Output:

high_snr_locations: bitarray that is 1 in regions that likely contain chirp sequences.

BatlabJuliaUtils.findhighsnrregionidxs — Function

findroughchirpsequencebounds(snr::AbstractArray, mic::Int,
    signal_thresh::Number, peak_thresh::Number, maxfilter_length::Int)
                                                        -> Matrix{Int}

For a single microphone/channel, converts the output of findhighsnrregions to a two-column matrix, where the first column is the start index of each presumed chirp sequence.

Inputs:

snr: estimated SNR of the audio data, produced by estimatesnr (full matrix).
signal_thresh, peak_thresh, snr_drop_thresh, peak_snr_thresh_radius: thresholds, described in findhighsnrregions.
maxfilter_length: half-length, in samples, of the maximum filter.

Output: described above.

BatlabJuliaUtils.adjusthighsnridxs — Function

adjusthighsnridxs(snr::AbstractArray, mic::Int,
    rough_bounds::Vector{Int}, max_end_idx::Int; tail_snr_thresh::Real,
    max_seq_len=MAX_SEQUENCE_LENGTH, maxfilter_seq_end=50) -> Vector{Int}

Given rough bounds for a single chirp sequence (one row of the output of findhighsnrregionidxs), adjust the end index to ensure that the chirp sequence isn't cut off early.

The process is similar to findhighsnrregions, but with more lenient thresholds.

Apply a maxfilter to the snr array, with a filter size that is ideally longer than the one used for findhighsnrregionidxs.
Find the first index of the maxfiltered SNRs that goes below tail_snr_thresh, a threshold lower than the one used for findhighsnrregionidxs, and sets the end index of the chirp sequence to this. If this index is beyond max_end_idx, then the end of the chirp sequence is set to max_end_idx.

Inputs:

snr: estimated SNR of the audio data, produced by estimatesnr (full matrix).
mic: microphone number, from 1 to 4.
rough_bounds: row of the matrix produced by findhighsnrregionidxs.
max_end_idx: the start of the next chirp sequence, or the end of the signal if this is the last chirp sequence for a particular microphone.
tail_snr_thresh: described above.
max_seq_len maximum length of the chirp sequence. Default set in Defaults.jl.
maxfilter_length (default: 50): half-length, in samples, of the maximum filter.

Output:

refined_bounds: two-element vector, where the first element is the start of the chirp sequence (unchanged), and the second element is the

BatlabJuliaUtils.getvocalizationtimems — Function

getvocalizationtimems(chirp_start_index::Int, mic::Int,
    location_data::Matrix{Real}, mic_positions::Matrix{Real}; 
    buffer_time_ms=100, fs_video=360,
    interp_type=QuadraticInterpolation) -> Real

Given the index of the audio data at which a chirp sequence starts, estimate the time that the bat made a vocalization, in milliseconds since the start of the audio data.

This is achieved by taking a slice of the centroid data of radius buffer_time_ms around the time the chirp reached the microphone, performing quadratic interpolation of the centroid data over that slice, and solving for t in

distance_from_mic(t) = speed_of_sound * (time_chirp_reached_mic - t)

to get the vocalization time.

This function returns NaN (not a number) if there isn't sufficient location data to determine the vocalization time.

Inputs:

chirp_start_index: index of the audio data that the chirp sequence started.
mic: microphone that heard the chirp sequence.
location_data: full centroid data, where the columns represent coordinates (x, y, z).
mic_positions: matrix of microphone positions, where each row is a different microphone and the columns represent coordinates (x, y, z).
buffer_time_ms (default: 100): radius, in milliseconds, of the slice of centroid data to examine.
fs_video (default set in Defaults.jl): sampling rate of the centroid data.
interp_type (default: QuadraticInterpolation): type of interpolation (from the package DataInterpolations).

BatlabJuliaUtils.groupchirpsequencesbystarttime — Function

groupchirpsequencesbystarttime(
    chirp_sequence_bounds_per_mic::AbstractArray{Matrix{Int}},
    snr::Matrix{Real}, y::Matrix{Real}, location_data::Matrix{Real},
    mic_locations::Matrix{Real}, single_chirp_snr_thresh=100,
    vocalization_start_tolerance_ms=1.5, any_mic_snr_thresh=45) 
                    -> Vector{Dict{Int, ChirpSequence}}, Vector{Real}

Given start and end indices of chirp sequences, for all microphones, determine which chirp sequences came from the same initial chirp. Only keep chirp sequences arising from vocalizations heard by at least two microphones.

Inputs:

chirp_sequence_bounds_per_mic: array of [chirp sequence indices for mic 1, ..., chirp sequence indices for mic 4]. "Chirp sequence indices for mic i" refers to a two-column matrix where the first column is the start indices of each chirp sequence and the second column is the corresponding end indices.
snr: estimated SNR of the audio data, produced by estimatesnr.
y: matrix of audio data, where each column is a different microphone.
location_data: full centroid data, where the columns represent coordinates (x, y, z).
mic_positions: matrix of microphone positions, where each row is a different microphone and the columns represent coordinates (x, y, z).
single_mic_snr_thresh (default: 100): if a chirp sequence only has data from one microphone, still store the chirp sequence if it has a SNR over this value.
any_mic_snr_thresh (default: 45): if none of the microphones have an SNR above this threshold, then we probably picked up an echo instead of a chirp, which should be disregarded.
vocalization_start_tolerance_ms (default: 1.5): if the estimated vocalization time for two chirp sequences (for different microphones) is within vocalization_start_tolerance_ms milliseconds, then they are considered to be from the same vocalization.

Output:

chirp_sequences: example form [{1 -> ChirpSequences(...), 2 -> ChirpSequence(...)}, {2-> ChirpSequence(...), 4 -> ChirpSequence(...), 3 -> ChirpSequence(...)}, ... ] Each element of the chirp_sequences vector corresponds to all chirp sequences arising from a single vocalization. This is represented by a dictionary mapping microphone number to ChirpSequence structure.
vocalization_times: vector, where each element is the estimated vocalization time for the corresponding chirp sequence.

BatlabJuliaUtils.plotchirpsequence — Function

function plotchirpsequence(chirp_seq::Dict{Int, ChirpSequence};
    plot_separate=false, plot_spectrogram=false, same_length=true,
    n_cols=true)

Plots a chirp sequence on one of three formats:

If plot_separate and plot_spectrogram are both false, it plots the data from all microphones in the same plot.
If plot_spectrogram is true, then it plots the spectrogram of the chirp sequence for each microphone (in separate plots).
Otherwise, if plot_separate is true, it plots the time-domain waveforms for each microphone (in separate plots).

Inputs:

chirp_seq_all_mics: dictionary mapping microphone number to a ChirpSequence object.
plot_separate, plot_spectrogram: descibed above.
same_length (default: true): zero-pad shorter chirp sequences at the end so that all chirp sequences are the same length. Only relevant for plot_separate or plot_spectrogram.
n_cols (default: 2): number of plots per row.

BatlabJuliaUtils.plotchirpsequenceboxes — Function

plotchirpsequenceboxes(start_ms::Real, stop_ms::Real,
    vocalization_times::Array, chirp_sequences::Array{Dict{Int, ChirpSequence}},
    y::Matrix{Real}, mics=1:size(y, 2))

Plots audio data (specified by matrix y) from start_ms to stop_ms milliseconds, with boxes around all chirp sequences in that interval. Estimated vocalization times are written above all boxes.

Inputs:

start_ms: start time of the plot, in milliseconds.
stop_ms: stop time of the plot, in milliseconds.
vocalization_times: list of vocalization times output by groupchirpsequencesbystarttime.
chirp_sequences: list of mappings from microphone to ChirpSequence object produced by groupchirpsequencesbystarttime.
y: matrix of microphone data, where each column is a different microphone.
mics (default: all): microphones for which to plot chirp sequences.

"Melody"

We define the "melody" of the vocalization as the fundamental harmonic of the chirp.

This section contains documentation for estimating the melody, as well as some basic methods for separating chirps from echos.

BatlabJuliaUtils.findmelody — Function

findmelody(chirp_seq_single_mic::ChirpSequence, peak_snr_thresh::Real;
    find_highest_snr_in_first_ms=1, nfft=256,
    bandpass_filter=(20_000, 100_000), maximum_melody_slope=5)
                                                        -> Vector{Int}

Given a chirp sequence (for a single microphone), estimate the "melody" (i.e., the fundamental harmonic) of the vocalization using the spectrogram.

The melody is traced as follows:

Find the time in the first find_highest_snr_in_first_ms milliseconds of the chirp with the highest SNR, and find the melody of the chirp at that point (here, the SNR is hopefully high enough to accurately estimate the melody).
Work backwards until the beginning of the chirp, at each index looking for the strongest frequency within some small range of the last frequency found. This range is determined by the parameter maximum_melody_slope.
Repeat, but this time work towards the end of the chirp. To avoid picking up echos, enforce that, once the slope of the melody (with respect to time) becomes negative, it can never become positive.

After tracing the melody, we need to see if we found the fundamental harmonic or some higher harmonic. This is done by taking the loudest part of the melody and dividing the frequency by 2, 3, etc. until we go below 20 kHz. The fundamental harmonic is the lowest such frequency with power at most melody_drop_thresh_db below the loudest harmonic.

The melody is computed in terms of Fourier transform indices. To find the melody in Hertz, use the fftindextofrequency function or use findmelodyhertz instead.

Inputs:

chirp_seq_single_mic: input chirp sequence object (for a single microphone).
peak_snr_thresh: threshold previously set for the peak SNR of a chirp sequence.
find_highest_snr_in_first_ms (default: 1): described in step 1 above.
nfft (default: 256): window size for computing the spectrogram.
bandpass_filter (default: (20_000, 100_000)): before computing the melody, a band-pass filter is applied to the spectrogram. bandpass_filter is a tuple of the (lower cutoff, upper cutoff), in Hertz.
maximum_melody_slope (default: 5): maximum amount, in Fourier transform indices, that that the melody is allowed to change from one index to the next.
melody_drop_thresh_db (default: 20): used to find the fundamental harmonic; described above.

Output:

melody: described above.

BatlabJuliaUtils.findmelodyhertz — Function

findmelodyhertz(chirp_seq_single_mic::ChirpSequence, peak_snr_thresh::Real;
    nfft=256, melody_keyword_arguments...) -> Vector{Real}

Same as findmelody, but converts the result to Hertz.

BatlabJuliaUtils.getharmonic — Function

getharmonic(chirp_seq_single_mic::ChirpSequence, melody::AbstractArray{Int},
    harmonic_num::Int; nfft=256, band_size=2) -> Vector{Int}

Given the output of findmelody for chirp_seq_single_mic, find the harmonic given by harmonic_num. This is done by searching for the strongest frequencies in a region of radius band_size around harmonic_num times the melody.

Inputs:

chirp_seq_single_mic: input chirp sequence object (for a single microphone).
melody: output of findmelody.
harmonic_num: which harmonic to find (e.g., 2, 3, etc.)
nfft (default: 256): window size for computing the spectrogram.
band_size (default: 2): described above.

Outputs:

harmonic: array that contains the estimated harmonic, in Fourier transform indices.

BatlabJuliaUtils.smoothmelody — Function

smoothmelody(melody::AbstractArray{Int}; filter_size=64)

Smooths a melody by produced by findmelody by applying a moving average filter.

Inputs:

melody: output of findmelody.
filter_size (default: 64): kernel size of the moving average filter.

Output:

melody, with a moving average filter applied, with each element rounded to the nearest integer.

BatlabJuliaUtils.estimatechirpbounds — Function

    estimatechirpbounds(chirp_seq_single_mic::ChirpSequence,
    melody::Vector{Int}, peak_snr_thresh::Real; nfft=256,
    bandpass_filter=(20_000, 100_000),melody_drop_thresh_db=20, melody_thresh_db_low=-20,moving_avg_size=10) -> Int

Given a chirp sequence object (from a single mic) and the melody estimated by findmelody, estimate the end index of the chirp (i.e., separate the chirp from the echos) as follows:

Find the point where the melody is the strongest.
Find the first index, after this point, where the melody strength drops over melody_drop_thresh_db decibels from its peak value (if this cutoff value is below melody_thresh_db_low, we instead find where the melody strength goes below melody_thresh_db_low).
- If the melody strength never drops below this threshold, then just return the last index of the chirp sequence.
Apply a moving average filter to the melody strength.
Apply the following heuristic:
- The end of the chirp is the first local minimum of the melody strength after the index from step 2, or the first time the melody strength dips below melody_thresh_db_low, whichever comes first.
- If neither event happens, return the last index of the chirp sequence.

Inputs:

chirp_seq_single_mic: input chirp sequence object (for a single microphone).
melody: result of findmelody.
peak_snr_thresh: threshold set for the peak SNR of a chirp sequence.
use_second_harmonic_if_melody_starts_below (default: 35,000): if the beginning of the melody, in Hertz, is below this number, then the end probably goes below the range of the microphone. So, we want to check if using the second harmonic leads to a longer chip and, if so, use the second harmonic to estimate the chirp onset and offset.
nfft (default: 256): window size for computing the spectrogram.
bandpass_filter (default: (20_000, 100_000)): before computing the melody, a band-pass filter is applied to the spectrogram. bandpass_filter is a tuple of the (lower cutoff, upper cutoff), in Hertz.
melody_drop_thresh_db, melody_thresh_db_low (defaults: 20, -20): described in step 2 above.
melody_drop_thresh_db_start (default: 35): the start index of the chirp is computed as the first index where the melody strength is at most this amount lower than its maximum value.
moving_avg_size: radius of the moving average filter from step 4 above.

Output:

chirp_start_index: estimated start index of the chirp, in indices since the start of the chirp sequence.
chirp_end_index: estimated end index of the chirp, in indices since the start of the chirp sequence.

BatlabJuliaUtils.getchirpstartandendindices — Function

getchirpstartandendindices(
    chirp_sequence_all_mics::Dict{Int, ChirpSequence}, peak_snr_thresh::Real;
    chirp_kwargs...) -> Dict{Int, Int}, Dict{Int, Int}

Given a dictionary mapping microphones to ChirpSequence objects, run findmelody and estimatechirpbounds for each ChirpSequence and return two dictionaries mapping microphones to chirp start indices and chirp end indices, respectively. Indices are calculated in samples since the start of the chirp sequence (i.e., since the beginning of the mic_data array of the ChirpSequence object).

Inputs:

chirp_seq_all_mics: mapping of microphone index to ChirpSequence object.
peak_snr_thresh: threshold set for the peak SNR of a chirp sequence.
chirp_kwargs: you can additionally pass in any keyword arguments for

findmelody and/or estimate_chirp_bounds.

Outputs:

chirp_starts: dictionary mapping microphone indices to their respective chirp start indices (in samples since the start of the chirp sequence).
chirp_ends: dictionary mapping microphone indices to their respective chirp end indices (in samples since the start of the chirp sequence).

BatlabJuliaUtils.plotmelody — Function

plotmelody(chirp_seq_single_mic::ChirpSequence, melody::Vector{Int},
    chirp_end=nothing; nfft=256, bandpass_filter=(20_000, 100_000),
    melody_color="blue", end_color="cyan")

Plots the melody estimated by findmelody, overlayed on the spectrogram of the chirp sequence. Optionally, plot a vertical line at the estimated end of the chirp sequence.

Inputs:

chirp_seq_single_mic: input chirp sequence object (for a single microphone).
melody: result of findmelody.
chirp_end (default: nothing): optionally, result of estimatechirpend. If chirp_end is nothing, then no vertical line is plotted.
nfft (default: 256): window size for computing the spectrogram.
bandpass_filter (default: (20_000, 100_000)): before computing the melody, a band-pass filter is applied to the spectrogram. bandpass_filter is a tuple of the (lower cutoff, upper cutoff), in Hertz.
melody_color (default: "blue"): color used to plot the melody.
end_color (default: "cyan"): color used to plot the vertical line at the end of the chirp sequence.

BatlabJuliaUtils.plotmelodydb — Function

plotmelodydb(chirp_seq_single_mic::ChirpSequence, melody::Vector{Int};
    nfft=256, bandpass_filter=(20_000, 100_000))

Plots the strength of the melody estimated by findmelody, in decibels.

Inputs:

chirp_seq_single_mic: input chirp sequence object (for a single microphone).
melody: result of findmelody.
nfft (default: 256): window size for computing the spectrogram.
bandpass_filter (default: (20_000, 100_000)): before computing the melody, a band-pass filter is applied to the spectrogram. bandpass_filter is a tuple of the (lower cutoff, upper cutoff), in Hertz.

BatlabJuliaUtils.estimatechirp — Function

estimatechirp(chirp_seq_single_mic::ChirpSequence, peak_snr_thresh::Real;
chirp_kwargs...) -> Int, Vector{Real}

Use estimatechirpend to separate the chirp from the echos (for a single ChirpSequence object) by returning the chirp sequence up until the estimated end index.

Inputs:

chirp_seq_single_mic: input chirp sequence object (for a single microphone).
peak_snr_thresh: threshold set for the peak SNR of a chirp sequence.
chirp_kwargs: you can additionally pass in any keyword arguments for findmelody and/or estimate_chirp_bounds.

Output:

chirp_start: index of the chirp start, in samples since the beginning of the chirp sequence.
chirp_est: audio data of the estimated chirp.

BatlabJuliaUtils.plotestimatedchirps — Function

plotestimatedchirps(chirp_seq_all_mics::Dict{Int, ChirpSequence},
    peak_snr_thresh::Real; nfft=256, bandpass_filter=(20_000, 100_000),
    maximum_melody_slope=5, melody_drop_thresh_db=20,
    melody_thresh_db_low=-20, moving_avg_size=10)

Plots the result of estimatechirp, for every microphone present in chirp_seq_all_mics, by plotting the estimated chirp on top of the chirp sequence.

Inputs:

chirp_seq_all_mics: mapping of microphone index to ChirpSequence object.
same_length (default: true): whether to zero-pad the ends of chirp sequences so that all of them are the same length.
Rest of the arguments: see estimatechirp.

BatlabJuliaUtils.computemelodyoffsets — Function

computemelodyoffsets(chirp_sequence_all_mics::Dict{Int, ChirpSequence},
    peak_snr_thresh::Real; max_offset=500, max_negative_offset=100,
    tolerance=1, chirp_kwargs...)

Sometimes, especially for noisy data, the beginnings of the chirps get cut off. This function takes in data from all (high-snr) microphones for a chirp sequence and estimates how many samples were cut off from the beginning of each chirp.

It does this by:

Finding the mic with the highest SNR. This mic will be used as reference.
Finding the melody and chirp lengths for each microphone.
Shifting the chirps for the non-reference microphones forward and backward in time until we find the shift that minimizes the distance between that chirp's melody and the reference microphone's melody.

This ensures that we can "align" all of the chirps.

Inputs:

chirp_seq_all_mics: mapping of microphone index to ChirpSequence object.
peak_snr_thresh: threshold set for the peak SNR of a chirp sequence.
max_offset (default: 500): maximum shift forward to try, in audio samples (i.e., if the beginning of the chirp was cut off).
max_negative_offset (default: 100): maximum shift backward to try, in audio samples (i.e., if there is some noise picked up before the beginning of the actual chirp). This should be relatively small to avoid erroneously cutting off the beginnings of chirps.
tolerance (default: 1): if the offset found in step 3 does not decrease the mean absolute error of the difference between the reference melody and the given microphone's melody by at least tolerance, then just set the offset to zero.
chirp_kwargs: you can additionally pass in any keyword arguments for findmelody and/or estimate_chirp_bounds.

Output:

offsets: dictionary mapping microphone number to the best shift, in samples of audio data, to the number of samples cut off from the beginning of that microphone's chirp sequence. If the number of samples is negative, this corresponds to extra noise at the beginning of the chirp sequence.

BatlabJuliaUtils.plotoffsetchirps — Function

function plotoffsetchirps(chirpseqall_mics::Dict{Int, ChirpSequence}, offsets::Dict{Int, Int})

Plot a chirp sequence, where the data from each microphone is shifted by the value found in computemelodyoffsets. Plots spectrograms in a vertical layout so that you can visially see if the chirps are aligned with each other.

Inputs:

chirp_seq_all_mics: mapping of microphone index to ChirpSequence object.
offsets: output of computemelodyoffsets.
start_idxs: dictionary mapping microphone indices to their respective chirp start indices (in samples since the start of the chirp sequence). You can get this using getchirpstartandendindices.

BatlabJuliaUtils.separatechirpkwargs — Function

separatechirpkwargs(;chirp_kwargs...) -> Dict{Symbol, Any},  
                                         Dict{Symbol, Any},
                                         Dict{Symbol, Any}

Given keyword arguments from a combination of findmelody, estimatechirpbounds, and computemelodyoffsets, separate them into three dictionaries: one for each of those functions.

Optimization

This section contains helper methods for performing blind deconvolution on a chirp sequence to estimate the initial bat vocalization.

BatlabJuliaUtils.colwisenormalize — Function

colwisenormalize(Y::Matrix) -> Matrix

Normalizes each column of matrix Y such that each column has an amplitude (i.e., maximum absolute value) of 1.

Inputs:

Y: matrix, where each column is a different channel of data.

Outputs:

Y, with each column divided by its amplitude.

BatlabJuliaUtils.getchirpsequenceY — Function

getchirpsequenceY(chirp_seq_all_mics::Dict{Int, ChirpSequence},
    offsets::Dict{Int, Int}, pad_len::Int; normalize=true)
                                    -> Matrix{Real}, Vector{Int}

Given chirp sequence objects (for all microphones) corresponding to the same initial vocalization, produce a matrix of microphone data, where each column is a different microphone. The microphone data is zero-padded at the beginning and the end by pad_len zeros. In addition, if the beginning of any chirp was cut off (as determined by offsets, which is the output of computemelodyoffsets), the beginning is zero-padded by the number of samples that were cut off. By default, each column of the output is normalized to have amplitude 1.

Inputs:

chirp_seq_all_mics: mapping of microphone index to ChirpSequence object.
offsets: output of computemelodyoffsets; mapping of microphone index to how many samples were cut off (if any) at the beginning of the chirp.
pad_len: how many zeros to add to the beginning and end of the mic data. This is used to mitigate edge effects from circular convolution.
normalize (default: true): whether to ensure each column of the output has amplitude 1.

Outputs:

Y: L x K matrix, where L is the length of the longest chirp sequene in chirp_seq_all_mics, plus 2*pad_len and K is the number of microphones used.
mics: list of microphone number, where the microphone number at each index is the microphone used for the corresponding column of Y.

getchirpsequenceY(chirp_seq_all_mics::Dict{Int, ChirpSequence},
    peak_snr_thresh::Real, pad_len::Int; normalize=true,
    offset_kwargs...) -> Matrix{Real}, Vector{Int}

Like the above function (getchirpsequenceY(chirp_seq_all_mics, offsets, pad_len, normalize)), except the offsets are automatically computed using computemelodyoffsets.

Inputs:

chirp_seq_all_mics: mapping of microphone index to ChirpSequence object.
peak_snr_thresh: threshold set for the peak SNR of a chirp sequence.
pad_len: how many zeros to add to the beginning and end of the mic data. This is used to mitigate edge effects from circular convolution.
normalize (default: true): whether to ensure each column of the output has amplitude 1.
offset_kwargs: optionally, you can pass in any keyword arguments for computemelodyoffsets.

Outputs:

Y: L x K matrix, where L is the length of the longest chirp sequene in chirp_seq_all_mics, plus 2*pad_len and K is the number of microphones used.
mics: list of microphone number, where the microphone number at each index is the microphone used for the corresponding column of Y.

getchirpsequenceY(chirp_seq_all_mics::Dict{Int, ChirpSequence},
    pad_len::Int; normalize=true) -> Matrix{Real}, Vector{Int}

Like the above function (getchirpsequenceY(chirp_seq_all_mics, offsets, pad_len, normalize)), except the offsets are all set to zero.

Inputs

chirp_seq_all_mics: mapping of microphone index to ChirpSequence object.
pad_len: how many zeros to add to the beginning and end of the mic data. This is used to mitigate edge effects from circular convolution.
normalize (default: true): whether to ensure each column of the output has amplitude 1.

Outputs:

Y: L x K matrix, where L is the length of the longest chirp sequene in chirp_seq_all_mics, plus 2*pad_len and K is the number of microphones used.
mics: list of microphone number, where the microphone number at each index is the microphone used for the corresponding column of Y.

BatlabJuliaUtils.getinitialconditionsnr — Function

getinitialconditionsnr(Y::AbstractArray,
    chirp_seq_all_mics::Dict{Int, ChirpSequence}, mics::Vector{Int},
    peak_snr_thresh::Real; h_fft_thresh=0.1, nfft=256, chirp_kwargs...) 
                                -> Vector{Real}, Matrix{Real}, Int

Produces an initial condition for the blind deconvolution algorithm (i.e.: an estimate of the bat chirp and the impulse responses mapping the bat chirp to the audio output of each microphone). It uses the chirp estimate (from estimatechirp) for the highest-SNR microphone as the estimate of the bat chirp, and performs Fourier deconvolution (see the deconvolve function) to get the impulse responses.

This is the preferred method of obtaining the initial condition for noisy data.

Inputs:

Y: matrix of audio data, produced by getchirpsequenceY.
chirp_seq_all_mics: mapping of microphone index to ChirpSequence object.
mics: list of microphones used, produced by getchirpsequenceY.
peak_snr_thresh: threshold set for the peak SNR of a chirp sequence.
h_fft_thresh (default: 0.1): the Fourier transform of H_init is set to zero at indices where the FFT of X_init is below this threshold.
nfft: window size of the STFT used to get the melody.
chirp_kwargs: you can pass in any keywords for estimatechirp.

Outputs:

X_init: estimated bat vocalization.
H_init: matrix of estimated impulse responses, the k-th column of H_init, convolved with X_init, produces the k-th column of Y.
longest_chirp: length of the longest estimated chirp.

BatlabJuliaUtils.getinitialconditionsparsity — Function

getinitialconditionsparsity(Y::Matrix{Real},
    chirp_seq_all_mics::Dict{Int, ChirpSequence}, mics::Vector{Int},
    peak_snr_thresh::Real; h_fft_thresh=0.1, nfft=256, chirp_kwargs...) 
                                -> Vector{Real}, Matrix{Real}, Int

Produces an initial condition for the blind deconvolution algorithm (i.e.: an estimate of the bat chirp and the impulse responses mapping the bat chirp to the audio output of each microphone). It uses the chirp estimate (from estimatechirp) for one of the microphones as the estimate of the bat chirp, and performs Fourier deconvolution (see the deconvolve function) to get the impulse responses. The chirp is estimated using whichever microphone produces the sparserst impulse response.

This is not recommended for noisy data; getinitialconditionsnr is better for that case.

Inputs:

Y: matrix of audio data, produced by getchirpsequenceY.
chirp_seq_all_mics: mapping of microphone index to ChirpSequence object.
mics: list of microphones used, produced by getchirpsequenceY.
peak_snr_thresh: threshold set for the peak SNR of a chirp sequence.
h_fft_thresh (default: 0.1): the Fourier transform of H_init is set to zero at indices where the FFT of X_init is below this threshold.
nfft: window size of the STFT used to get the melody.
chirp_kwargs: you can pass in any keywords for estimatechirp.

Outputs:

X_init: estimated bat vocalization. nfft/2
H_init: matrix of estimated impulse responses, the k-th column of H_init, convolved with X_init, produces the k-th column of Y.
longest_chirp: length of the longest estimated chirp.

BatlabJuliaUtils.getmelodyregularization — Function

function getmelodyregularization(chirpseqallmics::Dict{Int, ChirpSequence}, N::Int, peaksnrthresh::Real; melodyradius = 2, nfft=256, stftstride=Int(floor(nfft/4)), maxfreqhz=100000, chirp_kwargs...) -> Matrix

Produces a weight matrix (of the same dimensions of the STFT of a bat vocalization, where the length of the vocalization is length of the longest estimated chirp for chirp_seq_all_mics) that quantifies how far any part of the spectrogram is from the melody or one of its harmonics.

Details of the algorithm are as follows:

Initialize Mx2, the element-wise square of the weight matrix, to be all infinity.
For each microphone in chirp_seq_all_mics: -Find the melody, chirp start/end indices, and whether the beginning of the chirp was cut off.
- Loop through all harmonics under max_freq_hz:
  - Find the distance of each point on the spectrogram to some range around the melody. The radius of this range is either, melody_radius, which is found by linearly interpolating melody_radius_start and melody_radius_end, or the slope of the melody at the given point, whichever is larger.
    This provides some leeway in where the actual melody falls, where the leeway is determined by how fast the melody is changing.
  - Set Mx2 to the element-wise minimum of its current value and the squared distance from step i.
Truncate Mx2 to the maximum estimated chirp length.
Downsample Mx2 by a factor of stft_stride, setting each index of the downsampled version to the minimum value in a time radius of nfft/2.

Inputs:

chirp_seq_all_mics: mapping of microphone index to ChirpSequence object.
N: upper bound on the length of the bat vocalization.
peak_snr_thresh: threshold set for the peak SNR of a chirp sequence.
melody_radius_start, melody_radius_end (defaults: 6, 1): described in step 2.b.i above.
nfft (default: 256): window length to use for the STFT.
stft_stride (default: nfft/4): amount to downsample in step 4 above.
max_freq_hz (default: 100k): used to find the highest possible harmonic.
chirp_kwargs: optionally, you can pass in any keyward arguments for findmelody, estimatechirpbounds, computemelodyoffsets, or getmelodyregularization. See those functions for more detauls

Outputs:

Mx2: weight matrix, described above.
max_chirp_len: maximum chirp length from estimatechirp.

BatlabJuliaUtils.optimizePALM — Function

optimizePALM(chirp_seq::Dict{Int, ChirpSequence}, Y::Matrix{Real}, 
    H_init::Matrix{Real}, X_init::Array{Real}, peak_snr_thresh::Real,
    data_fitting_weight::Real, sparsity_weight::Real, melody_weight::Real;
    max_iter=1000, alpha=1e-3, gamma_H=1, gamma_X=1, num_debug=10, nfft=256,
    stft_stride=Int(floor(nfft/4)), chirp_kwargs...) -> Matrix, Matrix, Int

Performs blind deconvolution, formulated as an optimization problem with a combination of the following objectives:

Data-fitting: making sure X, convolved with any column of H, the matrix of impulse responses, is close to the corresponding column of Y, the microphone data marix produced by getchirpsequenceY.
Impulse response sparsity: encouraging sparsity of H, measured by the absolute sum (L1 norm) of the elements of H.
Melody following: making sure the energy of the spectrogram of X is close to the melody or one of its harmonics. See getmelodyregularization for more details.

This optimization problem is solved using PALM (proximal alternating linearized optimization: paper), following the general process outlined in this paper.

Inputs:

chirp_seq: mapping of microphone index to ChirpSequence object.
Y: matrix of microphone data; output of getchirpsequenceY.
H_init: initial condition for the impulse responses; output of getinitialconditionsnr or getinitialconditionsparsity.
X_init: initial condition for the bat vocalization. Produced by the same function as H_init.
peak_snr_thresh: threshold set for the peak SNR of a chirp sequence.
data_fitting_weight: weight of the data-fitting term in the optimization objective. Can be thought of as a percentage of how much that term matters (60 is a generally good value).
sparsity_weight: weight of the sparsity term in the objective. 10 is a generally good value.
melody_weight: weight of the melody-following term in the objective. 50 is a generally good value.
max_iter (default: 1000): number of iterations to run the algorithm for.
alpha (default: 0.001): the absolute sum of the elements of H is approximated using the smooth function |x| ≈ sqrt(x^2 + alpha^2) - alpha, where alpha is some small number
gamma_H (default: 1): factor to slow down updates of H (must be at least 1, or the algorithm might not converge).
gamma_X (default: 1): factor to slow down updates of X (must be at least 1, or the algorithm might not converge).
num_debug (default: 10): number of times to print debug statements over the course of the optimization algorithm.
nfft (default: 256): window length to use for the spectrogram of X.
stft_stride (default: nfft/4): spacing between adjacent spectrogram windows.
chirp_kwargs: optionally, you can pass in any keyward arguments for findmelody, estimatechirpbounds, or computemelodyoffsets.

Outputs:

X: bat vocalization, as estimated by the optimization algorithm,
H: matrix of impulse responses, as estimated by the optimization algorithm.
max_chirp_len: maximum chirp length from estimatechirp.

Miscellaneous

BatlabJuliaUtils.matrixtovector — Function

matrixtovector(mat::AbstractArray)

Given an input matrix, return the vector produced by stacking all of the columns together. If the input is already a vector, it remains unchanged.

Inputs:

mat: matrix to transform to a vector.

Output:

vec: e.g., if mat is [1 2 3; 4 5 6], vec will be [1, 4, 2, 5, 3, 6].

BatlabJuliaUtils.vectortomatrix — Function

vectortomatrix(vec::AbstractArray)

Given an input vector, return a matrix that consists of one column. If the input is already a matrix, it remains unchanged.

Inputs:

vec: vector to transform to a one-column matrix.

Output:

mat: single-column matrix.

BatlabJuliaUtils.randint — Function

randint(end_idx::Int) -> Int

Returns a random integer from 1 to end_idx, inclusive.

BatlabJuliaUtils.distancefrommic — Function

distancefrommic(location_data::Vector{Real},
    mic_positions::Matrix{Real}, mic_num::Int) -> Real

Given one frame of centroid data, compute the distance from mic mic_num.

Inputs:

location_data: one frame of centroid data, as a vector.
mic_positions: matrix of microphone positions, where each row is a different microphone and the columns represent coordinates (x, y, z).
mic_num: microphone for which to compute distances.

Output: distance from mic mic_num.

distancefrommic(location_data::Matrix{Real},
    mic_positions::Matrix{Real}, mic_num::Int) -> Vector{Real}

Given centroid data for multiple time points, compute the distance from mic mic_num (for each time point).

Inputs:

location_data: slice of centroid data, where the columns represent coordinates (x, y, z).
mic_positions: matrix of microphone positions, where each row is a different microphone and the columns represent coordinates (x, y, z).
mic_num: microphone for which to compute distances.

Output: vector of distance from mic mic_num, for each row of location_data.