-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Goertzel DSP implementation #1
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made some comments in the code for whoever picks it up. I just want to make sure the overall code quality of the DSP code in HexoDSP stays consistent.
src/dsp/goertzel.rs
Outdated
|
||
pub fn reset(&mut self){ | ||
self.buff.clear(); | ||
self.ideal_buffsize = (2.0 * (1.0/self.srate) / self.reference_tuning).floor() as usize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Store sample rate not as 1.0/sample_rate, to save a division and be consistent with the other DSP implementations.
src/dsp/goertzel.rs
Outdated
let mut Q1 = 0.0; | ||
let mut Q2 = 0.0; | ||
|
||
for i in 0..self.ideal_buffsize{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tick() currently iterates all samples for each sample in the current block. That is quite expensive. It should be explored, if there is a significant downside to executing the Goertzel algorithm only each M samples.
Maybe with a user configurable M as "latency" parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See also my comment below about the Gz3Filt.
src/dsp/goertzel.rs
Outdated
Q2 = Q1; | ||
Q1 = Q0; | ||
} | ||
let mag_squared = (Q1.powf(2.0) + (Q2.powf(2.0)) - (Q1*Q2*coeff)) as f32; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use powi()
src/dsp/goertzel.rs
Outdated
let mut Q1 = 0.0; | ||
let mut Q2 = 0.0; | ||
|
||
for i in 0..self.ideal_buffsize{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change this loop to iterate over a slice of self.buff
, that saves the overhead of checking unwrap()
and reflects the intent better:
for s in &self.buff[0..self.ideal_buffsize] {
Q0 = coeff * Q1 - Q2 + *s
}
src/dsp/goertzel.rs
Outdated
pub srate: f32, | ||
reference_tuning: f32, // assumed that target freqs will be int multiples of ref tuning | ||
ideal_buffsize: usize, // calculated with respect to target freq and ref tuning | ||
buff: VecDeque<f32> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VecDeque is unlikely in the current implementation to reallocate, which is fine.
But maybe should be replaced by a simpler implementation that writes into a fixed array similar to helpers::DelayBuffer - maybe even use the DelayBuffer in the first place. That makes accidental reallocations more unlikely.
src/dsp/node_goertzel.rs
Outdated
|| (cgain - self.ogain).abs() > 0.0001 | ||
{ | ||
// recalculate coeffs of all in the cascade | ||
self.computer.target_freq = cfreq; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Set frequency with a set method on the Goertzel structure.
let (out_l, _) = run_for_ms(&mut node_exec, 25.0); | ||
let rms_minmax = calc_rms_mimax_each_ms(&out_l[..], 10.0); | ||
eprintln!("RMS: {:?}", rms_minmax); | ||
assert!(rms_minmax[1].2 - rms_minmax[1].1 < 0.01); // the output should be const for const freq input |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Improve this test case and check the expected output with an absolute value additionally to this min/max difference check.
|
||
(matrix, node_exec) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add test cases:
- that check the Goertzel output against noise as input.
- that change the Goertzel frequency from it's default and checks if the output signal of the Goertzel algorithm changes.
src/dsp/node_goertzel.rs
Outdated
use crate::dsp::goertzel::*; | ||
|
||
#[macro_export] | ||
macro_rules! fa_goertzel_type { ($formatter: expr, $v: expr, $denorm_v: expr) => { { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove this, fa_goertzel_type is not used anywhere.
tests/node_goertzel.rs
Outdated
@@ -0,0 +1,51 @@ | |||
// Copyright (c) 2021 Weird Constructor <[email protected]> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix Copyright to that of goertzel.rs
I have some concerns regarding the performance of this filter. Currently it recalculates the output over the complete 100 or 200 samples in it's buffer for every input sample. This should be checked in HexoSynth and the DSP CPU should be observed for this filter (eg. by testing as plugin in a DAW or via qjackctl). And maybe a note about the performance should be made in the help text for this node. This is part of my DSP node checklist for adding this to HexoDSP:
|
Regarding the performance, I could see a different kind of node implementation as valuable for using it in a sound synthesis patch:
This exploits the efficiency of the Goertzel algorithm better in context of a modular synthesizer in my opinion. Because you get a precise (depending on the output delay settings) very fine multi band bandpass filter. |
I'm interested in this implementation, so I have some questions. Feel free to answer as many as would not be annoying. For this would it be too expensive to have that latency but window it such that the value still changes every frame, i.e. take overlapping windows and compute. Also for the input frequencies, would these be parameters or inputs? could they be both? I agree the GzFilt3 would be more interesting. Are CV signals restricted to (-1,1) and we would have to map that to a frequency range? or can they be whatever? |
How would the window work? The performance hit comes from calculating Q0, Q1 and Q2 for the whole buffer again and again per input sample. You can of course amortize this by recalculating only each 5, 10 or 50 samples. But that is not as cheap as just accepting the longer delays and run goertzel Q0/Q1/Q2 calculations only once per new input sample. I believe, correct me if I am wrong, that is also the whole point for Goertzel vs. FFT, as detecting only a few known frequencies does not require a full FFT anymore, but only running Q0/Q1/Q2 calculation once per input sample.
In HexoDSP there are two kinds of inputs for a node: Input parameters
All signals in the DSP graph are (kind of) restricted to [-1,1]. "Kind of" because that is more a convention than a technical restriction. Even though most nodes clamp their inputs to that range. let freq = inp::BOsc::freq(inputs); |
Yes and no... so the more samples you take the thinner the bins of the equivalent fft become, but the more you are averaging over the sample. So computing a new Q0/Q1/Q2 for each new sample instead of each new buffer is eqivalent to finding the amplitude that a given frequency has had averaged over the entire life of the node. This is why you want to compute it for a window over the input stream, instead of the whole input stream. Currently, I am making a new window every sample, this is needlessly expensive however. So as you say, it would make sense to have a param for how often to calculate with respect to a new window. |
Yes, I would love to give the user that control of how big the window is, which means how accurate the frequency they pass in via parameter is detected. And I would see Gz3Filt not using overlapping windows (which requires saving input samples in a buffer, which is often used for visualizing a spectrum of frequencies), but by resetting Q0/Q1/Q2 to 0 (like described in the website you linked) and restarting for the next window. That would also get rid of any internal buffering. The output would then be the most recently computed amplitude of the frequency. |
Sounds good, I'll work on these over the next week or so when I have free time |
I think this mostly does what I want it to. Could probably use more testing.