Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Goertzel DSP implementation #1

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

theloni-monk
Copy link

I think this mostly does what I want it to. Could probably use more testing.

Copy link
Owner

@WeirdConstructor WeirdConstructor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made some comments in the code for whoever picks it up. I just want to make sure the overall code quality of the DSP code in HexoDSP stays consistent.


pub fn reset(&mut self){
self.buff.clear();
self.ideal_buffsize = (2.0 * (1.0/self.srate) / self.reference_tuning).floor() as usize;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Store sample rate not as 1.0/sample_rate, to save a division and be consistent with the other DSP implementations.

let mut Q1 = 0.0;
let mut Q2 = 0.0;

for i in 0..self.ideal_buffsize{
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tick() currently iterates all samples for each sample in the current block. That is quite expensive. It should be explored, if there is a significant downside to executing the Goertzel algorithm only each M samples.
Maybe with a user configurable M as "latency" parameter?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also my comment below about the Gz3Filt.

Q2 = Q1;
Q1 = Q0;
}
let mag_squared = (Q1.powf(2.0) + (Q2.powf(2.0)) - (Q1*Q2*coeff)) as f32;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use powi()

let mut Q1 = 0.0;
let mut Q2 = 0.0;

for i in 0..self.ideal_buffsize{
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change this loop to iterate over a slice of self.buff, that saves the overhead of checking unwrap() and reflects the intent better:

for s in &self.buff[0..self.ideal_buffsize] {
    Q0 = coeff * Q1 - Q2 + *s
}

pub srate: f32,
reference_tuning: f32, // assumed that target freqs will be int multiples of ref tuning
ideal_buffsize: usize, // calculated with respect to target freq and ref tuning
buff: VecDeque<f32>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VecDeque is unlikely in the current implementation to reallocate, which is fine.
But maybe should be replaced by a simpler implementation that writes into a fixed array similar to helpers::DelayBuffer - maybe even use the DelayBuffer in the first place. That makes accidental reallocations more unlikely.

|| (cgain - self.ogain).abs() > 0.0001
{
// recalculate coeffs of all in the cascade
self.computer.target_freq = cfreq;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set frequency with a set method on the Goertzel structure.

let (out_l, _) = run_for_ms(&mut node_exec, 25.0);
let rms_minmax = calc_rms_mimax_each_ms(&out_l[..], 10.0);
eprintln!("RMS: {:?}", rms_minmax);
assert!(rms_minmax[1].2 - rms_minmax[1].1 < 0.01); // the output should be const for const freq input
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Improve this test case and check the expected output with an absolute value additionally to this min/max difference check.


(matrix, node_exec)
}

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add test cases:

  • that check the Goertzel output against noise as input.
  • that change the Goertzel frequency from it's default and checks if the output signal of the Goertzel algorithm changes.

use crate::dsp::goertzel::*;

#[macro_export]
macro_rules! fa_goertzel_type { ($formatter: expr, $v: expr, $denorm_v: expr) => { {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove this, fa_goertzel_type is not used anywhere.

@@ -0,0 +1,51 @@
// Copyright (c) 2021 Weird Constructor <[email protected]>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix Copyright to that of goertzel.rs

@WeirdConstructor
Copy link
Owner

WeirdConstructor commented Jul 13, 2022

I have some concerns regarding the performance of this filter. Currently it recalculates the output over the complete 100 or 200 samples in it's buffer for every input sample. This should be checked in HexoSynth and the DSP CPU should be observed for this filter (eg. by testing as plugin in a DAW or via qjackctl). And maybe a note about the performance should be made in the help text for this node.

This is part of my DSP node checklist for adding this to HexoDSP:

  • Implement boilerplate (done)
  • Document boilerplate (done)
  • DSP implementation (done, except necessary improvements)
  • Parameter fine tuning (test in HexoSynth, if the parameters feel right)
  • DSP tests for all params
  • Ensure Documentation is properly formatted for the GUI
  • Add CHANGELOG.md entry to HexoSynth
  • Add table entry in README.md in HexoSynth

@WeirdConstructor
Copy link
Owner

WeirdConstructor commented Jul 13, 2022

Regarding the performance, I could see a different kind of node implementation as valuable for using it in a sound synthesis patch:

  • Have a Gz3Filt, which provides 3 outputs for 3 diffrerent frequencies
  • With 3 input parameters to configure the frequencies that should be detected.
  • Have an input setting parameter that defines the static resolution (the implied output delay) of the filter. Have the user decide if they want the result after X=1, 2, 4, 10, ... milliseconds. And mention in the documentation the accuracy implications this has (the longer they can wait, the more accurate the detection).
  • The outputs should output the most recently detected value and change each X milliseconds
  • Apply a user configurable slew limiter to each output as convenience

This exploits the efficiency of the Goertzel algorithm better in context of a modular synthesizer in my opinion. Because you get a precise (depending on the output delay settings) very fine multi band bandpass filter.

@theloni-monk
Copy link
Author

Have an input setting parameter that defines the static resolution (the implied output delay) of the filter. Have the user decide if they want the result after X=1, 2, 4, 10, ... milliseconds. And mention in the documentation the accuracy implications this has (the longer they can wait, the more accurate the detection).
The outputs should output the most recently detected value and change each X milliseconds

I'm interested in this implementation, so I have some questions. Feel free to answer as many as would not be annoying. For this would it be too expensive to have that latency but window it such that the value still changes every frame, i.e. take overlapping windows and compute. Also for the input frequencies, would these be parameters or inputs? could they be both? I agree the GzFilt3 would be more interesting. Are CV signals restricted to (-1,1) and we would have to map that to a frequency range? or can they be whatever?

@WeirdConstructor
Copy link
Owner

I'm interested in this implementation, so I have some questions. Feel free to answer as many as would not be annoying. For this would it be too expensive to have that latency but window it such that the value still changes every frame, i.e. take overlapping windows and compute.

How would the window work? The performance hit comes from calculating Q0, Q1 and Q2 for the whole buffer again and again per input sample. You can of course amortize this by recalculating only each 5, 10 or 50 samples. But that is not as cheap as just accepting the longer delays and run goertzel Q0/Q1/Q2 calculations only once per new input sample.

I believe, correct me if I am wrong, that is also the whole point for Goertzel vs. FFT, as detecting only a few known frequencies does not require a full FFT anymore, but only running Q0/Q1/Q2 calculation once per input sample.

Also for the input frequencies, would these be parameters or inputs?

In HexoDSP there are two kinds of inputs for a node: Input parameters inp and atoms at. Input parameters get the iconic octagon knobs and accept sample accurate inputs from other nodes. Atoms are settings that are not automateable, they are settings that are manually changed by a user typically.

could they be both? I agree the GzFilt3 would be more interesting. Are CV signals restricted to (-1,1) and we would have to map that to a frequency range? or can they be whatever?

All signals in the DSP graph are (kind of) restricted to [-1,1]. "Kind of" because that is more a convention than a technical restriction. Even though most nodes clamp their inputs to that range.
The mapping between the DSP graph signal and more meaningful values that are used by the actual mathematic calculations is done by "denormalizing" the values. This is specified in the big parameter/input matrix in mod.rs and hidden in statements like these:

   let freq = inp::BOsc::freq(inputs);

@theloni-monk
Copy link
Author

theloni-monk commented Jul 13, 2022

But that is not as cheap as just accepting the longer delays and run goertzel Q0/Q1/Q2 calculations only once per new input sample. I believe, correct me if I am wrong, that is also the whole point for Goertzel vs. FFT, as detecting only a few known frequencies does not require a full FFT anymore, but only running Q0/Q1/Q2 calculation once per input sample.

Yes and no... so the more samples you take the thinner the bins of the equivalent fft become, but the more you are averaging over the sample. So computing a new Q0/Q1/Q2 for each new sample instead of each new buffer is eqivalent to finding the amplitude that a given frequency has had averaged over the entire life of the node. This is why you want to compute it for a window over the input stream, instead of the whole input stream. Currently, I am making a new window every sample, this is needlessly expensive however. So as you say, it would make sense to have a param for how often to calculate with respect to a new window.

@WeirdConstructor
Copy link
Owner

Yes and no... so the more samples you take the thinner the bins of the equivalent fft become, but the more you are averaging over the sample. So computing a new Q0/Q1/Q2 for each new sample instead of each new buffer is eqivalent to finding the amplitude that a given frequency has had averaged over the entire life of the node. This is why you want to compute it for a window over the input stream, instead of the whole input stream. Currently, I am making a new window every sample, this is needlessly expensive however. So as you say, it would make sense to have a param for how often to calculate with respect to a new window.

Yes, I would love to give the user that control of how big the window is, which means how accurate the frequency they pass in via parameter is detected. And I would see Gz3Filt not using overlapping windows (which requires saving input samples in a buffer, which is often used for visualizing a spectrum of frequencies), but by resetting Q0/Q1/Q2 to 0 (like described in the website you linked) and restarting for the next window. That would also get rid of any internal buffering.

The output would then be the most recently computed amplitude of the frequency.

@theloni-monk
Copy link
Author

Sounds good, I'll work on these over the next week or so when I have free time

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants