Audio: Simplify volume ramp functions #6636

singalsu · 2022-11-18T18:35:04Z

This patch saves about 15% of processing time, or about 0.9 MCPS,
during volume ramp.

The volume_ramp() functions are inlined into the volume_process() function.
The other use of the function, in idle, is replaced by volume_ramp_check().

In the ramp calculation, the division in the ramp time calculation is
converted to a faster multiplication with the inverse of the sample rate.

The unnecessary checks for vol greater than vol_max or less than vol_min
are eliminated by ensuring that tvolume[] is limited to vol_min and vol_max.
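
To illustrate the point about limiting tvolume[]: if the target is clamped once when it is set, a ramp that moves monotonically toward the target and stops there cannot leave the [vol_min, vol_max] range, so per-step range checks become redundant. This is a minimal sketch, not the SOF code; `set_target()` and `ramp_step()` are hypothetical names, and it assumes the starting volume is already within range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: clamp the requested target once, when it is set. */
static int32_t set_target(int32_t requested, int32_t vol_min, int32_t vol_max)
{
	assert(vol_min <= vol_max);
	if (requested < vol_min)
		return vol_min;
	if (requested > vol_max)
		return vol_max;
	return requested;
}

/*
 * Hypothetical ramp step: move vol toward target by at most step.
 * Only the target is compared; no vol_min/vol_max check is needed
 * because the target itself was clamped by set_target().
 */
static int32_t ramp_step(int32_t vol, int32_t target, int32_t step)
{
	int64_t next;

	assert(step > 0);
	if (vol < target) {
		next = (int64_t)vol + step;
		return next > target ? target : (int32_t)next;
	}
	if (vol > target) {
		next = (int64_t)vol - step;
		return next < target ? target : (int32_t)next;
	}
	return vol;
}
```

Repeated calls to `ramp_step()` land exactly on the clamped target and stay there, which is why the extra comparisons inside the ramp can be dropped in this sketch.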

Clearing ramp_coef[] at ramp finish is unnecessary. It is
internal to the linear ramp and is not used by the other ramp type.

The other optimization is to pass a struct vol_data pointer to the ramp
math functions instead of a processing_module pointer, for direct
access to the needed state variables.

Signed-off-by: Seppo Ingalsuo <[email protected]>

lgirdwood · 2022-11-21T12:43:09Z

@singalsu what's the MCPS improvement?

singalsu · 2022-11-23T11:55:18Z

> @singalsu what's the MCPS improvement?

Currently tiny, only a fraction of an MCPS out of the roughly +3 MCPS added when the ramp is executed. I wonder if the cache thrashing caused by this function in the hot code path of volume overshadows any improvement. An experiment with a doubled update period, 125 us -> 250 us, could be worth it. I was maybe too critical when I tried to hear the zipper effect in volume adjustment. An opinion from a larger group could be useful.

src/audio/module_adapter/module/volume/volume.c

btian1 · 2022-11-25T02:54:56Z

For ramp, I reported a dysfunctional-ramp bug to the SOF kernel team; Rander is working on it:
thesofproject/linux#4026
I hard-coded the ramp locally and tested performance in real time; it is around 0.9 for the windows ramp.

src/audio/module_adapter/module/volume/volume.c

src/include/sof/audio/volume.h

singalsu · 2022-11-25T17:28:43Z

@lgirdwood @btian1 Thanks for reviewing. Please check this new version also.

I took another shot at this today and got a somewhat better saving of 0.5 MCPS during ramp. The linear ramp still adds 2.5 MCPS to the normal volume load, almost doubling it. Windows fade probably adds more; I haven't measured it.

singalsu · 2022-11-25T17:36:34Z

> For ramp, I reported a dysfunctional-ramp bug to the SOF kernel team; Rander is working on it: thesofproject/linux#4026
> I hard-coded the ramp locally and tested performance in real time; it is around 0.9 for the windows ramp.

If it were a direct table lookup, it would completely remove ramp customization and sample rate independence. I'm not changing features in this patch. That's another discussion.

What is the 0.9? Is it the MCPS delta of the windows fade ramp relative to the plain volume MCPS?

[btian1, edited into this comment] Yes, the windows ramp cost is 0.9 MCPS; I tested it.

btian1 · 2022-11-28T01:48:00Z

When you measured the 2.5 MCPS, was the ZC function included? If so, we can try to optimize it.
Also, the linear ramp by itself should not cost that many MCPS.

src/include/sof/audio/volume.h

src/audio/module_adapter/module/volume/volume.c

singalsu · 2022-11-28T09:25:19Z

> When you measured the 2.5 MCPS, was the ZC function included? If so, we can try to optimize it. Also, the linear ramp by itself should not cost that many MCPS.

The 2.5 MCPS is with linear. The high cost is surprising, and I think there must be some cache performance issue. I will test by also inlining the ramp math (removing the call through a function pointer). If that optimization helps, the remaining step of replacing the arithmetic with HiFi SIMD is simple.

singalsu · 2022-11-28T15:45:22Z

I inlined the ramp math too; now the saving is about 0.9 MCPS. Below is a figure of two PGAs, 1.2 and 30.67, from s32 playback to the hda-generic headset. The upper pictures are before, the lower are after. The higher part of the MCPS plot is the 20 ms ramp; the lower part is normal operation after the ramp.

singalsu · 2022-11-28T16:20:53Z

I think the above MCPS, computed from the per-1-ms trace, is not very suitable for low loads, but the observation of a 1.5x load during ramp may be more reliable. Below is the MCPS based on the performance counters output every 1024 ms (trace value x 400/38400). If the average is 2.3 or 2.6 MCPS, then the load during ramp would be 3.5 or 3.9 MCPS.
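
The arithmetic in this comment can be written out as below. It only restates the numbers quoted above (the 400/38400 scaling and the 1.5x ramp load); the function names are hypothetical, and the thread does not spell out what the two constants stand for.

```c
#include <assert.h>

/* Scale a trace value to MCPS with the factor quoted in the comment
 * (trace value x 400/38400).
 */
static double trace_to_mcps(double trace_value)
{
	assert(trace_value >= 0.0);	/* counter-derived values are non-negative */
	return trace_value * 400.0 / 38400.0;
}

/* Estimated load during ramp, assuming the observed 1.5x ratio holds. */
static double ramp_mcps(double average_mcps)
{
	assert(average_mcps >= 0.0);
	return average_mcps * 1.5;
}
```

With these, an average of 2.3 MCPS gives 3.45 and 2.6 MCPS gives 3.9, which match the rounded 3.5 and 3.9 figures in the comment.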

lyakh · 2022-11-29T13:15:31Z

src/audio/module_adapter/module/volume/volume.c

 */
-static void volume_ramp(struct processing_module *mod)
+static inline void volume_ramp(struct vol_data *cd)


...and inline shouldn't be needed here either, the compiler decides by itself when to inline static functions

singalsu · 2022-11-30

The load increased by 0.4 MCPS when I removed the inline from volume_ramp(), volume_linear_ramp(), and volume_windows_fade_ramp(). It seems to be beneficial with the xt-xcc version RG-2017.8-linux that I use for TGL.

lyakh · 2022-12-01

Hm, interesting, I'd literally expect the compiler to do that for us automatically. Since the gain isn't huge, to confirm that there's really a difference, could you perhaps just build the firmware with and without inline and check that the resulting image changed (in size)? At least volume_windows_fade_ramp() really shouldn't need inline since you take its address:

cd->ramp_func = &volume_windows_fade_ramp;

lgirdwood · 2023-03-06

OK, so we don't have time to figure out why we get the XCC speed improvement with inline here; let's just mark this with an inline comment so that it's obvious.
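
On the point about taking a function's address: a `static inline` function that is also stored in a function pointer still needs an out-of-line copy for the indirect call, although a compiler may additionally inline direct calls to it. The sketch below uses hypothetical names (`ramp_fn`, `linear_step`, `struct state`) and does not reflect the actual SOF code layout.

```c
#include <assert.h>
#include <stdint.h>

struct state;
typedef int32_t (*ramp_fn)(const struct state *s, int32_t t);

struct state {
	int32_t start;
	int32_t slope;
	ramp_fn ramp_func;	/* target of the indirect call */
};

/* Marked inline, but its address is also taken below. */
static inline int32_t linear_step(const struct state *s, int32_t t)
{
	return s->start + s->slope * t;
}

/* Direct call: the compiler is free to inline linear_step() here. */
static int32_t call_direct(const struct state *s, int32_t t)
{
	return linear_step(s, t);
}

/* Indirect call: requires an out-of-line copy of the pointed-to function. */
static int32_t call_indirect(const struct state *s, int32_t t)
{
	assert(s->ramp_func);
	return s->ramp_func(s, t);
}
```

Both paths compute the same value; whether the keyword changes the generated code is toolchain-specific, which is why comparing image size or disassembly with and without `inline`, as suggested above, is a reasonable check.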

lyakh · 2022-11-29T13:20:25Z

src/audio/module_adapter/module/volume/volume.c

 	cd->channels = sink_c->stream.channels;
-	cd->sample_rate = sink_c->stream.rate;
+	cd->sample_rate_inv = (int32_t)((int64_t)1000 * INT32_MAX / sink_c->stream.rate);


I think 1000LL would have the same effect and looks prettier
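
A small sketch of the two spellings and of how such an inverse can replace a division. The first two functions follow the diff line above; `frames_to_ms()` and its rounding are assumptions of this sketch, not taken from the patch, and it assumes non-negative frame counts and rates of at least 1000 Hz.

```c
#include <assert.h>
#include <stdint.h>

/* Q1.31 value of 1000 / rate, written with the cast as in the diff. */
static int32_t rate_inv_cast(int32_t rate)
{
	assert(rate >= 1000);	/* smaller rates would not fit in int32_t */
	return (int32_t)((int64_t)1000 * INT32_MAX / rate);
}

/* Same computation with the 1000LL spelling suggested in the review. */
static int32_t rate_inv_ll(int32_t rate)
{
	assert(rate >= 1000);
	return (int32_t)(1000LL * INT32_MAX / rate);
}

/*
 * Hypothetical use: elapsed milliseconds from a frame count with a
 * multiply and shift instead of frames * 1000 / rate. The added
 * 1 << 30 rounds to nearest; that rounding is a choice of this sketch.
 */
static int32_t frames_to_ms(int32_t frames, int32_t rate_inv)
{
	int64_t prod;

	assert(frames >= 0);
	prod = (int64_t)frames * rate_inv;
	return (int32_t)((prod + (1LL << 30)) >> 31);
}
```

Both spellings promote the multiplication to 64 bits before the divide, so they produce the same value; the difference is purely cosmetic.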

btian1 · 2022-11-30T01:18:30Z

With this PR, can we say we reduce the volume module's processing load by 20%? Previously, I reduced it by 5%, so if this PR can save 10%, the remaining 5-10% can be saved by the more advanced HiFi4 architecture.

singalsu · 2022-11-30T13:43:01Z

> With this PR, can we say we reduce the volume module's processing load by 20%? Previously, I reduced it by 5%, so if this PR can save 10%, the remaining 5-10% can be saved by the more advanced HiFi4 architecture.

It's probably safer to say this reduces the worst case by 14-16%, or 15% as a single number. There's some trace print overhead included, so it could be more.

As a quick test, I also tried changing the linear ramp multiplication to a shift, as a fast approximation of the multiply, but the saving was minimal, so I didn't include any intrinsic multiplications. What might help additionally is if we could change, for selected components like volume, the xcc optimization to -O3 and enable HiFi instructions for C. Currently the optimization is -O2 for the entire SOF.

lgirdwood · 2022-11-30T15:56:34Z

@singalsu looks like there are some build failures.

lgirdwood · 2022-12-02T16:41:19Z

@singalsu still some build failures

[ 70%] Building C object CMakeFiles/sof.dir/src/audio/module_adapter/module/volume/volume.c.o
[ 70%] Building C object CMakeFiles/sof.dir/src/audio/module_adapter/module/volume/volume.c.o
staging/sof-tplg: directory
staging/sof-tplg
|-- sof-adl-sdw-max98373-rt5682.tplg
|-- sof-adl-max98360a-nau8825.tplg
|-- sof-adl-max98360a-rt5682-4ch.tplg
|-- sof-adl-max98360a-rt5682.tplg
|-- sof-adl-rt1015-nau8825.tplg
|-- sof-adl-max98390-rt5682.tplg
|-- sof-tgl-sdw-max98373-rt5682.tplg
|-- sof-tgl-max98357a-rt5682-pdm1-drceq.tplg
|-- sof-adl-max98360a-rt5682-2way.tplg
├── ...
..
mkdir -p staging/tools
cd /home/sof/work/sof.git/installer/../installer-builds/build_tools && \
  cp -p ctl/sof-ctl  logger/sof-logger  probes/sof-probes /home/sof/work/sof.git/installer/staging/tools
[ 70%] Building C object CMakeFiles/sof.dir/src/audio/copier/copier.c.o
[ 70%] Building C object CMakeFiles/sof.dir/src/audio/copier/copier.c.o
/home/sof/work/sof.git/src/audio/module_adapter/module/volume/volume.c: In function 'volume_ramp':
/home/sof/work/sof.git/src/audio/module_adapter/module/volume/volume.c:267:10: error: variable 'ramp_time' set but not used [-Werror=unused-but-set-variable]
  267 |  int32_t ramp_time;
      |          ^~~~~~~~~
/home/sof/work/sof.git/src/audio/module_adapter/module/volume/volume.c: In function 'volume_ramp':
/home/sof/work/sof.git/src/audio/module_adapter/module/volume/volume.c:267:10: error: variable 'ramp_time' set but not used [-Werror=unused-but-set-variable]
  267 |  int32_t ramp_time;
      |          ^~~~~~~~~
cc1: all warnings being treated as errors
make[4]: *** [CMakeFiles/sof.dir/build.make:941: CMakeFiles/sof.dir/src/audio/module_adapter/module/volume/volume.c.o] Error 1
make[4]: *** Waiting for unfinished jobs....
cc1: all warnings being treated as errors
make[4]: *** [CMakeFiles/sof.dir/build.make:941: CMakeFiles/sof.dir/src/audio/module_adapter/module/volume/volume.c.o] Error 1
make[4]: *** Waiting for unfinished jobs....
Scanning dependencies of target bootloader_dump
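
The failure in the log above is GCC's `-Wunused-but-set-variable` diagnostic, turned into an error by `-Werror`: a local variable is assigned but its value is never read. A minimal sketch of the pattern and one fix follows; the function names and bodies are hypothetical and do not reproduce the SOF source around line 267.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pattern that triggers the diagnostic (kept in a comment so this file
 * still builds with -Werror): ramp_time is assigned but never read.
 *
 *	static int32_t frames_to_ms_broken(int32_t frames)
 *	{
 *		int32_t ramp_time;
 *
 *		ramp_time = frames / 48;
 *		return frames;
 *	}
 */

/* One fix: make sure the assigned value is actually used. */
static int32_t frames_to_ms_fixed(int32_t frames)
{
	int32_t ramp_time;

	assert(frames >= 0);
	ramp_time = frames / 48;	/* 48 frames per ms, i.e. a 48 kHz stream */
	return ramp_time;
}
```

Other fixes are to delete the variable altogether or, if it is only read in some build configurations, to declare and assign it under the same condition.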

lgirdwood · 2022-12-09T15:53:48Z

@singalsu can you check CI, thanks !

src/audio/module_adapter/module/volume/volume.c

lgirdwood · 2023-01-26T13:57:23Z

@singalsu still a failure - best to ping @wszypelt or @lrudyX

singalsu · 2023-01-26T18:11:51Z

> @singalsu still a failure - best to ping @wszypelt or @lrudyX

Thanks, yes there's a gain test failure. I'll find out what exactly fails.

lgirdwood

LGTM, let's just add the inline comment and we are good.

lgirdwood · 2023-03-06T11:21:04Z

src/audio/module_adapter/module/volume/volume.c

 */
-static void volume_ramp(struct processing_module *mod)
+static inline void volume_ramp(struct vol_data *cd)


OK, so we don't have time to figure out why we get the XCC speed improvement with inline here; let's just mark this with an inline comment so that it's obvious.


singalsu · 2023-03-07T13:45:14Z

> LGTM, let's just add the inline comment and we are good.

I just added the comment. Let's see if it passes the CI now.

btian1 · 2023-03-08T01:05:48Z

CI still failed, and the title still has "Don't merge"; please change the title accordingly.

singalsu · 2023-03-08T08:17:37Z

> CI still failed, and the title still has "Don't merge"; please change the title accordingly.

The failure with TGLU_RVP_NOCODEC_IPC4ZPH seems to be a check-suspend-resume failure. I doubt it's related. The internal Intel CI check that failed earlier has passed now.

lgirdwood · 2023-03-08T12:04:37Z

@andrula-song @kv2019i pls review

btian1 · 2023-03-09T01:04:23Z

yes, same failure on my PR as well.

kv2019i · 2023-03-09T11:07:20Z

One system PM failure in https://sof-ci.01.org/sofpr/PR6636/build4370/devicetest/index.html , not related to volume ramp. Proceeding with merge.

btian1 reviewed Nov 25, 2022

src/audio/module_adapter/module/volume/volume.c

btian1 reviewed Nov 25, 2022

src/audio/module_adapter/module/volume/volume.c

btian1 reviewed Nov 25, 2022

src/audio/module_adapter/module/volume/volume.c (outdated)

btian1 reviewed Nov 25, 2022

src/audio/module_adapter/module/volume/volume.c (outdated)

btian1 reviewed Nov 25, 2022

src/include/sof/audio/volume.h (outdated)

singalsu force-pushed the simplify_volume_ramp branch from 966c00b to d7eaaaa, November 25, 2022 17:10

singalsu changed the title ~~[DRAFT][Do not review] Audio: Simplify volume ramp functions~~ Audio: Simplify volume ramp functions Nov 25, 2022

btian1 reviewed Nov 28, 2022

src/include/sof/audio/volume.h (outdated)

lyakh reviewed Nov 28, 2022

src/audio/module_adapter/module/volume/volume.c

singalsu force-pushed the simplify_volume_ramp branch from d7eaaaa to db79e50, November 28, 2022 15:39

lyakh reviewed Nov 29, 2022

singalsu force-pushed the simplify_volume_ramp branch from db79e50 to 465fb1f, November 30, 2022 14:56

singalsu requested review from lyakh and btian1 November 30, 2022 15:00

singalsu marked this pull request as ready for review November 30, 2022 15:00

singalsu requested review from lgirdwood, plbossart, mmaka1 and lbetlej as code owners November 30, 2022 15:00

singalsu force-pushed the simplify_volume_ramp branch 2 times, most recently from 5eea43d to 96f7557, December 2, 2022 16:11

singalsu force-pushed the simplify_volume_ramp branch from 96f7557 to db5ff50, December 8, 2022 16:35

btian1 reviewed Dec 15, 2022

src/audio/module_adapter/module/volume/volume.c

singalsu force-pushed the simplify_volume_ramp branch from db5ff50 to 7915400, January 20, 2023 12:54

lgirdwood approved these changes Jan 20, 2023

singalsu changed the title ~~Audio: Simplify volume ramp functions~~ [Don't merge] Audio: Simplify volume ramp functions Jan 26, 2023

singalsu force-pushed the simplify_volume_ramp branch from 7915400 to e18b3e4, February 15, 2023 15:02

lgirdwood reviewed Mar 6, 2023

singalsu force-pushed the simplify_volume_ramp branch from e18b3e4 to 956941f, March 7, 2023 13:43

singalsu changed the title ~~[Don't merge] Audio: Simplify volume ramp functions~~ Audio: Simplify volume ramp functions Mar 8, 2023

lgirdwood approved these changes Mar 8, 2023

kv2019i requested review from andrula-song and fkwasowi March 8, 2023 17:01

btian1 approved these changes Mar 9, 2023

andrula-song approved these changes Mar 9, 2023

kv2019i merged commit 84ecf46 into thesofproject:main Mar 9, 2023

singalsu deleted the simplify_volume_ramp branch March 9, 2023 14:12
