Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linear performance suggestions #153

Open
spouliot opened this issue Dec 10, 2021 · 2 comments
Open

Linear performance suggestions #153

spouliot opened this issue Dec 10, 2021 · 2 comments
Labels
enhancement New feature or request

Comments

@spouliot
Copy link

When I tried Linear I noticed a performance/usability problem for my use case.

public Image Linear(double[] a, double[] b, bool? uchar = null);

I only wanted to multiply/add everything by a single values, not arrays. This is possible with the native API but not with the managed bindings.

The easy workaround is to create the arrays but the additional allocations can become costly...

using var scaled = image.Linear (new double [] { f }, new double [] { a });

Also while looking inside Linear implementation I noticed that a VOption instance is created, even if not needed for the call. That's another non-required memory allocation and it means some additional calls (e.g. in Call) gets executed when they are not needed.

I ended up doing

using var scaled = (Image) Operation.Call ("linear", null, image, f, a);

to avoid the issues but it's much less readable.

@kleisauke kleisauke added the enhancement New feature or request label Dec 11, 2021
@kleisauke
Copy link
Owner

You could also use the convenience arithmetic overloads that this binding provides, for example:

const double f = 10;
const double a = 20;

using var scaled = image * f;
using var added = scaled + a;

In terms of performance, it should run at the same speed as a single vips_linear call (I think, /cc @jcupitt), since everything is a delayed computation and pixels exist only on demand.

You're right about the VOption allocation (which is just a wrapper around Dictionary<string, object>). Before commit a2e8a83 image.Linear was implemented like this:

public Image Linear(double[] a, double[] b, bool? uchar = null)
{
    var options = new VOption();

    if (uchar.HasValue)
    {
        options.Add(nameof(uchar), uchar);
    }

    return this.Call("linear", options, a, b) as Image;
}

(i.e., allocate a dictionary and add the uchar parameter if set)

After that commit it was simplified to this:

public Image Linear(double[] a, double[] b, bool? uchar = null)
{
    var options = new VOption();

    options.AddIfPresent(nameof(uchar), uchar);

    return this.Call("linear", options, a, b) as Image;
}

Where options.AddIfPresent is implemented like this:

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public void AddIfPresent<T>(string key, T? value) where T : struct
{
    if (value.HasValue)
    {
        _internalDictionary.Add(key, value);
    }
}

To avoid the VOption allocation, I could add an overload like this:

public Image Linear(double[] a, double[] b) => this.Call("linear", a, b) as Image;

(i.e., add overloads only for the required parameters)

But I don't know if that's worth it. Let me see if I can somehow measure that dictionary overhead.

@jcupitt
Copy link
Contributor

jcupitt commented Dec 11, 2021

Hi @kleisauke and @spouliot,

using var scaled = image * 10;
using var added = scaled + 10;

That'll actually turn into two Linear calls, which is a little unfortunate.

I doubt if the allocation overhead will be measurable --- most CPU time will be spent processing pixels and the API overhead isn't important, unless you are processing very small images (just a few pixels).

A great trick for this kind of thing is to operate on look up tables instead of images. Consider this to swap nibbles:

var image = ...
image = (image >> 4) | ((image << 4) & 0xf0);

That looks like it's going to be a lot of vips calls: two shifts, and andconst and an or.

You can always rewrite things like this as:

var original_image = ...
var image = Image.Identity();
image = (image >> 4) | ((image << 4) & 0xf0);
image = original_image.Maplut(image);

ie. make an identity function (a ramp lut), run your arithmetic on that, then use a single call to maplut to transform the image. Now it's just one vips call.

You can do this whenever you have a series of arithmetic operations on a single image, and the original is 8 or 16-bit. It'll work for any intermediate image type -- you could add a .Sin() there and it'd still work, it would just make a float lut.

nip2 does this kind of optimisation automatically. pyvips doesn't do it, but perhaps it should. It might even be possible to build it into libvips itself, though I've not tried.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants