Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add NaN propagation and array initialization notes to ?GEMM docs #1098

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

TiborGY
Copy link

@TiborGY TiborGY commented Jan 19, 2025

Description
This PR fixes and closes #1077 and:

  1. Adds this description of NaN/Inf propagation quirks to the ?GEMM docs:
Note: if alpha and/or beta is zero, some parts of the matrix-matrix operations are not performed.
This results in the following NaN/Inf propagation quirks:

 1. If alpha is zero, NaNs or Infs in A or B do not affect the result.
 2. If both alpha and beta are zero, then a zero matrix is returned in C, irrespective of any NaNs or Infs 
 in A, B or C.
 3. If only beta is zero, alpha*op( A )*op( B ) is returned, irrespective of any NaNs or Infs in C.
  1. Adds notes like this to S/D GEMM about why they support complex conjugation:
Note: TRANSA = 'C' is supported for the sake of API consistency between all ?GEMM variants.
  1. Adds this to the descriptions of ALPHA arguments:
If ALPHA is zero the values in A and B do not affect the result.
This also means that NaN/Inf propagation from A and B is inhibited if ALPHA is zero.
  1. Changes descriptions of A arguments from array A must contain the matrix A. to:
array A must contain the matrix A, except if ALPHA is zero.
If ALPHA is zero, none of the values in A affect the result, even if they are NaN/Inf.
This also implies that if ALPHA is zero, the matrix elements of A need not be initialized by the caller.
  1. Changes descriptions of B arguments from array B must contain the matrix B. to:
array B must contain the matrix B, except if ALPHA is zero.
If ALPHA is zero, none of the values in B affect the result, even if they are NaN/Inf.
This also implies that if ALPHA is zero, the matrix elements of B need not be initialized by the caller.
  1. Changes descriptions of BETA arguments from
    When BETA is supplied as zero then C need not be set on input.
    to:
If BETA is zero the values in C do not affect the result. This also means that NaN/Inf propagation
from C is inhibited if BETA is zero.
  1. Changes descriptions of C arguments from
    array C must contain the matrix C, except when beta is zero, in which case C need not be set on entry.
    to:
array  C must contain the matrix  C, except if beta is zero. If beta is zero, none of the values in C 
affect the result, even if they are NaN/Inf. This also implies that if beta is zero, the matrix elements of 
C need not be initialized by the caller.

This documents/clarifies existing behaviour without adding anything to the docs that might hypothetically clash with the behaviour of some optimized BLAS implementation. (eg. I have decided against adding no-touch-guarantees to A, B and C, so a conforming implementation might still read data from them, as long as it does not affect the output)

Checklist

  • The documentation has been updated.
  • If the PR solves a specific issue, it is set to be closed on merge.

@ilayn
Copy link
Contributor

ilayn commented Jan 19, 2025

May I suggest a less if, else language such as

If alpha is set to zero, A and B are not referenced to avoid unnecessary math. A potential downside of this is that if the arrays contain NaN/Inf values they are not transferred to C.

This is more to the point and explains the mechanism instead of enumerating rules. I don't claim a command on English but this reads easier to me.

@TiborGY
Copy link
Author

TiborGY commented Jan 19, 2025

May I suggest a less if, else language such as

If alpha is set to zero, A and B are not referenced to avoid unnecessary math. A potential downside of this is that if the arrays contain NaN/Inf values they are not transferred to C.

This is more to the point and explains the mechanism instead of enumerating rules. I don't claim a command on English but this reads easier to me.

Thanks for the suggestion, I am of course open to refining the wording based on feedback.

Your version changes the meaning slightly, because "A and B are not referenced" would imply a no-touch-guarantee (no reads, no writes to any array element), which is something I do not want to include in the BLAS specification. Such a guarantee would permit A and B to not only contain arbitrary values, but also to not even be allocated.

Hypothetically, that could be untrue for some other BLAS implementations, eg. hardware vendor optimized BLAS libraries. I guess this could also happen with the NaN propagation quirks I am adding, since the spec has been silent on this until now, but I would rather not make this change any more invasive than it needs to be.

@TiborGY
Copy link
Author

TiborGY commented Jan 19, 2025

On a third reading, some of the conditions in the quirks list were redundant, so I have simplified that.

@ilayn
Copy link
Contributor

ilayn commented Jan 19, 2025

Your version changes the meaning slightly, because "A and B are not referenced" would imply a no-touch-guarantee (no reads, no writes to any array element), which is something I do not want to include in the BLAS specification.

That's a standard LAPACK/BLAS wording that can be found in many routine specifications. See for example vs parameter of ?gees to give an example that came to my mind first. The code does indeed not reference the arrays in case alpha is zero as you can see in the code.

@TiborGY
Copy link
Author

TiborGY commented Jan 19, 2025

Your version changes the meaning slightly, because "A and B are not referenced" would imply a no-touch-guarantee (no reads, no writes to any array element), which is something I do not want to include in the BLAS specification.

That's a standard LAPACK/BLAS wording that can be found in many routine specifications. See for example vs parameter of ?gees to give an example that came to my mind first. The code does indeed not reference the arrays in case alpha is zero as you can see in the code.

I see. The reference BLAS implementation does of course satisfy the stronger guarantee of not referencing the arrays, and I would actually be happy to demand it in the spec, but since this is not coordinated with other BLAS implementations, I minimized the amount that implementation freedom is restricted by the spec change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

?GEMM documentation is fuzzy on NaN handling when alpha or beta is zero
2 participants