need help: test fails on darwin x86-64 (but curiously not aarch64) #4

Open
donn opened this issue Apr 24, 2024 · 4 comments

donn commented Apr 24, 2024

The zch1dn case of test_tch1dn fails on x86-64 macOS, but not on aarch64.

The residual errors are worse overall on the former, but only for zch1dn is the error large enough to tip over into a failure.

Is this a serious problem? Is there a way for me to adjust the threshold?

x86-64 test log
Output:
----------------------------------------------------------

 testing Cholesky rank-1 downdate routines.
 All residual errors are expected to be small.

 sch1dn test:
      residual error =              0.572204589844E-05       PASS
 dch1dn test:
      residual error =              0.888178419700E-14       PASS
 cch1dn test:
      residual error =              0.953972266871E-05       PASS
 zch1dn test:
      residual error =              0.497379915032E-13       FAIL
----------------------------------------------------------------------
 total:     PASSED   3     FAILED   1

aarch64 test log
Output:
----------------------------------------------------------

 testing Cholesky rank-1 downdate routines.
 All residual errors are expected to be small.

 sch1dn test:
      residual error =              0.476837158203E-05       PASS
 dch1dn test:
      residual error =              0.106581410364E-13       PASS
 cch1dn test:
      residual error =              0.152587890625E-04       PASS
 zch1dn test:
      residual error =              0.284217094304E-13       PASS
----------------------------------------------------------------------
 total:     PASSED   4     FAILED   0
@grisuthedragon
Member

Can you give some more details about the BLAS library used? There must be a reason for the difference. Nevertheless, it seems safe to adjust the tolerance a bit.
Just change the factor 2D2 in

if (rnrm < 2d2*dlamch('p')) then

to 1D3.
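
For anyone wanting to see how close the failure is, here is a rough arithmetic check in Python rather than Fortran. It assumes that `dlamch('p')` equals the IEEE double-precision machine epsilon (2**-52) and that the quoted `if` is the check applied to the zch1dn residual; neither is confirmed in this thread. The `passes` helper is made up for illustration.

```python
import sys

# Assumption: for IEEE 754 doubles with rounding, LAPACK's dlamch('p')
# (eps * base) has the same value as the double-precision machine epsilon.
PRECISION = sys.float_info.epsilon

def passes(residual, factor):
    """Mirror the quoted check: rnrm < factor * dlamch('p')."""
    return residual < factor * PRECISION

# zch1dn residuals copied from the two logs in this thread.
residuals = {"x86-64": 0.497379915032e-13, "aarch64": 0.284217094304e-13}

for platform, rnrm in residuals.items():
    in_eps = rnrm / PRECISION  # residual expressed in units of epsilon
    print(f"{platform}: {in_eps:.0f} eps, "
          f"2d2 -> {passes(rnrm, 2e2)}, 1d3 -> {passes(rnrm, 1e3)}")
```

Under that assumption the x86-64 residual is about 224 epsilon, just over the 200 epsilon limit, while the aarch64 residual is about 128 epsilon. Both would sit well below a limit of 1000 epsilon.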

@doronbehar

Hey, on Nixpkgs (which also distributes software for both Darwin platforms) we see exactly the same issue on x86_64-darwin (and not on aarch64-darwin):

 1/13 Test  #1: test_tch1dn ......................***Failed   16.07 sec
 testing Cholesky rank-1 downdate routines.
 All residual errors are expected to be small.
 sch1dn test:
      residual error =              0.572204589844E-05       PASS
 dch1dn test:
      residual error =              0.106581410364E-13       PASS
 cch1dn test:
      residual error =              0.953972266871E-05       PASS
 zch1dn test:
      residual error =              0.497379915032E-13       FAIL
----------------------------------------------------------------------
 total:     PASSED   3     FAILED   1
Note: The following floating-point exceptions are signalling: IEEE_UNDERFLOW_FLAG IEEE_DENORMAL
STOP 1

Unfortunately, I don't have the floating-point residuals from aarch64-darwin, because the tests passed there. Our BLAS and LAPACK implementations are both based on OpenBLAS version 0.3.27. Its build log for x86_64-darwin is available here:

https://cache.nixos.org/log/vdk8dns4jvy1n7w1djhdy7i1a3ph37p0-openblas-0.3.27.drv

I don't personally have an x86_64-darwin machine, so I can only use our CI, which is unfortunately very slow for these platforms, and I won't be able to help much with debugging. I hope the information I provided helps a bit.

@donn
Author

donn commented Jul 21, 2024

It's the same BLAS version, FWIW. I am using Nix to build qrupdate.

@grisuthedragon
Member

It seems that the tolerances need to be adjusted a bit more. I'll prepare a patch in the next few days. The solution seems to be adjusting this line again:

if (rnrm < 2d2*dlamch('p')) then
